DOA! D S$ J2 r( `# A- j/ Y
声源定位方法一般可分为三类,一种是基于TDOA的两步算法(two-stage algorithm),一种是基于空间谱估计如MUSIC等,还有就是基于beamforming的方法,也就是这里要介绍的可控波束响应(steered-response power),
. N3 q3 m" R/ S0 ?3 u4 Q5 c
) K4 [6 Q) y1 r: U- Nsteered-response power5 L$ V& Z5 S1 e+ I' e
可控波束响应是利用波束形成(beamforming)的方法,对空间不同方向的声音进行增强,得到声音信号最强的方向就被认为是声源的方向。 5 I' p% K% L' C. A& w1 `
上一篇中简单介绍了麦克风阵列的背景知识,最简单的SRP就是利用延时-累加(delay-and-sum)的方法,寻找输出能量最大的方向。
I% U9 z- B/ W! P+ z 其中,语音信号为宽带信号,因此需要做宽带波束形成,这里我们在频域实现! A/ l4 p8 ?9 A9 {8 P. w
) D9 v0 t3 `* `- }+ T频域宽带波束形成
. h3 X, }( f% w J 频域宽带波束形成可以归类为DFT波束形成器,结构如下图
7 H2 f2 g9 D4 _' G% X G & q; ?. C" P! h/ x2 U
) u8 P A8 B6 A) E
频域处理也可以看做是子带处理(subband),DFT和IDFT的系数分别对应子带处理中的分析综合滤波器组,关于这一种解释,可参考《传感器阵列波束优化设计与应用》第六章。 频域宽带延时累加波束形成的基本过程就是信号分帧加窗->DFT->各频点相位补偿->IDFT
* ?9 ]+ u" a; F2 f+ {3 r' V代码实现如下 s9 y* A7 ^1 T; h) M% ]
nction [ DS, x1] = DelaySumURA( x,fs,N,frameLength,inc,r,angle)
& _ J3 A9 }5 G%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%! d n% S! Z' ?( m
%frequency-domain delay-sum beamformer using circular array8 S }* G, B9 o0 W
%
" e) u* m' H! C- z% input :2 T. R2 N) \ r3 Q
% x : input signal ,samples * channel
) p* X5 `! F/ s. H7 x. d% fs: sample rate+ X9 T/ M7 U. ~4 o9 ^
% N : fft length,frequency bin number( P) g, ^* |0 Z0 f+ Z6 j) M
%frameLength : frame length,usually same as N
3 R6 ~/ e' R. \9 v8 u+ s% inc : step increment+ b: M* v0 k" D
% r : array element radius( s, ?) j% U3 h& {
% angle : incident angle
3 x# X; q2 j& _%' W2 H* \8 C0 s# {
% output :% _; i% [+ o+ ~) r. f
% DS : delay-sum output! Y2 S* }1 q$ C$ J! t* d3 o. N
% x1 : presteered signal,same size as x( g3 m- F1 m6 C; L/ a8 L
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3 p+ V8 g: M+ d* u$ p( P8 b/ i K P1 c5 m* E' f4 P G7 _
c = 340;
; o1 ?: i* d% A- j3 Y$ HNele = size(x,2);
! F, H4 L* J. I) J" ~omega = zeros(frameLength,1);
( F3 D' U. r3 y2 KH = ones(N/2+1,Nele);* H: D" [3 ? ]3 o
: T3 r4 n. U- y7 U8 Ctheta = 90*pi/180; %固定一个俯仰角1 x1 a; O$ y! X5 w
gamma = [30 90 150 210 270 330]*pi/180;%麦克风位置
8 l) V* ^) b: W- T1 @+ @1 `tao = r*sin(theta)*cos(angle(1)-gamma)/c; %方位角 0 < angle <360% c2 G: D" A& x- K5 d; l
yds = zeros(length(x(:,1)),1);! L0 N- s0 f3 V" W
x1 = zeros(size(x));7 I& r0 c: ^# Z( T0 _
' S6 s. f0 k& A) W* i8 y% frequency bin weights6 K- j/ S, }. E3 Q, k
% for k = 2:1:N/2+1/ c* V5 |/ ~. Q! |, u
for k = 1:1:5000*N/fs
' a1 K. `' D) z+ V8 [ omega(k) = 2*pi*(k-1)*fs/N; / h; }3 S$ `3 F7 g# }! m1 h4 R
% steering vector
8 p. d) S0 A( A( [+ P+ k' p9 a6 _ H(k, = exp(-1j*omega(k)*tao);
2 h- ?! {' }1 e" U9 nend
- ?, Q2 N# `0 N
1 O% _( K; r/ Q+ N9 wfor i = 1:inc:length(x(:,1))-frameLength, v/ F9 M' k' i3 d( I/ n6 B' t
( A: x9 ]- Z1 F5 G1 L5 f d = fft(bsxfun(@times, x(i:i+frameLength-1, ,hamming(frameLength)));, ^- F% \( g' F0 ^! ?: z: r x
1 J1 m5 A" @3 e! }/ A
x_fft=bsxfun(@times, d(1:N/2+1, ,H);
4 s" L+ p' o* R- H3 V4 b# j2 |8 L& s0 H: [% ^: d
% phase transformed2 S' b0 d0 q# z: h3 K/ H
%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
4 ~3 I7 }6 X- A. A( |4 @7 T yf = sum(x_fft,2);
' k7 x. g3 z6 q6 x$ v Cf = [yf;conj(flipud(yf(2:N/2)))];+ t+ m8 S1 I1 q5 V a
; k7 q% @% `8 t9 E# x' l/ ~2 Y2 i. x % 恢复延时累加的信号
: g3 N3 [) v3 w yds(i:i+frameLength-1) = yds(i:i+frameLength-1)+(ifft(Cf));: M, Y" y4 R, Q+ N4 V. n% h
, l& ~& L/ g0 y3 C0 f6 z% ?
% 恢复各路对齐后的信号
! a# n8 D4 M0 c N. a xf = [x_fft;conj(flipud(x_fft(2:N/2, ))];
3 p0 E: E1 F/ x5 n x1(i:i+frameLength-1, = x1(i:i+frameLength-1, +(ifft(xf));+ [* x Q2 ?) Z4 r
end- h/ `) ^& A' \, o
DS = yds/Nele;
3 \+ ]- l; C8 w2 c
" I& K, ?' y, r+ Vend- O* e/ \0 `: a! L; N
然后遍历各个角度重复调用这个函数,测试实际录音数据,代码如下
3 \! Q# Y5 d4 x7 P1 f* Y0 i$ ~. H" t7 |) ^* f0 x- C: ]
%% SRP Estimate of Direction of Arrival at Microphone Array: H, g$ b, M3 e. ?- c! T
% Frequency-domain delay-and-sum test" B) u* Q, O8 A4 Z- l0 O+ u3 u
% ( m7 C) U' P" `2 r
%%/ |. S8 a# \3 B! k
. R8 l4 `- [* Y- Y2 n- o% x = filter(Num,1,x0);6 E0 d7 O" i5 ] C' I7 u, j
c = 340.0;/ r8 u! P- X4 s
9 l( c% o8 R( H% L$ k4 v. m
% XMOS circular microphone array radius+ s! S0 D' ^( s; f0 X1 A% B
d = 0.0420;
5 I9 O; n' r8 r# n, C% more test audio file in ../../TestAudio/ folder
. |( C6 E6 m- V: [/ t2 qpath = '../../TestAudio/XMOS/room_mic5-2/';
- @ G" G3 X4 P( P" T[s1,fs] = audioread([path,'音轨-2.wav']);- \% \3 T( Q6 @# R9 ^3 m
s2 = audioread([path,'音轨-3.wav']);4 l3 d: W8 I. D0 ~
s3 = audioread([path,'音轨-4.wav']);( A5 E+ H+ c; j7 K
s4 = audioread([path,'音轨-5.wav']);
, `& f$ J$ g" a; I6 y' X5 bs5 = audioread([path,'音轨-6.wav']);
% S0 U! W: t& z& D5 Us6 = audioread([path,'音轨-7.wav']);
7 T2 E, q' h! G+ m2 h9 Psignal = [s1,s2,s3,s4,s5,s6];% I- W) N9 f# H* k i$ G
M = size(signal,2);
- \8 w/ j/ y- ?+ P2 q$ t I%%# O" `$ }/ ^( A9 [' e2 P# D
t = 0;2 `! w) Q" a. D2 h
( C+ ^! ~9 p4 n R
% minimal searching grid7 o. j1 T( n" U. G- a* p
step = 1;
# b' h5 Q! T2 r0 i+ K5 ]" U/ w1 f3 t9 [8 n2 _* m
P = zeros(1,length(0:step:360-step));
2 B! S+ R) ~0 s: u5 M: Atic( {2 \3 L; X# ]) T$ g
h = waitbar(0,'Please wait...'); D5 H: C8 ^( l, x5 f% }& D! d
for i = 0:step:360-step
0 v- _( u. i) W0 i; z % Delay-and-sum beamforming
6 D. L0 E- c9 l x7 E/ n. I ] [ DS, x1] = DelaySumURA(signal,fs,512,512,256,d,i/180*pi);6 O; \+ |( Q4 D( z
t = t+1; C+ J$ V5 g4 \
%beamformed output energy# m- F! G7 Q; ^/ w7 Q* m
P(t) = DS'*DS; C/ u" O2 h6 T( B
waitbar(i / length(step:360-step))! T2 t* B( [7 H9 Q$ z- t
end0 k& @4 z# X; X- O4 i
toc: t8 q7 \3 c1 s/ ]! U# @
close(h) : O _! k5 w0 @9 h: n3 [) I
[m,index] = max(P);
% c, q r/ b4 ^9 {, r- D# Efigure,plot(0:step:360-step,P/max(P))
4 X6 W$ ]+ [$ z, @4 nang = (index)*step5 M# X8 B6 i/ [0 \
$ ]. m u7 C& U7 P; B# H
程序中用的是圆阵,可以进行二维方向角扫描,不过这里为了简便就固定了俯仰角,只扫描方位角,结果如下2 x$ B2 X% C4 D: ~$ g& ^' s
- y8 A' t( t. }# \
结果与预期相同 PHAT加权 与GCC-PHAT方法相同,这里也可以对幅度做归一化,只保留相位信息,使得到的峰值更明显,提高在噪声及混响环境下的性能 ' B6 [$ x5 Z5 G; z4 Z
上面代码中加上这一句
N N8 L5 e, w8 a%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
" \. m# s) l# T5 F. z& e2 K测试同样的文件,结果如下
: X6 n' X6 U u J: z+ _, ]1 ]- q1 [![]()
_8 f* P# Z: [, @2 e对比可以看到,PHAT加权的方法性能更好2 e7 a3 P0 M8 u6 W
参考. Y! m* n# r, |" d/ m( k8 v9 y
1.《SRP-PHAT-A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays》 - i6 s; I0 S, P
2. 《传感器阵列波束优化设计与应用》' f& S! l1 [( q, B' u( L8 i
————————————————6 u! E% I8 U( y* A) ]' F4 H
版权声明:本文为CSDN博主「373955482」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
: n2 J2 c$ c+ F( F原文链接:https://blog.csdn.net/u010592995/article/details/81586504
. ~6 g& M1 P1 p1 {5 D' w+ O
& L/ i9 E) X. Q& Z1 m4 I8 O- D2 c- l+ l
! n& N$ y2 N: Y I, s3 v
2 `, ?; a8 |; {0 A! D
6 q% \0 f6 v- |$ d# y5 Z" }, [, D) v
|