DOA
Sound source localization methods generally fall into three categories: two-stage algorithms based on TDOA; methods based on spatial spectrum estimation, such as MUSIC; and beamforming-based methods, which include the steered-response power (SRP) approach introduced here.
Steered-response power
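Steered-response power uses beamforming to enhance sound arriving from each candidate direction in space; the direction that yields the strongest output is taken as the direction of the source.
The previous post briefly covered the background on microphone arrays. The simplest SRP uses delay-and-sum beamforming and searches for the direction that maximizes the output energy.
Because speech is a wideband signal, wideband beamforming is required; here it is implemented in the frequency domain.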
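The search criterion can be stated in symbols. This is only a sketch: the notation ($M$ microphones with signals $x_m$, steering delays $\tau_m(\phi)$ for a candidate direction $\phi$) is introduced here for illustration and does not come from the original post.

$$
y_\phi(t) = \frac{1}{M}\sum_{m=1}^{M} x_m\bigl(t - \tau_m(\phi)\bigr), \qquad
P(\phi) = \sum_{t} \bigl|y_\phi(t)\bigr|^2, \qquad
\hat{\phi} = \arg\max_{\phi} P(\phi)
$$

When $\phi$ matches the true direction, the delayed channels line up and add coherently, so $P(\phi)$ peaks there.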
Frequency-domain wideband beamforming
Frequency-domain wideband beamforming can be classed as a DFT beamformer, with the structure shown in the figure below.
![]()
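Frequency-domain processing can also be viewed as subband processing: the DFT and IDFT coefficients correspond to the analysis and synthesis filter banks of a subband system. For this interpretation, see Chapter 6 of 《传感器阵列波束优化设计与应用》. The basic procedure of frequency-domain wideband delay-and-sum beamforming is: framing and windowing -> DFT -> per-bin phase compensation -> IDFT.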
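The phase-compensation step can be written compactly. As a sketch, let $X_m(k)$ be the DFT of the windowed frame from microphone $m$, $\tau_m$ its steering delay, $N$ the DFT length and $f_s$ the sample rate (these symbols are chosen here for illustration):

$$
Y(k) = \frac{1}{M}\sum_{m=1}^{M} X_m(k)\, e^{-j\omega_k \tau_m}, \qquad
\omega_k = \frac{2\pi k f_s}{N}, \quad k = 0, 1, \dots, N/2
$$

The IDFT of $Y(k)$, extended to negative frequencies by conjugate symmetry, gives one output frame, and successive frames are overlap-added.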
The implementation is as follows:
function [ DS, x1] = DelaySumURA( x,fs,N,frameLength,inc,r,angle)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% frequency-domain delay-sum beamformer using circular array
%
%      input :
%          x : input signal, samples * channel (six channels expected)
%          fs: sample rate
%          N : fft length, frequency bin number
%frameLength : frame length, must equal N in this implementation
%        inc : step increment
%          r : array radius
%      angle : incident azimuth, in radians
%
%     output :
%         DS : delay-sum output
%         x1 : presteered signal, same size as x
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

c = 340;                                  % speed of sound, m/s
Nele = size(x,2);
omega = zeros(frameLength,1);
H = ones(N/2+1,Nele);

theta = 90*pi/180;                        % fixed elevation angle
gamma = [30 90 150 210 270 330]*pi/180;   % microphone positions (six elements)
tao = r*sin(theta)*cos(angle(1)-gamma)/c; % per-microphone delay for this azimuth
yds = zeros(length(x(:,1)),1);
x1 = zeros(size(x));

% frequency bin weights
% for k = 2:1:N/2+1
% steer only the bins up to 5 kHz; the min() keeps k within the rows of H
for k = 1:1:min(N/2+1, floor(5000*N/fs))
    omega(k) = 2*pi*(k-1)*fs/N;
    % steering vector
    H(k,:) = exp(-1j*omega(k)*tao);
end

for i = 1:inc:length(x(:,1))-frameLength

    d = fft(bsxfun(@times, x(i:i+frameLength-1,:),hamming(frameLength)));

    x_fft=bsxfun(@times, d(1:N/2+1,:),H);

    % phase transformed
    %x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1,:)));
    yf = sum(x_fft,2);
    Cf = [yf;conj(flipud(yf(2:N/2)))];

    % recover the delay-and-sum signal (real() drops rounding-level imaginary parts)
    yds(i:i+frameLength-1) = yds(i:i+frameLength-1)+real(ifft(Cf));

    % recover the aligned signal of each channel
    xf = [x_fft;conj(flipud(x_fft(2:N/2,:)))];
    x1(i:i+frameLength-1,:) = x1(i:i+frameLength-1,:)+real(ifft(xf));
end
DS = yds/Nele;

end
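The delay model inside the function can be checked in isolation. The following is a minimal sketch, assuming the same six-microphone layout (30° to 330° in 60° steps), a 0.042 m radius, and a far-field plane wave whose azimuth is the direction the sound comes from. The variable names `az`, `tau`, `f` and `w` are illustrative and do not appear in the original code.

```matlab
% Far-field steering delays for a 6-microphone circular array.
% Assumptions: microphones at 30:60:330 degrees, sound speed 340 m/s,
% elevation fixed at 90 degrees (source in the plane of the array).
c     = 340;                      % speed of sound, m/s
r     = 0.042;                    % array radius, m
theta = 90*pi/180;                % elevation
gamma = (30:60:330)*pi/180;       % microphone azimuths
az    = 30*pi/180;                % look direction: straight at microphone 1

% Delay of each microphone relative to the array centre, in seconds.
tau = r*sin(theta)*cos(az - gamma)/c;

% Phase-compensation weights at a single frequency (1 kHz here).
f = 1000;
w = exp(-1j*2*pi*f*tau);

disp(tau*1e6);                    % delays in microseconds
```

With the look direction pointing at microphone 1, that microphone has the largest delay, `r/c`, and the delays of the six symmetric microphones sum to zero. The weights `w` have unit magnitude, so they change only the phase of each channel.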
Next, sweep over all candidate angles, calling this function once per angle, and test on real recordings. The code is as follows:
%% SRP Estimate of Direction of Arrival at Microphone Array
% Frequency-domain delay-and-sum test
%
%%

% x = filter(Num,1,x0);
c = 340.0;

% XMOS circular microphone array radius
d = 0.0420;
% more test audio file in ../../TestAudio/ folder
audioDir = '../../TestAudio/XMOS/room_mic5-2/';
[s1,fs] = audioread([audioDir,'音轨-2.wav']);
s2 = audioread([audioDir,'音轨-3.wav']);
s3 = audioread([audioDir,'音轨-4.wav']);
s4 = audioread([audioDir,'音轨-5.wav']);
s5 = audioread([audioDir,'音轨-6.wav']);
s6 = audioread([audioDir,'音轨-7.wav']);
signal = [s1,s2,s3,s4,s5,s6];
M = size(signal,2);
%%
t = 0;
% minimal searching grid
step = 1;

P = zeros(1,length(0:step:360-step));
tic
h = waitbar(0,'Please wait...');
for i = 0:step:360-step
    % Delay-and-sum beamforming
    [ DS, x1] = DelaySumURA(signal,fs,512,512,256,d,i/180*pi);
    t = t+1;
    % beamformed output energy
    P(t) = DS'*DS;
    waitbar(t/numel(P), h)    % progress as a fraction between 0 and 1
end
toc
close(h)
[m,index] = max(P);
figure,plot(0:step:360-step,P/max(P))
ang = (index-1)*step          % the grid starts at 0 degrees, so index 1 maps to 0
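Without the recordings, the sweep logic can still be sanity-checked on synthetic data. The sketch below is not the original test: it builds one frame directly in the frequency domain for an assumed source at 100°, using the same array geometry and the same 5 kHz band limit, and then scans the azimuth grid. The source spectrum, the names `trueAz`, `S`, `X`, `estAz`, and the 16 kHz sample rate are all assumptions made for illustration.

```matlab
% Synthetic single-frame SRP sweep (no audio files or toolboxes needed).
% Assumed setup: 6-mic circular array, r = 0.042 m, fs = 16 kHz, N = 512,
% elevation 90 degrees (so sin(theta) = 1 and is omitted below).
fs = 16000;  N = 512;  c = 340;  r = 0.042;
gamma  = (30:60:330)*pi/180;              % microphone azimuths, 1x6
trueAz = 100;                             % simulated source azimuth, degrees

k     = (2:floor(5000*N/fs)).';           % bins up to about 5 kHz, skipping DC
omega = 2*pi*(k-1)*fs/N;                  % angular frequency per bin, column

% Source spectrum: unit magnitude with a deterministic, scrambled phase.
S = exp(1j*2*pi*mod(0.37*k.^2, 1));

% Microphone spectra: each channel leads the array centre by tauTrue.
tauTrue = r*cos(trueAz*pi/180 - gamma)/c;
X = bsxfun(@times, S, exp(1j*omega*tauTrue));     % bins x microphones

az = 0:359;                               % search grid, degrees
P  = zeros(size(az));
for n = 1:numel(az)
    tau  = r*cos(az(n)*pi/180 - gamma)/c;         % steering delays
    Y    = sum(X .* exp(-1j*omega*tau), 2);       % delay-and-sum per bin
    P(n) = sum(abs(Y).^2);                        % steered response power
end

[~, idx] = max(P);
estAz = az(idx);                          % map the index back through the grid
fprintf('estimated azimuth: %d degrees\n', estAz);
```

Because the steering delays for 100° exactly cancel the simulated ones, every bin adds coherently at that angle and the sweep reports 100 degrees. Reading the angle as `az(idx)` avoids the off-by-one that arises when a 1-based index is multiplied directly by the step.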
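The array used here is circular, so a two-dimensional scan over azimuth and elevation is possible; for simplicity, the elevation is fixed and only the azimuth is scanned. On the recorded data, the script plots the normalized output power against azimuth.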
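The result matches expectations.
PHAT weighting
As in the GCC-PHAT method, the magnitude can be normalized here as well, so that only the phase information is kept. This makes the peak more pronounced and improves performance in noisy and reverberant environments.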
To enable it, add this line to the code above (it is already present there, commented out):
x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1,:)));
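In symbols, this line replaces each steered bin by its unit-magnitude version. As a sketch, with $X_m(k)$ the DFT of channel $m$, $\omega_k$ the bin frequency and $\tau_m$ the steering delay (notation introduced here for illustration):

$$
\tilde{X}_m(k) = \frac{X_m(k)}{\lvert X_m(k)\rvert}\, e^{-j\omega_k \tau_m}
$$

Every bin then contributes equally to the steered sum regardless of its energy. In practice a small constant is often added to the denominator so that bins with zero magnitude do not cause a division by zero.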
Running the test again on the same files and comparing the two results shows that the PHAT-weighted method performs better.
References
1. SRP-PHAT: A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays
2. 《传感器阵列波束优化设计与应用》
————————————————
Copyright notice: This is an original article by CSDN blogger 「373955482」, licensed under CC 4.0 BY-SA. Reposts must include a link to the original source and this notice.
Original link: https://blog.csdn.net/u010592995/article/details/81586504