DOA* \ J8 t+ x2 l7 ~" r, s
声源定位方法一般可分为三类,一种是基于TDOA的两步算法(two-stage algorithm),一种是基于空间谱估计如MUSIC等,还有就是基于beamforming的方法,也就是这里要介绍的可控波束响应(steered-response power),
/ l' W- a+ e8 _( T* m- p }
, P0 R* f$ {% @; c6 d6 g8 Vsteered-response power
( E! n n! [$ W6 ?* I' t 可控波束响应是利用波束形成(beamforming)的方法,对空间不同方向的声音进行增强,得到声音信号最强的方向就被认为是声源的方向。 1 v- t2 j0 Q6 v L7 `
上一篇中简单介绍了麦克风阵列的背景知识,最简单的SRP就是利用延时-累加(delay-and-sum)的方法,寻找输出能量最大的方向。 % T- P3 g) `. c) v' M' C
其中,语音信号为宽带信号,因此需要做宽带波束形成,这里我们在频域实现% q2 A0 J2 L: G3 n5 e& n- E6 k
0 i1 M0 R9 w, ]% L. b7 D频域宽带波束形成
# _3 m% X& O. d5 w 频域宽带波束形成可以归类为DFT波束形成器,结构如下图
1 a1 Y( Q7 U; \; K9 t![]()
% i4 b1 o! o, N' M' O0 e6 O: k' o4 r; G$ V
频域处理也可以看做是子带处理(subband),DFT和IDFT的系数分别对应子带处理中的分析综合滤波器组,关于这一种解释,可参考《传感器阵列波束优化设计与应用》第六章。 频域宽带延时累加波束形成的基本过程就是信号分帧加窗->DFT->各频点相位补偿->IDFT 0 n- D3 H0 f" y
代码实现如下 t x- G8 t- M/ o
nction [ DS, x1] = DelaySumURA( x,fs,N,frameLength,inc,r,angle)
4 R/ I* F% c! R3 ^1 Q%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%, O$ W% u0 v3 L3 W5 w. D0 F7 s
%frequency-domain delay-sum beamformer using circular array
' @7 B2 o& z/ @6 f" Z( E%
% t' [3 F2 b% K2 i8 w% input :% m6 c, E, S/ N4 P
% x : input signal ,samples * channel0 }4 F, e. W# U+ J5 C7 N
% fs: sample rate( s& L1 ^* X5 t9 }2 i0 d1 L6 E
% N : fft length,frequency bin number
" F1 w0 O+ y* u, K/ ?%frameLength : frame length,usually same as N
, O/ h+ f1 N" ` @ ^% Y% o. X+ J% inc : step increment
0 U- L$ t& ^6 u- p" S' M% r : array element radius
: t( k7 ] G3 b5 Y" m% angle : incident angle
5 @2 |& ?- R. h. E& e. C6 [" a5 p%
0 t7 J0 L+ Y) s4 ]% output :2 l) j, j4 X% w1 x8 z0 E
% DS : delay-sum output4 h" k b' K2 @2 p0 `6 }- R
% x1 : presteered signal,same size as x+ K# z1 m' r5 v+ B2 b. Y A
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
( A# S( M5 v, y0 v; y6 g% ]0 N* z8 ]
c = 340;, p+ b) V5 [0 I, n& L9 w
Nele = size(x,2);
0 u8 t6 @. Z S5 ?omega = zeros(frameLength,1);" k8 D7 y! n& u' G
H = ones(N/2+1,Nele);/ I9 X' y! ~7 g8 d" W- l% N1 X
& i o$ V# K3 ? btheta = 90*pi/180; %固定一个俯仰角
7 w- ]* @- J8 t3 {; [gamma = [30 90 150 210 270 330]*pi/180;%麦克风位置
' J" j% s* @, [tao = r*sin(theta)*cos(angle(1)-gamma)/c; %方位角 0 < angle <360
$ @+ o5 y% h& d- Myds = zeros(length(x(:,1)),1);. i( k! _7 V/ D O0 i7 ]
x1 = zeros(size(x));0 T* S+ M& c2 P, |* l
& q% y0 G+ Q' ~1 g7 ^* e8 @% frequency bin weights/ ?7 h, L; K+ E
% for k = 2:1:N/2+1
7 i, }! K7 y( j+ Y9 Xfor k = 1:1:5000*N/fs
+ b6 N; V& Y7 ^ omega(k) = 2*pi*(k-1)*fs/N;
# n5 F) B {) t' |3 K3 W+ V/ X5 m % steering vector3 Z& k: I' K9 ^& n
H(k, = exp(-1j*omega(k)*tao);
8 j- h7 u( q9 @/ Kend
5 H7 o# D; Z. z) b5 a5 d7 p/ Z* y$ e6 X/ U
for i = 1:inc:length(x(:,1))-frameLength# ~/ O. Z9 Y1 m2 e# M
! p1 N2 e2 U% r. e d = fft(bsxfun(@times, x(i:i+frameLength-1, ,hamming(frameLength)));6 I8 F# g$ Q N' _1 U" J- Q
7 v% q4 s2 o9 }# s4 ]
x_fft=bsxfun(@times, d(1:N/2+1, ,H);
' V3 _- J' e5 Y8 d/ ^6 F4 U8 j$ K
2 l) J+ A) F0 i/ e8 A# g, Q % phase transformed
8 F) l ~# C* V" G %x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
" s" y- \" ]3 [ yf = sum(x_fft,2);6 ~6 W. e* D) C
Cf = [yf;conj(flipud(yf(2:N/2)))];$ W; [( N) z2 c$ e0 d. R
; K8 g% H" D, \0 k3 w/ i
% 恢复延时累加的信号 D9 A9 ~6 n, @" M
yds(i:i+frameLength-1) = yds(i:i+frameLength-1)+(ifft(Cf));
$ [% A* p+ }% r. q- }8 q$ |
3 Y: G; u7 k) H1 D/ b % 恢复各路对齐后的信号
w6 i, Z3 c! n3 O7 Z xf = [x_fft;conj(flipud(x_fft(2:N/2, ))];
* i% ?6 X9 ]: j) V) C, t# g! t4 M x1(i:i+frameLength-1, = x1(i:i+frameLength-1, +(ifft(xf));
% v! T. ? m; C. i: {end
$ _. a: W% r$ b( K: @+ g- x. z: s: pDS = yds/Nele;
4 k' ^% r7 M" R$ Y1 ]8 q, v" u8 v- R9 _8 I5 C% C
end) G0 P. C1 N. c. O8 n6 w
然后遍历各个角度重复调用这个函数,测试实际录音数据,代码如下7 J( Y2 ?- F* X; @
; I( d; t2 B# Z# X
%% SRP Estimate of Direction of Arrival at Microphone Array
# P% q0 g8 z9 J& G3 n9 o; _% Frequency-domain delay-and-sum test3 }$ \4 c) g$ M4 M
%
' D6 h' I" Q) T4 ^9 u s%%
0 _) i$ L9 Z+ p8 h, n
5 M: O6 {/ p2 x5 r0 t/ v% x = filter(Num,1,x0);- m6 @8 x$ c7 }" Q
c = 340.0;. i0 Y5 ]4 b, l' l9 ]& E- M9 h
( v6 g9 }# {' o2 c ]% XMOS circular microphone array radius& Z1 y! P; R/ x+ ~ ~: B' C
d = 0.0420;. F' \2 v& J5 J8 q$ n3 X) b
% more test audio file in ../../TestAudio/ folder. u) f$ M: [& U5 d- u- q
path = '../../TestAudio/XMOS/room_mic5-2/';: U/ b& b: J7 Y- a* d; X8 X( C
[s1,fs] = audioread([path,'音轨-2.wav']);2 j2 m. C. N, E+ O, y! ]$ A
s2 = audioread([path,'音轨-3.wav']);2 B3 {! e, u8 x- n
s3 = audioread([path,'音轨-4.wav']);
$ i, ^+ e& Q7 A6 ?# l ls4 = audioread([path,'音轨-5.wav']);
/ u: b+ F: i2 q: b$ g8 js5 = audioread([path,'音轨-6.wav']);+ }" R" c1 p3 K( Y3 @0 b! l( H" Y
s6 = audioread([path,'音轨-7.wav']);
; C, ^# Z) H$ t+ Psignal = [s1,s2,s3,s4,s5,s6];
% J3 p+ o( e- R9 f% q% C% VM = size(signal,2);
2 r7 v6 }9 h- n A%%8 \2 R2 j: d: ~ x
t = 0;
! ` h- q0 P6 ^! v' M$ d0 N0 f
% minimal searching grid9 ~: O$ \8 l, U- n! Y
step = 1;- D% _9 |1 A* o8 m, Q
: C8 b% W) i# J6 b8 g
P = zeros(1,length(0:step:360-step));
5 s/ Q+ K/ t! L; I+ P M( X% Rtic* d$ `7 m' j; u& z& F
h = waitbar(0,'Please wait...'); X2 n0 [+ F4 S! O
for i = 0:step:360-step1 w* P, \5 z. Z- w2 K2 S( ]9 y' p+ k
% Delay-and-sum beamforming
% E; j0 p7 |" X [ DS, x1] = DelaySumURA(signal,fs,512,512,256,d,i/180*pi);
4 x# f( V0 e! t. R7 S& F+ y t = t+1;
: x: r' Z ^7 s& n %beamformed output energy
+ |/ D+ z( k9 C1 y/ u P(t) = DS'*DS;
" W# W( R" }# R+ Q! n' B waitbar(i / length(step:360-step))
' z! y5 P$ `4 }$ z: U3 i1 d) Nend
. N& { C$ f- [2 b% U7 o( T1 itoc* `* ` G& t1 W4 N; i
close(h)
m; V0 _# z; o5 C; a7 c {[m,index] = max(P);
0 y' n& y+ e( b$ n1 Y1 ` |2 H L' `figure,plot(0:step:360-step,P/max(P)); x8 l/ A- M v4 f& L( k
ang = (index)*step% |+ k' K; {. t& G2 b
: O, D1 h" z8 T' p6 A% ]3 Z
程序中用的是圆阵,可以进行二维方向角扫描,不过这里为了简便就固定了俯仰角,只扫描方位角,结果如下1 d# B1 `0 G, Q9 T) I. r
![]()
4 W4 X1 } Y) q* q( C. V结果与预期相同 PHAT加权 与GCC-PHAT方法相同,这里也可以对幅度做归一化,只保留相位信息,使得到的峰值更明显,提高在噪声及混响环境下的性能 ; g1 [" V) H" s
上面代码中加上这一句
* N* z, x$ R5 e* p {! G9 {0 x%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
0 E! c7 _7 R, K& K7 p测试同样的文件,结果如下
# S4 I2 Y$ h q1 ?$ C2 k { g 4 G0 O& U- E/ D# a" V
对比可以看到,PHAT加权的方法性能更好: G# @, \1 |0 Q& }- H1 C. M9 |
参考
5 \' k# `) A4 j+ \$ T* v4 T o8 U1.《SRP-PHAT-A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays》
5 F2 [! K9 |3 d" C% p2. 《传感器阵列波束优化设计与应用》
1 b7 ^' i* r/ W; a. P2 `- ~9 ~ l————————————————
( x# H0 [* h% b/ x8 O: U1 A版权声明:本文为CSDN博主「373955482」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。* i: O- W+ I( [5 m! W
原文链接:https://blog.csdn.net/u010592995/article/details/81586504 D' S$ F- D) A3 v( S4 [5 ]
* c. O' O1 J, `6 t" I! l4 r8 U! g8 C
+ e% U s" Y8 ~; Z! F
& Z' Q# _9 z3 ]! Q( T8 ~7 g1 d/ ?2 B; }8 {
|