DOA
8 N/ M# G5 i9 S/ W1 \8 {/ o$ Y 声源定位方法一般可分为三类,一种是基于TDOA的两步算法(two-stage algorithm),一种是基于空间谱估计如MUSIC等,还有就是基于beamforming的方法,也就是这里要介绍的可控波束响应(steered-response power),: B& q, @9 H+ ^- K
8 Z( G8 A8 ?! o. d- lsteered-response power
4 E+ S1 `6 s( ^; k6 @* T2 r9 W 可控波束响应是利用波束形成(beamforming)的方法,对空间不同方向的声音进行增强,得到声音信号最强的方向就被认为是声源的方向。
1 V6 ? z. z! | 上一篇中简单介绍了麦克风阵列的背景知识,最简单的SRP就是利用延时-累加(delay-and-sum)的方法,寻找输出能量最大的方向。 ' E$ y( ~& ]: K$ K; _: X
其中,语音信号为宽带信号,因此需要做宽带波束形成,这里我们在频域实现; l% |3 [7 y6 I& ~! H4 j/ l% u
6 c( f% k0 \ Y7 J% ~, {
频域宽带波束形成0 z1 E7 P) @# e0 H( N M: {# ^
频域宽带波束形成可以归类为DFT波束形成器,结构如下图 / G" `% [+ ]1 w" o( e4 b$ z
![]()
" h$ b, }3 u, `: F* s) u
8 B% _- a( M. v) X0 a频域处理也可以看做是子带处理(subband),DFT和IDFT的系数分别对应子带处理中的分析综合滤波器组,关于这一种解释,可参考《传感器阵列波束优化设计与应用》第六章。 频域宽带延时累加波束形成的基本过程就是信号分帧加窗->DFT->各频点相位补偿->IDFT
3 x: Q3 F; C* m代码实现如下 , q5 G! j' Z! s) i( B! ^
nction [ DS, x1] = DelaySumURA( x,fs,N,frameLength,inc,r,angle)
* ~+ s8 S2 B+ i+ U4 y1 C5 B%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%5 X* ^9 m% v- A7 S: p
%frequency-domain delay-sum beamformer using circular array
* c# e0 N, Q: T+ F3 K% ^- r& |9 Q/ k0 D7 D
% input :
% h+ u: N, P6 z6 A$ a0 N+ D+ w% x : input signal ,samples * channel4 \2 V' [- ^+ Q2 h
% fs: sample rate
& S* J- g& b+ Z. `: x; ~& ]4 I% N : fft length,frequency bin number
* c3 I1 w( j! K' F) j s%frameLength : frame length,usually same as N
& C* G/ s5 J2 u9 `8 A0 P/ P h' J4 d% inc : step increment5 c$ w+ _9 m$ m0 n. r* V4 a
% r : array element radius4 C/ r8 H" J$ n1 u# R) `& H
% angle : incident angle7 X1 f& Z. ~) o6 \, ~ E( g7 `
%
$ s# l9 g. J0 Z* F# P( n% j) \+ g% output :
+ C1 I) z: j+ m& E% DS : delay-sum output
( Z/ Y/ n" G% Y9 X7 z% x1 : presteered signal,same size as x' c) U5 u+ N3 i) A
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%6 d7 R; U& z+ f! x. C; w
; f T( e( c" [* v$ W( |c = 340;
; x' p' h" _8 F8 JNele = size(x,2);
" q8 @! A6 R' O6 K% P7 h7 bomega = zeros(frameLength,1);# m6 R3 c, ~( a' E5 ?
H = ones(N/2+1,Nele);
& O; J2 W- p+ u3 `7 l7 F
# h0 \+ c6 t0 Etheta = 90*pi/180; %固定一个俯仰角
" y* h6 y- B/ }$ K. }" dgamma = [30 90 150 210 270 330]*pi/180;%麦克风位置% a1 J. m! r* K, B, o5 U
tao = r*sin(theta)*cos(angle(1)-gamma)/c; %方位角 0 < angle <360' a! V9 U: I" i
yds = zeros(length(x(:,1)),1);5 @$ x: S9 N+ Z- c6 z' ?
x1 = zeros(size(x));
# b9 e" ?( |! y' D" G3 r7 j
0 v) L2 _- P5 q) s0 t4 z% frequency bin weights- l: Z, ]: k) `
% for k = 2:1:N/2+1
5 x4 I' h! m4 T# V" M) n3 Bfor k = 1:1:5000*N/fs' ?. i# u. a; z) T' {
omega(k) = 2*pi*(k-1)*fs/N; * ^7 L4 L3 L6 x
% steering vector
: G. s" y$ D0 w" A! | H(k, = exp(-1j*omega(k)*tao);
1 n, |' E* |9 ^8 r1 Rend
K( B6 i$ j& X b5 j9 m7 h& D( x* | \- p6 B
for i = 1:inc:length(x(:,1))-frameLength4 `# \- F5 Z& u
$ A2 q b4 r4 h8 w d = fft(bsxfun(@times, x(i:i+frameLength-1, ,hamming(frameLength)));
% Z) E. Z0 V# ?+ `- D, U
3 N7 P6 k. {6 i x_fft=bsxfun(@times, d(1:N/2+1, ,H);
( m8 B! O* y' q) \. `+ C' \' J' m! D3 t
% phase transformed, X' [' V4 l( R3 d1 F+ l+ _
%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
/ W, E, m0 F& ~( M+ ?) \ yf = sum(x_fft,2);/ e; w0 F5 m$ V3 D1 d
Cf = [yf;conj(flipud(yf(2:N/2)))];
# v9 h+ m, f' ], J/ N) C* z. R9 h/ W( ^2 H$ S8 M0 P
% 恢复延时累加的信号! R/ p$ i: w/ p/ @2 A5 l+ R; d
yds(i:i+frameLength-1) = yds(i:i+frameLength-1)+(ifft(Cf));+ {' i, b: e6 _; r2 m+ s+ n, F* w
t; U/ k6 z+ b4 x2 E
% 恢复各路对齐后的信号4 Z/ v9 E8 Y# s7 P5 Z
xf = [x_fft;conj(flipud(x_fft(2:N/2, ))];- }# o5 u- i2 ]
x1(i:i+frameLength-1, = x1(i:i+frameLength-1, +(ifft(xf));
) Q5 S! |8 f! G( _4 ?; D& y$ xend
4 \7 s/ ~" O1 J" KDS = yds/Nele; - b1 s) h9 T! G; U5 ~& q( ^5 ]
# p4 Z3 M; m$ J
end
, i+ ^9 C2 K% e) I. Q# _然后遍历各个角度重复调用这个函数,测试实际录音数据,代码如下' k3 I/ o F, y' c* b
$ }# a6 ]7 l6 ?; h5 T& I2 C%% SRP Estimate of Direction of Arrival at Microphone Array
8 C9 ^3 @, t% P: T7 Z% Frequency-domain delay-and-sum test
2 h5 J9 i- g# b%
. \5 u+ U' q( ]; X%%
" [1 |7 K. M$ Y8 V& V. o/ U# B" |. y/ V6 |1 x$ ?
% x = filter(Num,1,x0);
; @& ~2 H( d$ E1 g8 Oc = 340.0;# a1 V" B _$ Q0 f, {
- \8 B& ]3 ]$ ? O h3 y
% XMOS circular microphone array radius
& n7 J3 w9 Q) v( ^* \d = 0.0420;
4 ~* ] \" ~: ?! C2 s& i6 X7 l% more test audio file in ../../TestAudio/ folder
2 G( |) {4 x; I4 }. jpath = '../../TestAudio/XMOS/room_mic5-2/';( {* t0 r+ {$ I/ A
[s1,fs] = audioread([path,'音轨-2.wav']);8 @' m) b+ ]5 V
s2 = audioread([path,'音轨-3.wav']);
" V& ?4 ^3 ]8 Q7 U' ~s3 = audioread([path,'音轨-4.wav']);2 N- D/ ^+ B" u7 I/ O$ o+ g
s4 = audioread([path,'音轨-5.wav']);4 j) F7 E3 Y+ t. a
s5 = audioread([path,'音轨-6.wav']);
5 I; t4 C1 n2 p- b& ds6 = audioread([path,'音轨-7.wav']);
' c: h. y6 _4 csignal = [s1,s2,s3,s4,s5,s6];
+ `0 c/ b3 Z( j/ x: }3 I; z) NM = size(signal,2);
0 f$ E! R7 Q! X* V' t* q9 U%%
& t4 l! [3 E l4 M$ T5 o/ G" U; D; Ft = 0;
7 h: p8 |% ^4 Y; B& o, T0 a; [6 C3 `! z4 m% T4 N/ c( [6 r/ P! u
% minimal searching grid1 L" \/ a( r1 I+ T- G) T
step = 1;
# D7 T( \: J# O9 I( Z9 A( ], g. ~
6 m; e) n& e' i/ @9 [* WP = zeros(1,length(0:step:360-step));: E( Y* e K8 D0 Q$ J( N- ^) k* {
tic
; W L* V$ K( K* g/ mh = waitbar(0,'Please wait...');! V; W' _. ]' [% _& B. e
for i = 0:step:360-step4 k2 W7 F) u% J1 w. N
% Delay-and-sum beamforming9 f& m: y( f# ?/ `& Y* x
[ DS, x1] = DelaySumURA(signal,fs,512,512,256,d,i/180*pi);
& z2 B8 t y( j1 s6 s, h' H4 I t = t+1;- N* ^: x: V) B; L) p% R; N* A' W
%beamformed output energy/ h/ u1 m" }; @" j) }. j
P(t) = DS'*DS;6 H) i$ ]) Y8 e
waitbar(i / length(step:360-step))& ~) E5 l4 Z/ @& Q. a" e# G
end7 `( F' K6 c3 C' @0 B6 I
toc: f7 n: D0 [3 T9 G3 z! b* h0 C4 J+ y8 E
close(h)
$ G5 f5 P+ W3 x, y0 i[m,index] = max(P);; ]; f5 f# G% V& ^1 X! B: W
figure,plot(0:step:360-step,P/max(P))0 L t \% l$ X
ang = (index)*step( W4 n5 t% P% w- }7 Y- t0 h
2 G2 M9 N8 Q3 u
程序中用的是圆阵,可以进行二维方向角扫描,不过这里为了简便就固定了俯仰角,只扫描方位角,结果如下
; _7 j* o% G& ]7 B* _# v 0 h# }2 f6 p2 u# Q$ ?
结果与预期相同 PHAT加权 与GCC-PHAT方法相同,这里也可以对幅度做归一化,只保留相位信息,使得到的峰值更明显,提高在噪声及混响环境下的性能
3 {% t/ I# @" Z8 w; t( C+ f2 J- i 上面代码中加上这一句
" w# Z3 S4 u t: X3 J; M D2 X, G6 W9 K%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));4 P1 }/ K9 b6 w" C3 L
测试同样的文件,结果如下 4 F, C7 s4 Q, H$ p
![]()
: l- ^) {2 U2 O' o! c! F. a7 B对比可以看到,PHAT加权的方法性能更好
3 t1 j- Q# T9 L. w, @参考
" d! x/ T2 E7 a; A: h- ?) N1.《SRP-PHAT-A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays》 $ G3 M, B6 H8 z& z/ J0 r
2. 《传感器阵列波束优化设计与应用》
* D# f- O( d ^; A# S7 H; e————————————————
4 F1 H5 h! g/ s! l" C版权声明:本文为CSDN博主「373955482」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。8 d0 ^) @) a7 A' a' n
原文链接:https://blog.csdn.net/u010592995/article/details/81586504& J- S5 c* b: y* d
( t* h3 f0 ]% v+ p
4 h8 W% ? w% s1 K0 Y! L
; I& G$ c8 \2 @" T8 l2 M$ `9 j- V3 U3 g6 u0 Z' _0 V7 o
/ Z+ O7 k9 T+ c% J& I/ k+ D! F- ^
|