DOA$ q3 W" {9 a7 x' q1 P
声源定位方法一般可分为三类,一种是基于TDOA的两步算法(two-stage algorithm),一种是基于空间谱估计如MUSIC等,还有就是基于beamforming的方法,也就是这里要介绍的可控波束响应(steered-response power),/ E: P$ ]/ N: O7 a7 e! ?( h
0 m7 s* e4 t5 A1 d, c5 ^$ o
steered-response power
* L) K* s( m3 T3 A# q. J 可控波束响应是利用波束形成(beamforming)的方法,对空间不同方向的声音进行增强,得到声音信号最强的方向就被认为是声源的方向。
6 A; M; e. l$ F% ?5 T: O7 _6 { 上一篇中简单介绍了麦克风阵列的背景知识,最简单的SRP就是利用延时-累加(delay-and-sum)的方法,寻找输出能量最大的方向。
u/ @, ]% T" I: T, } 其中,语音信号为宽带信号,因此需要做宽带波束形成,这里我们在频域实现* n1 H$ o5 j/ I& n; j& E
" m: E( N5 R) t
频域宽带波束形成
z7 i/ p, s- Z0 l1 I 频域宽带波束形成可以归类为DFT波束形成器,结构如下图 5 E, R; I; Y2 a! i0 Q6 n( ?; A
~2 c3 d, ^8 H1 e( n
# h5 X) ^. F. c/ S$ |+ g l0 x; [频域处理也可以看做是子带处理(subband),DFT和IDFT的系数分别对应子带处理中的分析综合滤波器组,关于这一种解释,可参考《传感器阵列波束优化设计与应用》第六章。 频域宽带延时累加波束形成的基本过程就是信号分帧加窗->DFT->各频点相位补偿->IDFT
+ H; w1 q" i7 A5 e& d& N$ z代码实现如下
% r/ Y5 H$ n- _* k4 L! l; I; knction [ DS, x1] = DelaySumURA( x,fs,N,frameLength,inc,r,angle)
m% b) ]( v+ M+ w2 a$ O0 r- o%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3 K+ I7 X: E- R% `; x$ q%frequency-domain delay-sum beamformer using circular array) d, V1 X) R D; G. n. t
% & x) g/ B# p) M+ O
% input :
' M3 M0 M3 ]* f% x : input signal ,samples * channel
$ b y6 o% U u3 p1 v9 O8 s" q( n% fs: sample rate
/ P q0 f& N1 T! R+ M6 U Z; [% N : fft length,frequency bin number
# P1 D4 b. r6 a" k" H. b3 e: l3 O" ^%frameLength : frame length,usually same as N
1 R& V" W8 X# F4 p. V( e6 {2 R% inc : step increment) f4 h5 B3 W; x7 h9 ~- s
% r : array element radius$ W8 H' o7 i" ^9 [( ?
% angle : incident angle7 a h" L8 E, x8 S' P5 y; ?
%
# t$ J6 {# N* z/ z3 N+ s! Z% output :
4 m3 O2 H g" H3 n; I% DS : delay-sum output
5 z. }# Q8 E3 I; J% x1 : presteered signal,same size as x; L q5 R" a% n0 H0 j' }& L+ v7 \
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1 W8 V6 i2 o# s6 t& u+ K
* A1 @& ~" `9 M& t( gc = 340;
B; B% Q, `9 F- d5 K1 d/ L, gNele = size(x,2);
9 x/ x3 K* x8 {( D3 V6 F \omega = zeros(frameLength,1);
, e+ d" P A+ Y( O5 `H = ones(N/2+1,Nele);% P% o$ V+ @! w* ~- T# `
; x$ [: @/ u- Z8 u# [+ itheta = 90*pi/180; %固定一个俯仰角: K o' T; w7 X6 _* i. n
gamma = [30 90 150 210 270 330]*pi/180;%麦克风位置
' \0 n' A) {; g c# `' jtao = r*sin(theta)*cos(angle(1)-gamma)/c; %方位角 0 < angle <3600 k5 \8 e' Z6 H& f o6 Y) x
yds = zeros(length(x(:,1)),1);1 _* @8 `! ]9 d, t( {* l) E+ E' K
x1 = zeros(size(x));
9 H4 Y! g: N; }1 |3 {
6 i* J3 U& p" a5 H4 ^- ^% frequency bin weights
! w R% Q( b) z: [0 ~& @& D% for k = 2:1:N/2+13 e3 M# E1 @! y6 d
for k = 1:1:5000*N/fs4 C8 i. t( T$ W6 ?* q
omega(k) = 2*pi*(k-1)*fs/N; 3 e9 y& o% E2 ^2 T
% steering vector
- e, r8 s1 V( k/ ^; x6 \- A8 L H(k, = exp(-1j*omega(k)*tao);! U- W U$ v$ v' z
end: {" }. K- X5 K
! P6 B/ ^6 e+ r- t0 R8 w9 E: lfor i = 1:inc:length(x(:,1))-frameLength
2 K2 u. [) Z! a: _% i. H! _( G3 K
5 X9 y1 i2 }; y% E, Y d = fft(bsxfun(@times, x(i:i+frameLength-1, ,hamming(frameLength)));
5 k2 \% W9 p6 N3 q( i5 q3 A- n' s% b
x_fft=bsxfun(@times, d(1:N/2+1, ,H);- N9 u0 X, v+ H0 v* I3 M4 i
6 i/ b! M9 \" [8 w % phase transformed
' b1 c8 s+ y4 P( L% A9 n %x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
/ c: R1 h! c$ {0 ~' v yf = sum(x_fft,2);
2 ?, ^9 Z" A0 @ Cf = [yf;conj(flipud(yf(2:N/2)))];
6 ?. b" E t4 k) s( t. {+ P; f0 @. T" }
% 恢复延时累加的信号
( R) n& ^) ]0 A( O yds(i:i+frameLength-1) = yds(i:i+frameLength-1)+(ifft(Cf));, M% Z8 c; K. C* Y/ ]. V
" ?$ {1 ] Q9 T, n% x3 T
% 恢复各路对齐后的信号2 `& ]# W8 R7 X5 H6 D
xf = [x_fft;conj(flipud(x_fft(2:N/2, ))];
, `8 P% W8 e5 }3 {: Z. s, \ x1(i:i+frameLength-1, = x1(i:i+frameLength-1, +(ifft(xf));
4 E R4 M+ A: u) D& e p6 Uend' \/ i, H. b/ {! X
DS = yds/Nele; 0 T% b. w, u! I2 l- `: S, }
1 |' X) W2 @. C
end! L' H" S/ `3 F! Q; h+ G- D/ v; o
然后遍历各个角度重复调用这个函数,测试实际录音数据,代码如下
9 O' v3 N0 K% x6 }: B" J+ c
( @0 ^) F1 i! q' [/ G5 o6 I%% SRP Estimate of Direction of Arrival at Microphone Array n7 C8 @( p4 j1 d4 ~/ f; Y. k
% Frequency-domain delay-and-sum test
4 x1 ^6 y6 k B1 V' D% % j6 [+ H1 P0 U
%%* P( g. j5 H0 ^
7 u! j& E3 ~' p% x = filter(Num,1,x0);0 E J6 M& i3 p' h6 p
c = 340.0;
9 ?3 i. }% g! K% X! P: J1 h; V* R7 _2 C8 ~8 d
% XMOS circular microphone array radius% t1 e, k- V0 P0 u. j
d = 0.0420;( t3 i H3 s. ^3 G- i
% more test audio file in ../../TestAudio/ folder
" n z+ h' }# s- G0 Upath = '../../TestAudio/XMOS/room_mic5-2/';, Y) H" p) ]$ S" n. s, w+ |; ?( y
[s1,fs] = audioread([path,'音轨-2.wav']);" c( k; \, x- r4 d1 c9 S3 A, w& |
s2 = audioread([path,'音轨-3.wav']);7 J; X8 p5 r- K% L+ o1 Z5 P9 z* d
s3 = audioread([path,'音轨-4.wav']);
2 j* S7 J; p% _/ {0 Q9 b5 a* Ps4 = audioread([path,'音轨-5.wav']);$ s0 l" p: G* a& d5 K1 j. D
s5 = audioread([path,'音轨-6.wav']);
+ \" r1 P/ {+ G' qs6 = audioread([path,'音轨-7.wav']);
5 D H- ]0 Z& rsignal = [s1,s2,s3,s4,s5,s6];
: {( c% G" C! j8 ]M = size(signal,2);
6 @3 z, k& }" `: g# |% e) H! r%%! ?: U& m$ ?0 A6 q
t = 0; P0 I8 A& X3 v3 B: h
/ b$ R* |6 h. y* R% o) l% minimal searching grid
- V/ \5 O% g8 M' m K; `( pstep = 1;4 K$ F7 f9 K4 v/ |+ @
. e. k1 q: P* g$ L* {! a8 d
P = zeros(1,length(0:step:360-step)); ?2 [2 G0 @: T' \2 Z( h
tic
/ \: y+ h" o' x% l5 X9 A sh = waitbar(0,'Please wait...');0 Z7 F" C2 t' Q
for i = 0:step:360-step, \2 x; ]: e# V ?) T
% Delay-and-sum beamforming
% R4 g0 R1 g! q7 g# @4 q& f$ E [ DS, x1] = DelaySumURA(signal,fs,512,512,256,d,i/180*pi);
- Y! ~) F w) e t = t+1; u4 v( b' R, B0 [
%beamformed output energy
2 y0 o% y0 D5 D O! D P(t) = DS'*DS;7 I1 n0 T4 }$ r# }# H9 D
waitbar(i / length(step:360-step))! ~' t# w2 S( P1 p. N) `
end
1 G, I/ o- u9 H+ e# q4 Wtoc
( z; P0 ?+ d( }- \) ?close(h)
; @! C9 u, o9 V4 ^1 ?4 }; T+ ]4 W[m,index] = max(P);
3 P. h; Y4 m0 p# ^' Wfigure,plot(0:step:360-step,P/max(P))
" q1 o6 {0 H/ ~* D. Z# {ang = (index)*step
, q' U3 Q5 M7 D# U4 W3 H/ ?6 g5 ?1 v# `( e
程序中用的是圆阵,可以进行二维方向角扫描,不过这里为了简便就固定了俯仰角,只扫描方位角,结果如下; Q1 Z9 J4 I1 P x$ n# i. M
![]()
/ M! e2 u7 X4 n& Y ^$ n1 g0 i结果与预期相同 PHAT加权 与GCC-PHAT方法相同,这里也可以对幅度做归一化,只保留相位信息,使得到的峰值更明显,提高在噪声及混响环境下的性能
6 a# r$ k7 w5 L$ K 上面代码中加上这一句 ) W: D8 f% j( N- ^
%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
( h& i$ y* D0 c# f" Q# g测试同样的文件,结果如下
/ v; u2 V, M" T9 S5 a3 r : \9 h; G: A/ m) l
对比可以看到,PHAT加权的方法性能更好
& B2 G4 W$ q' D1 w参考/ B$ p8 x; N; B Q
1.《SRP-PHAT-A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays》 ( F/ b* s& I) l' K
2. 《传感器阵列波束优化设计与应用》
/ I' u: E9 B. e& [————————————————
3 \0 x9 ~2 v. s- K( c版权声明:本文为CSDN博主「373955482」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
5 H: T8 v; g% Q; B+ i原文链接:https://blog.csdn.net/u010592995/article/details/81586504
* Q9 \, h) B6 M7 a* f& w- z5 W( [ K! I# Z5 ?
+ ]! B( |, a1 m: \
$ C$ Z: Z7 M' g7 d/ y; v% T
8 j! X2 s1 ?* l" r; Q7 S" o4 C7 U7 H$ L4 ]5 s: r( u0 ]3 H! W) H
|