DOA estimation based on steered-response power (SRP)
Sound source localization methods generally fall into three categories: two-stage algorithms based on TDOA, spatial-spectrum estimation methods such as MUSIC, and beamforming-based methods, i.e. the steered-response power (SRP) approach introduced here.
steered-response power

Steered-response power uses beamforming to enhance the sound arriving from different spatial directions; the direction that yields the strongest signal is taken as the direction of the source.

The previous post briefly introduced the background of microphone arrays. The simplest SRP uses the delay-and-sum method and searches for the direction with the maximum output energy.
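The delay-and-sum search can be illustrated with a toy two-channel example in Python (the integer delay, signal, and function name here are hypothetical, purely for illustration): a broadband signal reaches the second channel a few samples late, and scanning candidate delays recovers the true delay as the energy maximum.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 16000
s = rng.standard_normal(fs)        # 1 s of broadband "speech" (hypothetical)
delay = 7                          # true inter-channel delay in samples

# two-channel recording: channel 2 lags channel 1 by `delay` samples
x1 = s
x2 = np.roll(s, delay)

def dsb_energy(x1, x2, d):
    """Delay-and-sum output energy for a candidate delay d."""
    y = x1 + np.roll(x2, -d)       # advance channel 2 by d samples, then sum
    return np.sum(y ** 2)

# scan candidate delays; the true delay maximizes the output energy
candidates = np.arange(-20, 21)
energies = [dsb_energy(x1, x2, d) for d in candidates]
best = candidates[int(np.argmax(energies))]
print(best)   # 7
```

At the correct delay the two channels add coherently (amplitude doubles, energy quadruples), while at other delays they add incoherently.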
Since speech is a wideband signal, wideband beamforming is required; here we implement it in the frequency domain.

频域宽带波束形成 (Frequency-domain wideband beamforming)

Frequency-domain wideband beamforming can be classified as a DFT beamformer, with the structure shown in the figure below.
(figure: DFT beamformer structure)
Frequency-domain processing can also be viewed as subband processing: the DFT and IDFT coefficients correspond to the analysis and synthesis filter banks of a subband scheme; for this interpretation, see Chapter 6 of 《传感器阵列波束优化设计与应用》. The basic pipeline of frequency-domain wideband delay-and-sum beamforming is: frame and window the signal -> DFT -> per-bin phase compensation -> IDFT.
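As a cross-check of the per-bin phase compensation step, the steering weights for the circular geometry used in the MATLAB code below (six microphones at 30°, 90°, ..., 330°, radius r, elevation fixed at 90°) can be sketched in NumPy; the sampling rate and FFT length are assumed values:

```python
import numpy as np

c = 340.0            # speed of sound (m/s)
fs = 16000           # assumed sample rate
N = 512              # FFT length
r = 0.042            # array radius (m)
gamma = np.deg2rad([30, 90, 150, 210, 270, 330])  # mic azimuths
theta = np.pi / 2    # fixed elevation
phi = np.deg2rad(60) # candidate azimuth to steer toward

# far-field delay of each mic relative to the array center
tau = r * np.sin(theta) * np.cos(phi - gamma) / c            # shape (6,)

# steering matrix: one phase-compensation weight per bin and mic
k = np.arange(N // 2 + 1)
omega = 2 * np.pi * k * fs / N                               # bin frequencies (rad/s)
H = np.exp(-1j * np.outer(omega, tau))                       # shape (N/2+1, 6)

print(H.shape)                      # (257, 6)
print(np.allclose(np.abs(H), 1.0))  # True: pure phase weights
```

Multiplying each channel's spectrum by its column of H advances or delays that channel so that all channels align for a wave from azimuth phi.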
The code is implemented as follows:

```matlab
function [DS, x1] = DelaySumURA(x, fs, N, frameLength, inc, r, angle)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Frequency-domain delay-and-sum beamformer using a circular array
%
% input :
%   x           : input signal, samples * channels
%   fs          : sample rate
%   N           : FFT length (number of frequency bins)
%   frameLength : frame length, usually equal to N
%   inc         : step increment (hop size)
%   r           : array radius
%   angle       : incident angle
%
% output :
%   DS : delay-and-sum output
%   x1 : presteered signal, same size as x
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
c = 340;
Nele = size(x,2);
omega = zeros(frameLength,1);
H = ones(N/2+1, Nele);
theta = 90*pi/180;                          % fixed elevation angle
gamma = [30 90 150 210 270 330]*pi/180;     % microphone positions
tao = r*sin(theta)*cos(angle(1)-gamma)/c;   % azimuth delays, 0 < angle < 360
yds = zeros(length(x(:,1)),1);
x1 = zeros(size(x));

% frequency-bin weights
% for k = 2:1:N/2+1
for k = 1:1:5000*N/fs                       % only steer bins below 5 kHz
    omega(k) = 2*pi*(k-1)*fs/N;
    % steering vector
    H(k,:) = exp(-1j*omega(k)*tao);
end

for i = 1:inc:length(x(:,1))-frameLength
    d = fft(bsxfun(@times, x(i:i+frameLength-1,:), hamming(frameLength)));
    x_fft = bsxfun(@times, d(1:N/2+1,:), H);
    % PHAT weighting (uncomment to keep phase only)
    % x_fft = bsxfun(@rdivide, x_fft, abs(d(1:N/2+1,:)));
    yf = sum(x_fft,2);
    Cf = [yf; conj(flipud(yf(2:N/2)))];
    % reconstruct the delay-and-sum signal (overlap-add)
    yds(i:i+frameLength-1) = yds(i:i+frameLength-1) + real(ifft(Cf));
    % reconstruct the time-aligned signal of each channel
    xf = [x_fft; conj(flipud(x_fft(2:N/2,:)))];
    x1(i:i+frameLength-1,:) = x1(i:i+frameLength-1,:) + real(ifft(xf));
end
DS = yds/Nele;

end
```
Then sweep over the candidate angles, calling this function for each one. Testing on real recordings:

```matlab
%% SRP estimate of direction of arrival at a microphone array
% Frequency-domain delay-and-sum test
%%
% x = filter(Num,1,x0);
c = 340.0;

% XMOS circular microphone array radius
d = 0.0420;
% more test audio files in the ../../TestAudio/ folder
path = '../../TestAudio/XMOS/room_mic5-2/';
[s1,fs] = audioread([path,'音轨-2.wav']);
s2 = audioread([path,'音轨-3.wav']);
s3 = audioread([path,'音轨-4.wav']);
s4 = audioread([path,'音轨-5.wav']);
s5 = audioread([path,'音轨-6.wav']);
s6 = audioread([path,'音轨-7.wav']);
signal = [s1,s2,s3,s4,s5,s6];
M = size(signal,2);
%%
t = 0;
% minimal search grid
step = 1;

P = zeros(1, length(0:step:360-step));
tic
h = waitbar(0,'Please wait...');
for i = 0:step:360-step
    % delay-and-sum beamforming
    [DS, x1] = DelaySumURA(signal, fs, 512, 512, 256, d, i/180*pi);
    t = t+1;
    % beamformed output energy
    P(t) = DS'*DS;
    waitbar(i/(360-step), h)
end
toc
close(h)
[m, index] = max(P);
figure, plot(0:step:360-step, P/max(P))
ang = (index-1)*step
```
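The angle sweep can be sanity-checked on synthetic data. The sketch below (Python/NumPy; the geometry mirrors the MATLAB code, everything else is an illustrative assumption) simulates a far-field plane wave from 120° hitting a six-mic circular array and runs the same frequency-domain delay-and-sum scan over one frame:

```python
import numpy as np

c, fs, N, r = 340.0, 16000, 512, 0.042
gamma = np.deg2rad([30, 90, 150, 210, 270, 330])   # mic azimuths
true_az = np.deg2rad(120)                          # simulated source azimuth

rng = np.random.default_rng(1)
S = np.fft.rfft(rng.standard_normal(N) * np.hamming(N))  # one windowed frame
omega = 2 * np.pi * np.arange(N // 2 + 1) * fs / N       # bin frequencies (rad/s)

def tau(az):
    # far-field delay of each mic relative to the array center (elevation = 90 deg)
    return r * np.cos(az - gamma) / c

# microphone spectra: source spectrum with per-mic propagation phase
X = S[:, None] * np.exp(1j * np.outer(omega, tau(true_az)))   # (N/2+1, 6)

def srp(az):
    # steer toward az by compensating the phases, then sum channels and measure energy
    Y = (X * np.exp(-1j * np.outer(omega, tau(az)))).sum(axis=1)
    return np.sum(np.abs(Y) ** 2)

grid = np.deg2rad(np.arange(0, 360))
P = np.array([srp(az) for az in grid])
est = np.rad2deg(grid[int(P.argmax())])
print(est)   # ≈ 120
```

At the true azimuth the compensation cancels the propagation phases exactly, so all six channels add coherently and the scan peaks there.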
The program uses a circular array, so a two-dimensional scan over both azimuth and elevation is possible; for simplicity, the elevation angle is fixed here and only the azimuth is scanned. The result:
(figure: normalized beamformed output energy versus azimuth)
; R6 F8 y1 J1 B/ ?: d结果与预期相同 PHAT加权 与GCC-PHAT方法相同,这里也可以对幅度做归一化,只保留相位信息,使得到的峰值更明显,提高在噪声及混响环境下的性能 : B/ D2 _$ u3 R" m1 x
上面代码中加上这一句
7 V H+ Z K% t1 z%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));' E: q# D" z! U) h4 B( g
测试同样的文件,结果如下
(figure: normalized beamformed output energy versus azimuth, with PHAT weighting)
Comparing the two, the PHAT-weighted method performs better.
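The effect of PHAT weighting can be sketched on a synthetic single-frame example (Python/NumPy; the geometry mirrors the MATLAB code, everything else is assumed). In this anechoic toy case both variants peak at the true azimuth; the PHAT variant simply weights every bin equally by discarding magnitude:

```python
import numpy as np

# hypothetical 6-mic circular array, single anechoic frame
c, fs, N, r = 340.0, 16000, 512, 0.042
gamma = np.deg2rad([30, 90, 150, 210, 270, 330])
true_az = np.deg2rad(120)
omega = 2 * np.pi * np.arange(N // 2 + 1) * fs / N

def tau(az):
    return r * np.cos(az - gamma) / c          # elevation fixed at 90 degrees

rng = np.random.default_rng(2)
S = np.fft.rfft(rng.standard_normal(N))
X = S[:, None] * np.exp(1j * np.outer(omega, tau(true_az)))   # simulated mic spectra

def srp(az, phat=False):
    Xw = X / (np.abs(X) + 1e-12) if phat else X   # PHAT: discard magnitude, keep phase
    Y = (Xw * np.exp(-1j * np.outer(omega, tau(az)))).sum(axis=1)
    return np.sum(np.abs(Y) ** 2)

grid = np.deg2rad(np.arange(0, 360))
P_plain = np.array([srp(a) for a in grid])
P_phat = np.array([srp(a, phat=True) for a in grid])
est_plain = np.rad2deg(grid[int(P_plain.argmax())])
est_phat = np.rad2deg(grid[int(P_phat.argmax())])
print(est_plain, est_phat)   # both ≈ 120
```

The practical advantage of PHAT shows up with reverberation and colored noise, where strong bins would otherwise dominate the sum; this anechoic sketch only demonstrates that the weighting preserves the correct peak.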
References

1. SRP-PHAT: A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays
2. 《传感器阵列波束优化设计与应用》
————————————————
Copyright notice: this is an original article by CSDN blogger 「373955482」, released under the CC 4.0 BY-SA license; please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/u010592995/article/details/81586504