DOA t8 \. {+ n" E, b5 L& S
声源定位方法一般可分为三类,一种是基于TDOA的两步算法(two-stage algorithm),一种是基于空间谱估计如MUSIC等,还有就是基于beamforming的方法,也就是这里要介绍的可控波束响应(steered-response power),! [& ~. K! ?; K3 [' `& T
5 b+ C- O1 u& Z) _, D& Y1 P3 }8 zsteered-response power1 p- [6 _ B! p5 k0 h( n$ x
可控波束响应是利用波束形成(beamforming)的方法,对空间不同方向的声音进行增强,得到声音信号最强的方向就被认为是声源的方向。 " b# n0 S! ]$ g4 d4 [
上一篇中简单介绍了麦克风阵列的背景知识,最简单的SRP就是利用延时-累加(delay-and-sum)的方法,寻找输出能量最大的方向。
0 `' D% Z0 V, m0 l9 ~$ O 其中,语音信号为宽带信号,因此需要做宽带波束形成,这里我们在频域实现
) m9 M& Z. Y" e3 K6 V6 w$ Z; Y4 J! S
2 ?7 [5 [! o4 E频域宽带波束形成
L9 B1 V, D# s C 频域宽带波束形成可以归类为DFT波束形成器,结构如下图
0 q2 u/ O8 F6 i 4 G7 d2 _4 M' \1 G
" F: E8 x* Q* Q
频域处理也可以看做是子带处理(subband),DFT和IDFT的系数分别对应子带处理中的分析综合滤波器组,关于这一种解释,可参考《传感器阵列波束优化设计与应用》第六章。 频域宽带延时累加波束形成的基本过程就是信号分帧加窗->DFT->各频点相位补偿->IDFT
) m7 }7 n. |) J3 c" M代码实现如下 2 {1 X- h- J% v+ X7 c# A! q6 C6 L
nction [ DS, x1] = DelaySumURA( x,fs,N,frameLength,inc,r,angle)
( V6 t' J' R/ P, A; n/ e) {- L%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
z, D' d5 b' g# \%frequency-domain delay-sum beamformer using circular array0 X% ]0 l4 i* i$ v) u. Z, }
%
5 m6 c* _+ G* L3 y5 _9 x% input :
* A7 I" U* d* ~4 S$ _% x : input signal ,samples * channel: n% G; o8 b! Q6 _6 K2 i
% fs: sample rate
* ~4 T, H+ r1 r. Q2 `" u% N : fft length,frequency bin number
' \$ s/ e$ a; o$ u7 g! i/ N) u%frameLength : frame length,usually same as N
8 O; r& \3 q& x/ a. H% inc : step increment6 C0 ]7 | ]! C8 P$ T
% r : array element radius
+ I, Q( V% s7 B& U" V* ^8 O% angle : incident angle" K: E4 h1 M7 v, }1 s$ H2 J
%+ H9 d0 R+ v8 j' e& g# n, Q$ b m, v
% output :4 B: \4 j. M1 V- ^
% DS : delay-sum output) j* V% \; i3 R* b3 w3 M' g
% x1 : presteered signal,same size as x+ j, [( b' B; B/ L6 f' K9 S
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%9 G" M2 ]7 D" G5 T& s% [
, K4 D# E6 T) Z3 k" x* u
c = 340;% H2 p: c6 J! e4 r! G4 t7 T+ k
Nele = size(x,2);
2 ~9 T5 k! s$ N# } |- j; [6 Womega = zeros(frameLength,1);
0 A7 u6 o2 _' w+ X+ Z6 R: rH = ones(N/2+1,Nele);
& I/ D+ N x* @$ s. ?% O: g9 C8 [, K4 w( |0 [4 @ x4 y5 i
theta = 90*pi/180; %固定一个俯仰角' b) E1 [- N. t/ Y+ g
gamma = [30 90 150 210 270 330]*pi/180;%麦克风位置/ y- h. }8 A8 b2 V) i" P
tao = r*sin(theta)*cos(angle(1)-gamma)/c; %方位角 0 < angle <360
3 C1 b( I; l, e% m1 V' Dyds = zeros(length(x(:,1)),1);
/ N: ?$ R" g6 [- x6 q6 ^ ?7 @x1 = zeros(size(x));
- C! N; d0 l+ F- N" Z* U5 u/ Q ]. l3 M4 o7 ?7 Y$ b
% frequency bin weights
' x+ C# ?& c; E+ D2 j5 E& T% for k = 2:1:N/2+1. f! i' g: v1 B. r5 E
for k = 1:1:5000*N/fs
0 h$ L$ Y. G- `8 R omega(k) = 2*pi*(k-1)*fs/N;
' ?1 O1 W8 x4 _# M7 D- D" X: c' x % steering vector
0 W( ]& z" y4 g' ` t/ `1 o H(k, = exp(-1j*omega(k)*tao);/ g4 {2 t6 W/ y5 p: f4 t* [; D
end
. e B$ B" S3 a. K1 b+ ^' I
8 Q# Z3 I, j, N, Hfor i = 1:inc:length(x(:,1))-frameLength3 w" s2 X- b# j( R l$ y$ I% B4 C
3 e+ H/ ]3 ?' n
d = fft(bsxfun(@times, x(i:i+frameLength-1, ,hamming(frameLength)));' n7 j* V8 _( b1 o% v
1 ~+ Z% Z0 A+ c
x_fft=bsxfun(@times, d(1:N/2+1, ,H);
8 N! b0 |- I! D6 u @) t# K) ?# y5 Q9 h5 r8 y- ]& O c* h
% phase transformed8 p: g9 w) ^% T0 }+ |3 {& x% J
%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
" G/ p/ r' n- C2 t$ z3 J7 V yf = sum(x_fft,2);& L8 |6 |( M7 ?8 K
Cf = [yf;conj(flipud(yf(2:N/2)))];
5 s$ b: g6 k$ y8 E5 T
- S* F- `% q; b. v' l. j % 恢复延时累加的信号
* K3 H; @$ ?- w# B yds(i:i+frameLength-1) = yds(i:i+frameLength-1)+(ifft(Cf));
# ?* A) y# V5 Y) C6 s# \" @4 t( k, N! Z6 f4 U( n
% 恢复各路对齐后的信号3 }; J8 I* B6 _4 m
xf = [x_fft;conj(flipud(x_fft(2:N/2, ))];2 \0 c1 T* }0 L7 z
x1(i:i+frameLength-1, = x1(i:i+frameLength-1, +(ifft(xf));. O3 R# ]% o, j* ?0 _, c
end
i( `/ Y2 @; k. N# q* s6 bDS = yds/Nele; # ^" H* f6 j) j0 ?& C+ t* \
6 _$ ]5 v2 }/ B7 A: U" G$ Y# [ C
end
; O3 r' \7 [: H- p( h+ e然后遍历各个角度重复调用这个函数,测试实际录音数据,代码如下
, A9 _+ B. M7 |2 E; M
& @ Q8 L2 c; m%% SRP Estimate of Direction of Arrival at Microphone Array. _7 m7 k) }6 i7 h8 N8 r- b
% Frequency-domain delay-and-sum test
+ p2 o4 G( l1 @, U/ k R%
9 N& L1 M& F1 J7 Q& X+ S! W2 u%%9 X( U7 H0 o5 K) ]( y2 q
+ m( A) e* F( A) z" q/ A7 [% x = filter(Num,1,x0);2 N6 j8 N9 t0 W$ z5 ]% N
c = 340.0;
7 O0 C0 ^, M0 J' u2 E" Y, K( @ V+ v" { u* l- u0 O# `
% XMOS circular microphone array radius
$ r! }# x8 u1 D% c2 Td = 0.0420;
& l/ E; @0 c% P. t5 m `( {% more test audio file in ../../TestAudio/ folder5 p5 J' X6 b4 J
path = '../../TestAudio/XMOS/room_mic5-2/';
: K$ I8 ~7 l. M# @3 I! ~2 D[s1,fs] = audioread([path,'音轨-2.wav']);
, k7 e6 i J( i8 Q7 Ns2 = audioread([path,'音轨-3.wav']);
3 Y& }: Z8 y6 }. T, Y% l, ?: Bs3 = audioread([path,'音轨-4.wav']);
3 H/ f3 h+ z- s* w/ r/ k |8 d) x! ss4 = audioread([path,'音轨-5.wav']);( L: g, v) _% m+ M8 r
s5 = audioread([path,'音轨-6.wav']);2 l0 m% f- y% y3 k7 M, ~$ p
s6 = audioread([path,'音轨-7.wav']);
0 C6 P0 Q# f- t! c" v/ ~signal = [s1,s2,s3,s4,s5,s6];
& t& _8 i0 N$ H! X- o) Y `5 iM = size(signal,2);& j8 A% ^ d/ _$ v# a8 S2 N2 i
%%/ a1 D+ q8 `* h4 `
t = 0;6 a8 K0 h3 w( p9 c2 P1 ~0 C
^7 e0 h y( F9 @4 l0 c& c* ~
% minimal searching grid
P4 e8 w2 e( T8 w) \. l3 Ystep = 1;
6 x% Y( c/ x' s3 x T! b
A( n5 s2 O4 x6 ~ \ d+ SP = zeros(1,length(0:step:360-step));
$ C5 W' p3 |* I I M1 T, atic$ _$ X- Z3 x3 l/ ~; t" B# k3 t' k
h = waitbar(0,'Please wait...');7 V; ]% o0 \6 Q1 D. K$ l
for i = 0:step:360-step$ e+ W* J& O7 U/ T
% Delay-and-sum beamforming
) T* c* U4 a ^" `! V/ C [ DS, x1] = DelaySumURA(signal,fs,512,512,256,d,i/180*pi);# X/ q' R- J* T# ^: k
t = t+1;
! D1 D8 e7 c3 O %beamformed output energy9 n0 U- C* ]6 o7 \1 w. `
P(t) = DS'*DS;- P" ]" O) O$ s6 M( u8 i& X3 T
waitbar(i / length(step:360-step))0 t- k8 c" _+ v% p4 u# ~0 b- F
end6 C9 E" u$ C" y1 @
toc
( a a2 E6 M# A6 A; eclose(h)
( R T* ]8 H+ i* @ l[m,index] = max(P);$ p/ V: U i% ?# U2 B" g, z
figure,plot(0:step:360-step,P/max(P))
- W3 }6 |/ i# m5 O% d4 a# J Nang = (index)*step
/ J3 e+ l$ |# o' t7 h/ Z5 t" m6 c, z7 y# o; a% v0 i
程序中用的是圆阵,可以进行二维方向角扫描,不过这里为了简便就固定了俯仰角,只扫描方位角,结果如下
( \# K9 n7 N0 S' E- L) D- C5 I 8 B$ b; c% H0 k4 [
结果与预期相同 PHAT加权 与GCC-PHAT方法相同,这里也可以对幅度做归一化,只保留相位信息,使得到的峰值更明显,提高在噪声及混响环境下的性能 ) [3 S H6 x8 \7 S v' {3 W
上面代码中加上这一句 , @: n: u h3 H% v
%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));! E0 U! p$ _, M+ s* e- h8 K
测试同样的文件,结果如下 3 f1 e0 U3 L2 G% _4 x0 K9 i; @. K
![]()
! \) k% K2 ^( i# ~( d对比可以看到,PHAT加权的方法性能更好
) @+ ^8 r) ^& u! \ l ?" F5 m参考
9 L' T, r: S# b1.《SRP-PHAT-A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays》
) C- g5 J1 B8 W s, j2. 《传感器阵列波束优化设计与应用》, C' r8 |* ~/ q* T
———————————————— K1 D2 ]4 \. i- [$ D" I+ U3 G
版权声明:本文为CSDN博主「373955482」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
" T4 M" `' R3 k4 s3 Y" t原文链接:https://blog.csdn.net/u010592995/article/details/81586504
/ L+ Q5 g+ X3 P8 W& F6 o2 m2 I! ^7 U+ _ c# s
: [ t, o- F& Z# E0 E$ Q% c) m7 a; P* u4 k! B+ H4 P
! R; j' Q$ x9 Y8 x( T; y
4 c) L+ h! w* Z, i& } |