DOA
+ W0 n; y' F8 {6 @8 D 声源定位方法一般可分为三类,一种是基于TDOA的两步算法(two-stage algorithm),一种是基于空间谱估计如MUSIC等,还有就是基于beamforming的方法,也就是这里要介绍的可控波束响应(steered-response power),( Y* }7 p& _5 x% i4 ` u# H
/ } a2 q; Q z. c! T
steered-response power; ~$ \: S8 s# c$ S) ^8 f
可控波束响应是利用波束形成(beamforming)的方法,对空间不同方向的声音进行增强,得到声音信号最强的方向就被认为是声源的方向。 ) q, H0 n: V# h+ s4 n: Q
上一篇中简单介绍了麦克风阵列的背景知识,最简单的SRP就是利用延时-累加(delay-and-sum)的方法,寻找输出能量最大的方向。
' f1 o" {. \2 l) T; |( C 其中,语音信号为宽带信号,因此需要做宽带波束形成,这里我们在频域实现; R0 [$ f$ r4 n }) ]" ?
0 J f9 w& U% o8 s8 S, B/ v- r
频域宽带波束形成( C1 N" x; g9 k+ y1 [7 a, L+ d
频域宽带波束形成可以归类为DFT波束形成器,结构如下图
, D" X" c+ T S, W. M 3 g6 A2 b+ E) _0 t0 n( @& P
1 Y0 l) A1 s/ n" K1 t+ S. h频域处理也可以看做是子带处理(subband),DFT和IDFT的系数分别对应子带处理中的分析综合滤波器组,关于这一种解释,可参考《传感器阵列波束优化设计与应用》第六章。 频域宽带延时累加波束形成的基本过程就是信号分帧加窗->DFT->各频点相位补偿->IDFT
4 M9 S+ w- |+ E* V. j) C9 p代码实现如下 [% F" n$ Z- ^! y5 L# d; b* b
nction [ DS, x1] = DelaySumURA( x,fs,N,frameLength,inc,r,angle)
5 R* N5 i$ \4 q/ w b+ r2 D" g%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%& ?3 M6 d1 c& o" R
%frequency-domain delay-sum beamformer using circular array5 K( L# L5 `. `% z5 U
%
) B% ]' r3 |# T6 `3 Z8 X9 p% input :
4 X, b. T$ F/ a3 J) {; C6 h r% x : input signal ,samples * channel
6 m9 A0 H2 z% @& A% fs: sample rate
; D# C; l' E! B% N : fft length,frequency bin number' l! m% n% L O- a
%frameLength : frame length,usually same as N! B/ D. A; A* G2 x# `" Q
% inc : step increment
$ g( V( j; [$ c" n! m% r : array element radius
& j1 e% ^: J! }% angle : incident angle
* c8 r/ ]) f: H+ x0 z% q# V& F%
; y' u/ G7 ]+ T3 C6 v0 Q# ?- m9 u% output :) [5 L ?- ~ f
% DS : delay-sum output7 D; Q0 j8 J$ t8 v4 i k7 E
% x1 : presteered signal,same size as x- {, t T9 b: V2 ]$ l2 H6 X; p
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% T# f! b& `) |4 q7 }+ a& m4 d) o- l
c = 340;
! G9 @ y3 d, \( ?6 INele = size(x,2);
2 ?8 z8 K$ U% ^4 \: C- W( a7 Oomega = zeros(frameLength,1);, }2 f& y2 X- P$ e. H; `
H = ones(N/2+1,Nele);
0 x$ O: L7 u5 ?* o7 W2 B P8 k- {& M7 P* z; j6 A8 `4 ?
theta = 90*pi/180; %固定一个俯仰角
: I# ?- J& p5 I. Ugamma = [30 90 150 210 270 330]*pi/180;%麦克风位置# f4 q( G* \+ Q% L8 p I* m
tao = r*sin(theta)*cos(angle(1)-gamma)/c; %方位角 0 < angle <3606 y& L3 }, Y1 [; d. g/ D9 Y
yds = zeros(length(x(:,1)),1);
, N d S d1 F. t Hx1 = zeros(size(x));
0 X) @! @* m1 I4 O/ |( o2 @. T" k
% frequency bin weights" O( j2 Z" R% Y* R
% for k = 2:1:N/2+12 ?* U& H' K3 ]) `
for k = 1:1:5000*N/fs8 x, ^( h% b4 P: x
omega(k) = 2*pi*(k-1)*fs/N; ( n" o9 V, y9 {7 d2 p- q$ l
% steering vector
) u( v. N& j5 X0 Q H(k, = exp(-1j*omega(k)*tao);
2 N0 K( D6 I9 H8 W. a" B$ e7 fend
! L- K/ e" |: X( B0 M0 W8 @$ L9 m# `( T, p8 P! X7 n1 N+ M) a/ A
for i = 1:inc:length(x(:,1))-frameLength
( @+ l( U# u/ M8 H7 ?! a2 {6 x0 x: b1 t5 |( S$ b& D" u1 R% _
d = fft(bsxfun(@times, x(i:i+frameLength-1, ,hamming(frameLength)));
# z! i. X- C% z2 K/ R. g. |5 g! a3 U6 n: y( F
x_fft=bsxfun(@times, d(1:N/2+1, ,H);- O/ }, J, F2 ]. i& c& N
+ _1 g6 ~; T; r* D: x$ U' k
% phase transformed
* A. M, D @6 J( F" Q/ ]2 ?, G %x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));4 O- ^. v* r7 I
yf = sum(x_fft,2);6 P9 P/ ^( {7 p0 K5 x1 f- e, Z
Cf = [yf;conj(flipud(yf(2:N/2)))];7 ^, j$ V7 S. o, ]3 i
0 ~: ~( d. k: m( R; j# } % 恢复延时累加的信号
! _, k9 n$ o7 s1 p( Z6 k) o yds(i:i+frameLength-1) = yds(i:i+frameLength-1)+(ifft(Cf));, e1 @/ v. P* z2 R9 b5 M0 C4 a
; o+ e; n7 V1 M: Z
% 恢复各路对齐后的信号; {' C! C8 m8 f* Z' ^
xf = [x_fft;conj(flipud(x_fft(2:N/2, ))];
( n% G8 T; j( o8 M x1(i:i+frameLength-1, = x1(i:i+frameLength-1, +(ifft(xf));
/ q9 J' k1 R8 x1 y( rend
9 I, {1 p: V& y3 G6 L' A+ TDS = yds/Nele;
. ~5 J( e. \& M, f9 q/ ^- _+ R# i
! K% z5 u$ ~5 H/ u% Oend: ~# _& i; ]2 F" O4 L. J6 J4 F0 N3 s
然后遍历各个角度重复调用这个函数,测试实际录音数据,代码如下) f' h/ e% r, S" L, b
6 X4 u2 P- U0 g, m% Z
%% SRP Estimate of Direction of Arrival at Microphone Array
, h- D3 M% p6 t& M8 E) ^% Frequency-domain delay-and-sum test8 U E5 F" u/ m& F
% 2 y2 V7 C3 n5 F1 O, w7 X
%%: n- k H0 ]) P" D& B6 Y7 {
+ _! J6 ?6 S) l X% y% x = filter(Num,1,x0);
- y- e- m9 Q( i g4 U3 qc = 340.0;
1 W' f8 O( ^5 n# ~- T( x, ]6 ]# w3 b
, i3 f6 R3 J+ N7 y% XMOS circular microphone array radius
& b& z$ i: f% |% c# kd = 0.0420;. E# o, d: U3 d7 k" P
% more test audio file in ../../TestAudio/ folder
! i. f7 Y9 i/ j# spath = '../../TestAudio/XMOS/room_mic5-2/';
* P5 h0 U- p" z0 `- @3 B[s1,fs] = audioread([path,'音轨-2.wav']);
4 v& _2 C5 R" b( K! zs2 = audioread([path,'音轨-3.wav']);
" U6 w7 Q6 ~. ^" \9 }s3 = audioread([path,'音轨-4.wav']);
7 D9 X4 K& L$ ?- zs4 = audioread([path,'音轨-5.wav']);: X! l" Q" i% H# g" a
s5 = audioread([path,'音轨-6.wav']);
$ K8 a! ~" o, I8 Y; P3 ^1 C$ ms6 = audioread([path,'音轨-7.wav']);
6 C, p5 D) K- O0 g( s$ k1 \. Tsignal = [s1,s2,s3,s4,s5,s6];" C4 Z- o2 g* Y6 h, }4 Y
M = size(signal,2);8 X$ Z6 p5 ^ |4 _% `1 b
%%
6 S& r- l: b6 A( y) m2 D4 _/ Jt = 0;
7 @. o3 o9 O& C+ J+ n1 A7 u, \
& r! t7 j8 C/ L! _- _8 E/ D! m% minimal searching grid- m/ R# j* D9 r2 S6 I
step = 1;7 @! f2 p5 |1 v4 x8 |
- v$ u* t- _. n% V a( J
P = zeros(1,length(0:step:360-step));
6 \ r4 Q( Z; Ptic
- w# y$ A# `& ^0 ]9 sh = waitbar(0,'Please wait...');
9 V* K- Y8 I3 I. `. yfor i = 0:step:360-step% v3 X* i6 S# l; T% f% @
% Delay-and-sum beamforming- g* Y2 x7 V1 {; z9 Z) ]
[ DS, x1] = DelaySumURA(signal,fs,512,512,256,d,i/180*pi);
8 Y7 |1 s6 u# j) E( \ t = t+1;
/ S3 q+ m; a* S6 M! F7 x %beamformed output energy
% e3 z. h" ]0 E P(t) = DS'*DS;& }, \. A2 E" r
waitbar(i / length(step:360-step))
5 }4 x# b& d7 {: W2 zend
4 p* R& X, M7 A9 ~toc% L1 {9 X7 i6 W' e2 ?2 l7 \' p
close(h) 3 I5 {" F( N1 o& r! R
[m,index] = max(P);6 X* _: l6 T2 X/ s" V
figure,plot(0:step:360-step,P/max(P))" V' x' P7 P2 L9 O5 Z
ang = (index)*step
: a# e5 J. t. q6 m u0 R
3 @ \' t k- g/ _7 R" {$ U. C程序中用的是圆阵,可以进行二维方向角扫描,不过这里为了简便就固定了俯仰角,只扫描方位角,结果如下
* {2 N6 E4 e2 {: F# v # f0 z8 f( u- v, w
结果与预期相同 PHAT加权 与GCC-PHAT方法相同,这里也可以对幅度做归一化,只保留相位信息,使得到的峰值更明显,提高在噪声及混响环境下的性能 " p/ F: P* M2 D# `) S* u9 M
上面代码中加上这一句 $ `3 v0 ?$ ~. C; |
%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
/ b& v7 |! k7 F! _2 Q: s/ c测试同样的文件,结果如下
1 Y" H# P' {$ O# {7 d " h( y, T' ]+ o0 Y% D
对比可以看到,PHAT加权的方法性能更好2 ]+ a" B" I1 f% K9 w
参考6 L9 q. z) ~9 B u4 A. [
1.《SRP-PHAT-A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays》
8 P1 A7 T3 k, _2 z( f2. 《传感器阵列波束优化设计与应用》
( R3 @% ?+ x2 S. o/ N1 @————————————————
6 y$ p! u8 |5 n2 f版权声明:本文为CSDN博主「373955482」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。4 ~- \. `: @% }
原文链接:https://blog.csdn.net/u010592995/article/details/81586504
* M! z7 H2 v$ {$ L& l1 Y# _- H+ L. y
+ \1 z( T8 B, o1 R, U
2 Z* Y* L5 d8 }3 E7 G) z
( M0 J$ ~+ ^* v a2 ^. ~
) o' F- E2 u- W4 D: ~ |