DOA
# \% Y% z9 j& h( [! N 声源定位方法一般可分为三类,一种是基于TDOA的两步算法(two-stage algorithm),一种是基于空间谱估计如MUSIC等,还有就是基于beamforming的方法,也就是这里要介绍的可控波束响应(steered-response power),6 W3 k' {( M; `& z
3 I1 ]' h/ Y: o+ k6 I( r6 Ksteered-response power. b3 |8 j) J" f! i8 p& P
可控波束响应是利用波束形成(beamforming)的方法,对空间不同方向的声音进行增强,得到声音信号最强的方向就被认为是声源的方向。 m) a) k, M0 p7 A/ E" U# E
上一篇中简单介绍了麦克风阵列的背景知识,最简单的SRP就是利用延时-累加(delay-and-sum)的方法,寻找输出能量最大的方向。
# ?2 Q! Q: Z" {: N 其中,语音信号为宽带信号,因此需要做宽带波束形成,这里我们在频域实现
# H, ]- f: Q" V. j2 @$ O& ^% N% [& v9 e" A) M. w+ w4 Q; l
频域宽带波束形成
2 t3 O& L' J7 t4 k- \7 c8 E' \( r$ d 频域宽带波束形成可以归类为DFT波束形成器,结构如下图
* Q4 x# \9 Q. |+ K, z8 v* @4 U6 y d1 d8 n
8 i) W2 ~! {$ S频域处理也可以看做是子带处理(subband),DFT和IDFT的系数分别对应子带处理中的分析综合滤波器组,关于这一种解释,可参考《传感器阵列波束优化设计与应用》第六章。 频域宽带延时累加波束形成的基本过程就是信号分帧加窗->DFT->各频点相位补偿->IDFT 6 j- Y, y8 A3 u
代码实现如下 6 m2 D8 P& E( q% a5 B4 c
nction [ DS, x1] = DelaySumURA( x,fs,N,frameLength,inc,r,angle)
9 L4 [. Y d/ S. o4 r%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%2 u5 J7 z7 b- P8 l' N" z' S# o
%frequency-domain delay-sum beamformer using circular array5 ~4 H6 m% a9 m1 C- |
% r8 a, B k' {$ c* f: A
% input :! F. c1 k8 E3 R/ F4 v/ ^
% x : input signal ,samples * channel, u) k, ?# [: V
% fs: sample rate( j9 W) U9 E" i$ B) ?
% N : fft length,frequency bin number# l8 @2 k7 b' @9 x' M
%frameLength : frame length,usually same as N- l2 \9 y6 Q' [3 M' d
% inc : step increment
+ Q" _4 E$ Y" k `1 K% r : array element radius
8 U) \ p* b8 C5 _1 S: P E% angle : incident angle
7 e: e* W5 k* _; I! I- N. l" f%
8 a/ W. F" W3 {6 E5 C' s% output :
: [( c( X9 J$ t2 u" h% DS : delay-sum output
' X7 e) R/ M m0 B% S# X% x1 : presteered signal,same size as x& l0 D, o# f) `- ?* u4 \) J
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- x" E' j0 |' I) p# v% q
' @1 c0 _: I& x, ~c = 340;
% T1 ]. F0 e* |& R( W; c" tNele = size(x,2);! ]0 E4 z0 l- E2 p/ l6 X
omega = zeros(frameLength,1);
& ]- i+ P* k6 P: V$ M. C" G' ?H = ones(N/2+1,Nele);1 N4 H. e) G* q" T: O1 w
' V9 ?# n, Z5 |, itheta = 90*pi/180; %固定一个俯仰角
- K1 |8 e6 Q F. o" z# x. n' _1 Agamma = [30 90 150 210 270 330]*pi/180;%麦克风位置+ i6 f9 E+ p( V9 R
tao = r*sin(theta)*cos(angle(1)-gamma)/c; %方位角 0 < angle <360
1 ?5 R) _) ]/ @* Ryds = zeros(length(x(:,1)),1);7 X; A8 [9 A: w2 T8 p8 J
x1 = zeros(size(x));
$ z' D/ L5 L& p" Y6 U+ X* X6 u. f1 k" ]+ W
% frequency bin weights- `- ^4 r6 x5 ~( R/ A s
% for k = 2:1:N/2+16 C) r. k4 v! P' j2 G' W. k# {/ V
for k = 1:1:5000*N/fs
- ?' D# Z/ v5 m4 X5 c! L. r+ B omega(k) = 2*pi*(k-1)*fs/N;
$ b8 G: r- R* c$ t0 o. M % steering vector: {2 O' X. e6 M# z/ u) R* U+ w+ f
H(k, = exp(-1j*omega(k)*tao);
+ Z* B1 x1 _5 Y- Aend. k- v; C+ [6 O/ u% I" i) C
- O9 ]9 |5 |3 A! @9 Y4 Wfor i = 1:inc:length(x(:,1))-frameLength" F, ?, N9 Q# L6 i
8 B% L' i% Y7 O C% g; s
d = fft(bsxfun(@times, x(i:i+frameLength-1,,hamming(frameLength)));8 }3 k4 R1 `0 R
T9 X+ n+ u; H( W
x_fft=bsxfun(@times, d(1:N/2+1,,H);
1 g, w1 D5 Z7 M) c, {7 C4 `4 H9 ?. S* K' O$ ^& D8 Q3 Q
% phase transformed/ F/ ?; a2 e+ A7 w! ]
%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1,)); G4 i6 J }6 T# f7 [' U
yf = sum(x_fft,2);
+ w! v3 `* ~# D# k$ m, E8 E, R6 O Cf = [yf;conj(flipud(yf(2:N/2)))];
1 M/ h8 s7 W% a- Q
4 {7 d. v' P! F" S3 r % 恢复延时累加的信号
0 e+ H% D0 q4 g( X3 q5 W, l$ u5 Q4 L yds(i:i+frameLength-1) = yds(i:i+frameLength-1)+(ifft(Cf));
6 ~1 y' X: C/ u- y* e6 ^* g2 Y. k9 f) `7 N- d- x. M1 d$ L
% 恢复各路对齐后的信号' O5 _8 H( A/ w' X0 [7 Q) _
xf = [x_fft;conj(flipud(x_fft(2:N/2,))];1 Y9 U3 j3 h$ Z( c6 v
x1(i:i+frameLength-1, = x1(i:i+frameLength-1,+(ifft(xf));* M$ [: r- S0 V2 z7 v+ Q. A
end
- H0 S$ B3 h/ d4 D' eDS = yds/Nele;
$ X7 e- W- @. }5 v6 H5 k/ {, W. V* m8 \: C D/ J/ U6 I/ y5 _
end
7 x& p: g- Z, o然后遍历各个角度重复调用这个函数,测试实际录音数据,代码如下
( Q6 Y+ Y( ]6 [& e* F" s/ r( d- t- T9 ~9 I; z8 _5 S' z5 r% [
%% SRP Estimate of Direction of Arrival at Microphone Array, D5 E0 I ^( m [: W3 i
% Frequency-domain delay-and-sum test. V+ [' |: d1 M7 Y4 J
%
& i! U F1 f. X! k8 x9 B. _- G%%: T' F, F- N, ?4 f, u" t
. [, Q+ [0 I; H& T. B% x = filter(Num,1,x0);1 e0 K6 O* P3 f3 n3 p2 E
c = 340.0;' n& Y6 v: D3 Y- t O
( C4 ?% E1 v: g) m0 @
% XMOS circular microphone array radius
! _- C4 P6 o% j$ K- b$ [; M# pd = 0.0420;- X" F5 }9 ]& \3 ^4 a* q; i
% more test audio file in ../../TestAudio/ folder: t' J. o- V6 W1 l' D/ ^5 F2 l
path = '../../TestAudio/XMOS/room_mic5-2/';
4 @- c3 |4 {2 _7 _* l[s1,fs] = audioread([path,'音轨-2.wav']);' B# U# c3 y: I1 \0 r
s2 = audioread([path,'音轨-3.wav']);
) w" C. M7 Q% ?8 C5 ss3 = audioread([path,'音轨-4.wav']);
6 C, R1 Z, S1 }s4 = audioread([path,'音轨-5.wav']);
2 b4 e; N5 }7 `/ d p/ U F$ m$ Ds5 = audioread([path,'音轨-6.wav']);
' |( o6 a% ^" a) y4 {: g( Ps6 = audioread([path,'音轨-7.wav']);% F7 i# b- E8 a' {* c2 s
signal = [s1,s2,s3,s4,s5,s6];" A3 _! F- ]. |% Q
M = size(signal,2);8 o6 W0 N+ A" o4 m: ^
%%
- C8 C, m5 O- U" ]" w0 {- Qt = 0;
5 \. d0 u) t6 a$ O E
3 G* I! \- Z1 e' _) W$ b0 F% minimal searching grid1 j' u3 |( S& f' W1 B x
step = 1;
m' t, I4 b( m9 H1 d9 Z6 H$ C) a/ O+ V, d0 V; f ~0 @
P = zeros(1,length(0:step:360-step));! h) Z% i0 Q! w
tic R! c# I! q& A3 F
h = waitbar(0,'Please wait...');
; N4 `: ~9 _( K: F+ B0 q" nfor i = 0:step:360-step
/ l6 @, J* B4 N! X % Delay-and-sum beamforming9 O1 v( a& T% I+ |
[ DS, x1] = DelaySumURA(signal,fs,512,512,256,d,i/180*pi);$ I$ k) }: t& }
t = t+1;6 [9 ~& `. f: Z6 n/ a
%beamformed output energy
, [. |; q( f: X# P7 [% ]% w P(t) = DS'*DS;0 p8 I; d" k7 A5 u
waitbar(i / length(step:360-step))$ u7 M' z0 \# d; v6 `
end8 P3 w8 L" P% H
toc
8 ~+ I" i. b/ Z2 s2 f; |close(h) # {; Y7 C$ q, b/ e r/ m
[m,index] = max(P);9 N( q, m, n; B% D8 I9 Y
figure,plot(0:step:360-step,P/max(P))
/ |$ U3 L' U8 eang = (index)*step
' O: p2 z }& Y4 P! S$ ?8 h1 ?- U6 {/ G; W9 V4 _. v# `
程序中用的是圆阵,可以进行二维方向角扫描,不过这里为了简便就固定了俯仰角,只扫描方位角,结果如下
7 ^: Q2 ?/ E' ~/ d1 S( i
/ E: w$ M+ V7 I; y3 i结果与预期相同 PHAT加权 与GCC-PHAT方法相同,这里也可以对幅度做归一化,只保留相位信息,使得到的峰值更明显,提高在噪声及混响环境下的性能 - ]0 I* g3 E0 k) r0 K% |
上面代码中加上这一句
" @5 f$ k+ F" R: a4 a%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1,));: D* c( A: L- l& ?) V) u! q
测试同样的文件,结果如下
3 p! y [' m7 ?; e2 X/ N8 z" u* y0 a* U% Z$ ~7 p
对比可以看到,PHAT加权的方法性能更好9 x% C% h0 a8 o# @" F1 J" @
参考$ @" V0 P" O+ {
1.《SRP-PHAT-A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays》 & q3 E* e/ ~3 ~
2. 《传感器阵列波束优化设计与应用》 P# P) K0 ~" o
————————————————1 A/ a2 r. q1 `# _& G! y H
版权声明:本文为CSDN博主「373955482」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
# d9 {& H! @ N, g! x1 l3 s1 M7 _9 Z原文链接:https://blog.csdn.net/u010592995/article/details/81586504
1 t; s/ j0 g8 ~; p- o! u( V
7 J+ O3 B: x! U% C
% ~2 D% j% P4 N& ~) R% @# e- @- d" ~6 o6 g
' ^, B4 k- L8 J+ n6 Y8 G
5 C1 h9 n( q0 }$ a9 F
|