DOA; X1 x5 i' {/ B6 p' P/ O, o
声源定位方法一般可分为三类,一种是基于TDOA的两步算法(two-stage algorithm),一种是基于空间谱估计如MUSIC等,还有就是基于beamforming的方法,也就是这里要介绍的可控波束响应(steered-response power),
5 w7 N- O+ q3 ]( v+ K( M# C7 K- ^$ n; i$ r
steered-response power
; A& T& g/ x: l1 H. b4 F 可控波束响应是利用波束形成(beamforming)的方法,对空间不同方向的声音进行增强,得到声音信号最强的方向就被认为是声源的方向。 3 X; R5 }2 P. }& d; @
上一篇中简单介绍了麦克风阵列的背景知识,最简单的SRP就是利用延时-累加(delay-and-sum)的方法,寻找输出能量最大的方向。
5 T3 d7 Q2 ]$ Q' R5 z$ B 其中,语音信号为宽带信号,因此需要做宽带波束形成,这里我们在频域实现/ i) v2 m- J4 c. W
^$ w7 {. j( I5 T" X
频域宽带波束形成1 p( g& x9 \0 A5 J9 H& ^
频域宽带波束形成可以归类为DFT波束形成器,结构如下图 6 V$ N# G [4 s. C
( J- ?- i. D8 b1 i
6 K) s3 _8 v& D# E) O1 W& g
频域处理也可以看做是子带处理(subband),DFT和IDFT的系数分别对应子带处理中的分析综合滤波器组,关于这一种解释,可参考《传感器阵列波束优化设计与应用》第六章。 频域宽带延时累加波束形成的基本过程就是信号分帧加窗->DFT->各频点相位补偿->IDFT a( \' P; B& L* }, [8 k2 ~
代码实现如下 / ?% c1 p; ~* s6 O
nction [ DS, x1] = DelaySumURA( x,fs,N,frameLength,inc,r,angle)
! O, C+ S: S! R5 Z4 ^2 g% v%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
! ^6 y) L: {0 t4 v%frequency-domain delay-sum beamformer using circular array; \: ?1 `1 R* { k: d
% # Z3 l2 @# K2 d0 Y, O: E# |
% input :
) k' Q! e4 \! }% x : input signal ,samples * channel& V& W0 M, Y6 c# e
% fs: sample rate
t% g6 B. V6 R, [* g9 @4 }3 R% N : fft length,frequency bin number9 q% D: A& u @& H, K* y& N2 k8 l+ p" |
%frameLength : frame length,usually same as N
; {; P7 I( o- Q: I! u% inc : step increment
4 _: \" D) z% k: K/ L" d& }, t: l% r : array element radius
8 n4 G, w( O0 L' w0 r% }$ y% angle : incident angle" p+ V- P# [# Q
%. Q/ G" u, C1 d$ ?# q8 ?
% output :
0 v+ ]0 H3 S; @1 B6 }% DS : delay-sum output: c( E* ]- X; w2 Q8 n/ V
% x1 : presteered signal,same size as x
4 s- ~$ b7 h+ a q. g%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
: R u+ \$ o$ X, A: ~; u, `3 x/ |0 y5 n3 O8 s$ Q1 a5 ?0 }) m
c = 340;
8 z- x4 X' G+ V+ D3 Q- hNele = size(x,2);# c* h! R+ J$ D9 r% N, O
omega = zeros(frameLength,1);
# G! Z0 C3 Y, vH = ones(N/2+1,Nele);, b( M% E2 B5 ^$ n5 A+ x5 m0 i
# _2 H* a' R3 J4 K; U
theta = 90*pi/180; %固定一个俯仰角
: H6 s9 o& p/ ^3 }% wgamma = [30 90 150 210 270 330]*pi/180;%麦克风位置
a* r7 }$ G' H# {& B1 itao = r*sin(theta)*cos(angle(1)-gamma)/c; %方位角 0 < angle <360$ w. E5 y3 \& y" y
yds = zeros(length(x(:,1)),1);& [" J+ G+ R" c& L A l* ]
x1 = zeros(size(x));' q3 D! ?9 ~8 @; p+ M, u
/ J# O: q. T* U( x% frequency bin weights
) l* _) r9 [# K5 w% for k = 2:1:N/2+1
% u2 q, B* V. @& M$ c/ e9 P0 }" Ifor k = 1:1:5000*N/fs1 J& a2 h E7 T" W
omega(k) = 2*pi*(k-1)*fs/N; : l3 X. z. I" A; N+ X- `
% steering vector O5 v* ~0 N7 X2 b$ j5 u, R, f
H(k, = exp(-1j*omega(k)*tao); r% _ }0 c6 l- a) }! m
end
/ A8 \/ w8 Z/ l4 x [9 F! L9 k0 t4 `8 a6 W9 O$ F2 A- F, h
for i = 1:inc:length(x(:,1))-frameLength
( A- |& u$ }0 M3 j o0 o! r! \- _5 E) I: {' o2 P2 @ @: ^" i
d = fft(bsxfun(@times, x(i:i+frameLength-1, ,hamming(frameLength)));
; x+ ^7 k' D% x/ N2 Y& D+ U% j& @4 E( M8 B: d2 q7 F" V( v
x_fft=bsxfun(@times, d(1:N/2+1, ,H);
( N, s1 |; }$ X$ D" x# U9 L$ r5 Q: L
% phase transformed
; J6 H5 a3 b1 r %x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
9 V1 B# P. `, H8 e# F' O' P yf = sum(x_fft,2);
, p. M( w2 }) m4 c0 Z0 b/ K Cf = [yf;conj(flipud(yf(2:N/2)))];
: ?. u7 ?( ]& b: y1 V/ O' x. W3 d# r+ L
% 恢复延时累加的信号) A2 q0 v* ]# b
yds(i:i+frameLength-1) = yds(i:i+frameLength-1)+(ifft(Cf));& K* }5 v4 }! @' v$ S3 D5 {
' L, V- C( U/ Q0 | % 恢复各路对齐后的信号+ h6 h0 C5 Y( D! G
xf = [x_fft;conj(flipud(x_fft(2:N/2, ))];/ Y1 h% F* B: }' I4 J6 Y7 W
x1(i:i+frameLength-1, = x1(i:i+frameLength-1, +(ifft(xf));
/ X. t9 B: Q1 t, d8 e. qend% s3 Z! Z$ _9 A- k7 A7 l: ?; S2 _
DS = yds/Nele;
. X: {4 d. O/ z5 P/ e- k8 c' N2 l5 i5 G1 M5 \3 i$ {* j
end+ E* h: ^7 C7 I) O' B1 E
然后遍历各个角度重复调用这个函数,测试实际录音数据,代码如下
) |7 r& R9 M& C8 |7 g" {3 I9 p: A
% C, C/ Q% u6 X I" c# B2 N; X%% SRP Estimate of Direction of Arrival at Microphone Array7 n$ [3 c* ?- Q. }4 r9 Y
% Frequency-domain delay-and-sum test
! m- R9 ~8 _ n7 [6 }5 W% I$ O8 G7 D+ B- y& c {' p
%%" O1 \( N3 U& I+ P) _2 c1 `% C* c
. S" u9 {1 @$ P* {- x _8 @: F% x = filter(Num,1,x0); U0 L6 x( c c, j; Z& O( ]
c = 340.0;: q6 L% D) a7 G( _
3 f1 m! l: s# w) C5 v3 ?5 P; K. `% XMOS circular microphone array radius
$ L" Q& d. `" M, d$ H& Ud = 0.0420;3 o! C$ @, o4 ~4 K% l
% more test audio file in ../../TestAudio/ folder
# z- {$ m& P' x! a( Zpath = '../../TestAudio/XMOS/room_mic5-2/';
% {' v9 X J5 ]5 B U9 c) ^" J[s1,fs] = audioread([path,'音轨-2.wav']);
" y; ^4 o: l5 _0 _s2 = audioread([path,'音轨-3.wav']);
$ d, Y2 K. m7 H2 f0 B5 Y5 q Cs3 = audioread([path,'音轨-4.wav']);
z8 ^6 d" E) U# v, o9 ]s4 = audioread([path,'音轨-5.wav']);
8 [& B. k! k; M# X; `s5 = audioread([path,'音轨-6.wav']);# V2 c( }+ Q7 X2 W
s6 = audioread([path,'音轨-7.wav']);4 A3 a( z+ o( G [
signal = [s1,s2,s3,s4,s5,s6];
$ O! O2 c5 _' `! u7 A2 ~M = size(signal,2);
* h( L. R s( r1 M%%( l( |% U& V1 \3 ?4 z C8 A
t = 0;% \4 N8 ~) ]9 t
; h% _ D# Y) J# v4 g6 k; v
% minimal searching grid
" l/ O U, R6 t8 \step = 1;
l6 Y4 `# Z A
; y; M1 T" A- k+ Q& wP = zeros(1,length(0:step:360-step));
$ o+ @: N, r6 X2 |tic
/ k: U# h% Q3 x# g5 Z S' \4 L) Oh = waitbar(0,'Please wait...');
* v! p; L }1 F7 v! S M; Pfor i = 0:step:360-step
7 E2 u+ Y2 k0 U$ ]- r p % Delay-and-sum beamforming( Z4 }! |2 f* Q1 A$ ^ [& Y
[ DS, x1] = DelaySumURA(signal,fs,512,512,256,d,i/180*pi);9 Y* b! o" }9 \/ h
t = t+1;% ]+ z7 @: z$ w
%beamformed output energy* F) Z& B9 |- l- h! ]3 V- I. n
P(t) = DS'*DS;
' i; d' D! G; Y: U waitbar(i / length(step:360-step))( F) i. r* i1 D* O3 O0 |; N" n0 {
end9 D N$ t2 f5 J$ J+ B
toc" N! P% ?3 Z9 ^) h# R- ]% H
close(h) / g$ R1 X( B5 x% _# d
[m,index] = max(P);
# T+ ~/ u2 q# M4 c+ `& f7 Ofigure,plot(0:step:360-step,P/max(P))
8 U' x6 y% v$ o3 {: y* z1 |+ Mang = (index)*step! D1 o& ?; B6 {6 A4 Z# ]3 V
! g& @( l, L I7 Q) E5 Z* _
程序中用的是圆阵,可以进行二维方向角扫描,不过这里为了简便就固定了俯仰角,只扫描方位角,结果如下/ C/ P5 ^, i2 m2 j2 H0 z4 e$ d; V; c
* |/ A1 w5 M& U8 E% Y M
结果与预期相同 PHAT加权 与GCC-PHAT方法相同,这里也可以对幅度做归一化,只保留相位信息,使得到的峰值更明显,提高在噪声及混响环境下的性能
( _1 }. {7 e% {$ a( }# S6 a* o 上面代码中加上这一句 5 E' c; O3 _3 ~1 ~& S6 X
%x_fft = bsxfun(@rdivide, x_fft,abs(d(1:N/2+1, ));
' j1 _/ E( R4 ?测试同样的文件,结果如下 , |; S1 y6 w, `% @
% n) [3 \8 h p8 `
对比可以看到,PHAT加权的方法性能更好$ [2 F$ i2 I$ I7 ]4 X6 j: _; k
参考( t. E+ D1 h l3 k; A7 G
1.《SRP-PHAT-A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays》 6 g; n* Z+ P( B+ X' D7 H% F
2. 《传感器阵列波束优化设计与应用》
! p0 x7 c o; i) K, p8 ]————————————————. S. _' O: W' F! O9 |
版权声明:本文为CSDN博主「373955482」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
$ l' ?+ S0 r8 p7 }! f原文链接:https://blog.csdn.net/u010592995/article/details/81586504
: h* _: a+ K2 I" v# Y
& t: _1 J7 E' `$ F* Y3 `
2 f/ p8 |; _' }5 X9 T/ t
1 a9 z5 \" d1 o+ L- ]
8 J6 G$ r! X5 P' @
: e3 Q, N( v6 H$ ]6 e9 I* N |