IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2296-2309, 2021 (SCI-Expanded)
© 2014 IEEE. Recording multiple sound sources in a reverberant environment results in convolutive mixtures. Sound sources can be extracted from microphone array recordings of such mixtures using acoustic source separation techniques. Acoustic source separation using recordings obtained from rigid spherical microphone arrays (RSMAs) benefits from the representation of sound fields as series of spherical harmonics. More specifically, RSMAs afford increased flexibility in acoustic beamforming and spatial filtering. We propose a data-driven direction-of-arrival (DOA) estimation and acoustic source separation method based on a dictionary-based sparse decomposition of sound fields. The proposed method identifies two sets of time-frequency bins: those with contributions from a single source only, and those containing sensor noise or diffuse sound field components. The former set is used for DOA estimation and beamforming in the sparse decomposition domain. The latter set is used to calculate the diffuse field covariance matrix, which is then used in Wiener post-filtering to further improve source separation performance. We demonstrate the utility of the proposed method via extensive objective and subjective evaluations.
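The Wiener post-filtering step can be illustrated with a minimal sketch for a single time-frequency bin. This is a generic textbook post-filter applied at a beamformer output, not necessarily the formulation used in the paper. The function names `quad_form` and `wiener_postfilter_gain`, the `floor` parameter, and the toy covariance matrices are all assumptions introduced for illustration.

```python
def quad_form(w, phi):
    """Return the real part of w^H phi w for weights w and a square matrix phi."""
    m = len(w)
    total = 0j
    for i in range(m):
        for j in range(m):
            total += complex(w[i]).conjugate() * phi[i][j] * w[j]
    return total.real


def wiener_postfilter_gain(w, phi_x, phi_d, floor=0.0):
    """Generic Wiener gain at a beamformer output for one time-frequency bin.

    w     : beamformer weights (hypothetical, length M)
    phi_x : M x M covariance matrix of the observed mixture
    phi_d : M x M covariance matrix of the diffuse field / noise
    floor : lower bound on the gain (an assumption, limits over-suppression)
    """
    out_power = quad_form(w, phi_x)    # total power at the beamformer output
    noise_power = quad_form(w, phi_d)  # residual diffuse power at the output
    if out_power <= 0.0:
        return floor
    gain = 1.0 - noise_power / out_power
    return min(1.0, max(floor, gain))


# Toy example (assumed values): 4 channels, unit-power source arriving
# in phase on all channels, spatially white noise of power 0.1.
M = 4
w = [1.0 / M] * M  # delay-and-sum weights for the in-phase source
phi_d = [[0.1 if i == j else 0.0 for j in range(M)] for i in range(M)]
phi_x = [[1.0 + phi_d[i][j] for j in range(M)] for i in range(M)]

gain = wiener_postfilter_gain(w, phi_x, phi_d)
noise_only_gain = wiener_postfilter_gain(w, phi_d, phi_d)
```

In this toy case the gain is close to 1 (about 0.976) for the bin dominated by the source and exactly 0 for a bin containing only the diffuse component, which is the behavior a post-filter relies on to suppress residual diffuse energy after beamforming.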