We propose a new two-stage blind source separation (BSS) method for convolutive mixtures of speech that combines single-input multiple-output (SIMO)-model-based independent component analysis (ICA) with a new SIMO-model-based binary masking. SIMO-model-based ICA separates the mixed signals not into monaural source signals but into SIMO-model-based signals, that is, the signals from each independent source in the form in which they are observed at the microphones. The separated signals therefore retain the spatial qualities of each sound source. Owing to this property, the SIMO-model-based binary masking can be applied to efficiently remove the residual interference components that remain after SIMO-model-based ICA. Experimental results show that the proposed method considerably improves separation performance over conventional BSS methods. A real-time implementation of the proposed BSS method is also presented.
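As a rough illustration of the masking stage only, the following sketch shows generic time-frequency binary masking: each bin of a target signal is kept when its magnitude exceeds that of an interfering signal and is zeroed otherwise. It is not the SIMO-model-based masking proposed here, whose decision rule is not given in this abstract. The function names `binary_mask` and `apply_binary_mask` and the toy arrays `y1` and `y2` are assumptions introduced for the example.

```python
import numpy as np


def binary_mask(target_tf, interferer_tf):
    """Return a 0/1 mask that is 1 where the target magnitude dominates.

    Both inputs are complex time-frequency arrays of the same shape
    (frequency bins x frames). Ties are assigned to the interferer.
    """
    return (np.abs(target_tf) > np.abs(interferer_tf)).astype(float)


def apply_binary_mask(target_tf, interferer_tf):
    """Zero every bin of the target in which the interferer is at least as strong."""
    return binary_mask(target_tf, interferer_tf) * target_tf


# Toy data (hypothetical): 2 frequency bins x 3 frames.
y1 = np.array([[1.0 + 0j, 0.2, 0.9],
               [0.1, 0.8, 0.3]])
y2 = np.array([[0.3 + 0j, 0.7, 0.1],
               [0.6, 0.2, 0.3]])

masked = apply_binary_mask(y1, y2)
```

In practice the time-frequency arrays would come from a short-time Fourier transform of the signals being compared, and the masked result would be transformed back to the time domain.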