This article is part of the series Multisensor Processing for Signal Extraction and Applications.

Open Access Research Article

Blind Separation of Acoustic Signals Combining SIMO-Model-Based Independent Component Analysis and Binary Masking

Yoshimitsu Mori1*, Hiroshi Saruwatari1, Tomoya Takatani1, Satoshi Ukai1, Kiyohiro Shikano1, Takashi Hiekata2, Youhei Ikeda2, Hiroshi Hashimoto2 and Takashi Morita2

Author Affiliations

1 Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma 630-0192, Japan

2 Kobe Steel, Ltd., Kobe 651-2271, Japan


EURASIP Journal on Advances in Signal Processing 2006, 2006:034970  doi:10.1155/ASP/2006/34970


The electronic version of this article is the complete one and can be found online at: http://asp.eurasipjournals.com/content/2006/1/034970


Received: 1 January 2006
Revisions received: 22 June 2006
Accepted: 22 June 2006
Published: 12 September 2006

© 2006 Mori et al.

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

A new two-stage blind source separation (BSS) method for convolutive mixtures of speech is proposed, in which single-input multiple-output (SIMO)-model-based independent component analysis (ICA) is combined with a new SIMO-model-based binary masking. SIMO-model-based ICA separates the mixed signals not into monaural source signals but into SIMO-model-based signals, that is, the signals from each independent source as they are observed at the microphones. The separated signals of SIMO-model-based ICA can therefore maintain the spatial qualities of each sound source. Owing to this property, the novel SIMO-model-based binary masking can be applied to efficiently remove the residual interference components that remain after SIMO-model-based ICA. Experimental results reveal that the proposed method considerably improves separation performance compared with conventional BSS methods. In addition, a real-time implementation of the proposed BSS is illustrated.
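The role of the masking stage can be illustrated with a minimal sketch. It assumes that two separated outputs are already available as complex time-frequency spectrograms of equal shape, and it uses a generic magnitude-comparison rule: a bin is kept for one output only where that output is louder than the other. The function name `binary_mask_pair` and the toy values are hypothetical, and the rule is not necessarily the exact masking criterion used in the article.

```python
import numpy as np

def binary_mask_pair(y_target, y_interf):
    """Keep each time-frequency bin of y_target only where its magnitude
    exceeds that of y_interf; zero it elsewhere (generic binary mask)."""
    mask = np.abs(y_target) > np.abs(y_interf)
    return np.where(mask, y_target, 0.0), mask

# Toy complex spectrograms (frequency bins x frames); values are illustrative only.
Y1 = np.array([[3 + 0j, 0.5 + 0j],
               [1 + 1j, 0.1 + 0j]])
Y2 = np.array([[1 + 0j, 2 + 0j],
               [0.2 + 0j, 0.1 + 0.5j]])

S1, M1 = binary_mask_pair(Y1, Y2)  # bins assigned to the first output
S2, M2 = binary_mask_pair(Y2, Y1)  # bins assigned to the second output
```

With a strict inequality, no bin is assigned to both outputs, so residual interference in a bin dominated by the other source is removed entirely rather than attenuated.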
