PCA-Based Speech Enhancement for Distorted Speech Recognition

doi:10.4304/jmm.2.5.13-18

Journal of Multimedia, Vol 2, No 5 (2007), 13-18, Sep 2007

Tetsuya Takiguchi, Yasuo Ariki

Abstract

We investigate a robust speech feature extraction method that uses kernel PCA (Principal Component Analysis) for distorted speech recognition. Kernel PCA has been proposed for various image processing tasks that require an image model, such as denoising, where a noise-free image is constructed from a noisy input image. Much research has addressed robust speech feature extraction, but it remains difficult to completely remove additive or convolutional noise (distortion). The most commonly used noise-removal techniques operate in the spectral domain; for speech recognition, the MFCC (Mel Frequency Cepstral Coefficient) is then computed by applying the DCT (Discrete Cosine Transform) to the mel-scale filter bank output. This paper describes a new PCA-based speech enhancement algorithm that uses kernel PCA instead of the DCT, so that the main speech element is projected onto low-order features while the noise or distortion element is projected onto high-order features. Its effectiveness is confirmed by word recognition experiments on distorted speech.
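The idea of replacing the DCT with kernel PCA can be illustrated with a minimal sketch. It assumes an RBF kernel, a kernel width `gamma`, 12 retained components, and random synthetic data standing in for log mel-scale filter bank frames; none of these choices are stated in the abstract, and the function names are hypothetical. The sketch fits kernel PCA on training frames and keeps only the low-order projections as features.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    # Squared Euclidean distance between every row of a and every row of b.
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_pca(train, n_components, gamma):
    # Kernel (Gram) matrix of the training frames, centered in feature space.
    n = train.shape[0]
    k = rbf_kernel(train, train, gamma)
    one_n = np.full((n, n), 1.0 / n)
    k_centered = k - one_n @ k - k @ one_n + one_n @ k @ one_n
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(k_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale so that the principal axes have unit norm in feature space.
    alphas = eigvecs / np.sqrt(np.maximum(eigvals, 1e-12))
    return {"train": train, "k": k, "alphas": alphas, "gamma": gamma}

def project(model, frames):
    # Kernel between new frames and training frames, centered consistently
    # with the training kernel, then projected onto the retained components.
    k = model["k"]
    k_new = rbf_kernel(frames, model["train"], model["gamma"])
    k_new_c = (k_new
               - k_new.mean(axis=1, keepdims=True)
               - k.mean(axis=0)[None, :]
               + k.mean())
    return k_new_c @ model["alphas"]

# Synthetic stand-in for log mel filter bank output: 200 frames, 24 channels.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 24))
model = fit_kernel_pca(train, n_components=12, gamma=1.0 / 24)

# Low-order kernel PCA features, used where MFCCs would use DCT coefficients.
features = project(model, train)
```

Here the component count plays the role that DCT truncation plays in MFCC computation: components beyond `n_components`, where the abstract expects noise or distortion to concentrate, are simply discarded. In practice the model would be fitted on clean speech frames and applied to distorted test frames.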

Keywords

kernel PCA, distorted speech, feature extraction, speech enhancement

Journal of Multimedia (JMM, ISSN 1796-2048)
