Instantaneous frequency features for noise robust speech recognition
S. Nayak, D.B. Shashank, S. Bhati, K. Bramhendra,
Published in Institute of Electrical and Electronics Engineers Inc.
2019
Abstract
The analytic phase of the speech signal plays an important role in human speech perception, especially in the presence of noise. Phase information is generally ignored in most recent speech recognition systems. In this paper, we illustrate the importance of the analytic phase of the speech signal for noise-robust automatic speech recognition. To avoid the phase-wrapping problem involved in computing the analytic phase, features are extracted from the instantaneous frequency (IF), which is the time derivative of the analytic phase. Deep neural network (DNN) based acoustic models are trained on clean speech using features extracted from the IF of speech signals. The robustness of IF features in combination with mel-frequency cepstral coefficients (MFCCs) was evaluated in varied noisy conditions. System combination of IF features with MFCCs using minimum Bayes risk decoding delivered absolute improvements of up to 13% over MFCC features alone for DNN-based systems under noisy conditions. The impact of combining magnitude-based and phase-based features on different phonetic classes was studied under noisy conditions, and the combination was found to model both voiced and unvoiced phonetic classes efficiently. © 2019 IEEE.
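As a rough illustration of why the instantaneous frequency sidesteps phase wrapping, the sketch below estimates the IF of a discrete signal directly from the analytic signal. It is not the feature extraction used in the paper, whose details are not given in the abstract. It assumes the standard FFT-based construction of the analytic signal and a simple one-sample phase difference; the function names, the 16 kHz sampling rate, and the 440 Hz test tone are all hypothetical choices made for this example.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real 1-D array via the FFT.

    Negative-frequency bins are zeroed and positive-frequency bins are
    doubled, which is the usual discrete Hilbert-transform construction.
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0  # keep the DC bin unchanged
    if n % 2 == 0:
        weights[n // 2] = 1.0  # keep the Nyquist bin unchanged
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def instantaneous_frequency(x, fs):
    """Estimate the instantaneous frequency in Hz (illustrative only).

    The phase increment between consecutive samples is taken as the angle
    of z[n+1] * conj(z[n]). That increment lies in (-pi, pi], so the
    absolute phase never has to be computed or unwrapped.
    """
    z = analytic_signal(x)
    phase_step = np.angle(z[1:] * np.conj(z[:-1]))
    return phase_step * fs / (2.0 * np.pi)


# Hypothetical test input: one second of a 440 Hz tone sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2.0 * np.pi * 440.0 * t)
if_hz = instantaneous_frequency(tone, fs)
print(float(np.median(if_hz)))
```

For a clean tone the estimate should sit close to the tone frequency. For real speech, an estimate like this would normally be computed per frequency band and per frame before being turned into features, and that step is outside the scope of this sketch.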
About the journal
Journal: 25th National Conference on Communications, NCC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.