Representation learning for spoken term detection
P.R. Reddy, , B. Yegnanarayana
Published in World Scientific Publishing Co. Pte. Ltd.
Pages: 633 - 662
Spoken Term Detection (STD), the task of searching for a spoken user query in audio data, is important for managing and monitoring the growing volume of audio data on the internet. STD performance is affected by channel mismatch, speaker variability, and differences in speaking mode and rate. One of the main issues in STD is therefore to devise a robust, speaker-invariant representation of the speech signal, so that query and reference utterances can be matched in the new representation domain. In this chapter, the authors compare and contrast supervised and unsupervised approaches to learning robust, speech-specific representations for the STD task. Posterior representations of speech are used to develop the STD system, with posterior features extracted using both supervised and unsupervised approaches. © 2017 by World Scientific Publishing Co. Pte. Ltd.
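To illustrate what matching a query against a reference in a posterior representation domain can look like, here is a minimal Python sketch. The abstract does not state which matching algorithm the chapter uses. Subsequence dynamic time warping with a negative-log inner-product distance is assumed here because it is a common choice for posteriorgram matching. The function names and the array layout (frames by classes) are hypothetical.

```python
import numpy as np


def frame_distance(query, reference, eps=1e-10):
    """Pairwise -log inner product between posterior vectors.

    query:     (n_query_frames, n_classes), each row sums to 1
    reference: (n_ref_frames, n_classes), each row sums to 1
    Returns an (n_query_frames, n_ref_frames) distance matrix.
    """
    similarity = query @ reference.T
    return -np.log(np.clip(similarity, eps, None))


def subsequence_dtw(query, reference):
    """Find the best match of the whole query anywhere in the reference.

    Assumed matching scheme (not taken from the chapter): the query may
    start at any reference frame at no cost, and the match ends at the
    reference frame with the lowest accumulated cost.
    Returns (cost normalised by query length, end frame index).
    """
    dist = frame_distance(query, reference)
    n, m = dist.shape
    acc = np.full((n, m), np.inf)
    acc[0, :] = dist[0, :]  # free start anywhere in the reference
    for i in range(1, n):
        acc[i, 0] = acc[i - 1, 0] + dist[i, 0]
        for j in range(1, m):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = dist[i, j] + best_prev
    end = int(np.argmin(acc[-1]))
    return acc[-1, end] / n, end
```

A low normalised cost suggests that the query term occurs in the reference and ends near the returned frame. A detection threshold on that cost would have to be tuned on held-out data.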
About the journal
Journal: Pattern Recognition and Big Data
Publisher: World Scientific Publishing Co. Pte. Ltd.