{Interesting Reads} Improving Spoken Document Retrieval by Unsupervised Language Model Adaptation [contd.]

KMDI is made up of a community of interdisciplinary scholars. We engage with the community around us to advance the study and creation of knowledge media. Every so often we will share a link to an article that has been written in relation to Knowledge Media topics like Big Data, Knowledge Translation, Healthcare or Education systems.

The following piece is a conference paper by Robert Herms, Marc Ritter, Thomas Wilhelm-Stein and Maximilian Eibl, titled “Improving Spoken Document Retrieval by Unsupervised Language Model Adaptation Using Utterance-Based Web Search.”

Abstract: Information retrieval systems facilitate the search for annotated audiovisual documents from different corpora. One of the main problems is recognizing domain-specific vocabulary such as names, brands, and technical terms with generic language models (LMs), especially in broadcast news.

Our approach consists of two steps to overcome the out-of-vocabulary (OOV) problem and improve spoken document retrieval performance. First, we segment the transcripts produced by a speech recognizer into blocks of a fixed number of utterances. Keywords are extracted from each utterance and used to search web resources in an unsupervised manner in order to obtain adaptation data. These data are then used to adapt a base phonetic dictionary and a base LM within each block.
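To make the first step concrete, here is a minimal sketch of fixed-size utterance blocking and per-utterance keyword extraction. The function names, the tiny stopword list, and the frequency-based keyword heuristic are illustrative assumptions, not the authors' actual implementation:

```python
from collections import Counter
import re

# Tiny illustrative German stopword list; a real system would use a full set.
STOPWORDS = {"der", "die", "das", "und", "in", "ein", "zu", "den", "ist", "im"}

def segment_into_blocks(utterances, block_size):
    """Group recognizer utterances into blocks of a fixed number of utterances."""
    return [utterances[i:i + block_size]
            for i in range(0, len(utterances), block_size)]

def extract_keywords(utterance, top_k=3):
    """Pick the most frequent non-stopword tokens as web-search keywords."""
    tokens = re.findall(r"\w+", utterance.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 3)
    return [word for word, _ in counts.most_common(top_k)]

utterances = [
    "die bundeskanzlerin sprach in berlin über den haushalt",
    "der haushalt wird im bundestag diskutiert",
    "neue technische begriffe erschweren die spracherkennung",
    "die spracherkennung nutzt ein generisches sprachmodell",
]
blocks = segment_into_blocks(utterances, block_size=2)
queries = [extract_keywords(u) for u in blocks[0]]
```

Each keyword list would then serve as a query whose retrieved web text becomes the adaptation data for that block's dictionary and LM.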

In the second step, the adapted dictionary and LM are used in the speech recognizer, improving vocabulary coverage and reducing perplexity for the corresponding block. We evaluate this strategy on a data set of summarized German broadcast news. Our experimental results show improvements of up to 11.7% in mean average precision (MAP) across 18 different topics and 7.5% in word error rate (WER) in comparison to the base LM.
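For readers unfamiliar with the WER metric reported above, it is the word-level edit distance between the reference transcript and the recognizer's hypothesis, divided by the reference length. A minimal sketch using the standard Levenshtein algorithm (not the authors' evaluation code):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference length,
    computed with standard Levenshtein distance over word sequences."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j]: edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)
```

A hypothesis that substitutes one word and drops another from a four-word reference, for example, scores a WER of 0.5; lowering this number is what the adapted dictionary and LM achieve.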

Check out the paper here.