Conference paper

DEORAS, A., MIKOLOV, T., KOMBRINK, S., KARAFIÁT, M. and KHUDANPUR, S. Variational Approximation of Long-span Language Models for LVCSR. In: Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011, pp. 5532-5535. ISBN 978-1-4577-0537-3.
Publication language:English
Original title:Variational Approximation of Long-span Language Models for LVCSR
Title (cs):Variační aproximace jazykových modelů s dlouhým kontextem pro LVCSR
Pages:5532-5535
Proceedings:Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011
Conference:International Conference on Acoustics, Speech and Signal Processing 2011
Place:Praha, CZ
Year:2011
ISBN:978-1-4577-0537-3
Publisher:IEEE Signal Processing Society
URL:http://www.fit.vutbr.cz/research/groups/speech/publi/2011/deoras_icassp2011_5532.pdf [PDF]
Keywords
Recurrent Neural Network, Language Model, Variational Inference
Annotation
The paper presents experimental evidence that n-gram variational approximations of long-span LMs yield greater accuracy in LVCSR than standard n-gram models estimated from the same training text.
Abstract
Long-span language models that capture syntax and semantics are seldom used in the first pass of large vocabulary continuous speech recognition systems due to the prohibitive search space of sentence hypotheses. Instead, an N-best list of hypotheses is created using tractable n-gram models, and rescored using the long-span models. It is shown in this paper that computationally tractable variational approximations of the long-span models are a better choice than standard n-gram models for first-pass decoding. They not only result in a better first-pass output, but also produce a lattice with a lower oracle word error rate, and rescoring the N-best list from such lattices with the long-span models requires a smaller N to attain the same accuracy. Empirical results on the WSJ, MIT Lectures, NIST 2007 Meeting Recognition and NIST 2001 Conversational Telephone Recognition data sets are presented to support these claims.
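The two-pass scheme mentioned in the abstract (build an N-best list with a tractable model, then rescore it with the long-span model) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Hypothesis`, `rescore_nbest`, `toy_long_span_logprob` and the `lm_weight` parameter are hypothetical, and the toy scoring function merely stands in for a real long-span language model.

```python
from typing import Callable, List, NamedTuple


class Hypothesis(NamedTuple):
    """One entry of an N-best list (hypothetical structure)."""
    words: List[str]
    acoustic_logprob: float  # log-likelihood assigned by the acoustic model


def rescore_nbest(
    nbest: List[Hypothesis],
    long_span_logprob: Callable[[List[str]], float],
    lm_weight: float = 1.0,
) -> Hypothesis:
    """Return the hypothesis with the highest combined log-score.

    The combined score is the acoustic log-probability plus the
    weighted log-probability from the long-span language model.
    """
    if not nbest:
        raise ValueError("N-best list is empty")
    return max(
        nbest,
        key=lambda h: h.acoustic_logprob + lm_weight * long_span_logprob(h.words),
    )


def toy_long_span_logprob(words: List[str]) -> float:
    # Assumption: a stand-in for a real long-span LM; it simply
    # charges a fixed cost of -1.0 per word.
    return -1.0 * len(words)


# A two-entry N-best list, as a first pass might produce it.
nbest = [
    Hypothesis(["recognize", "speech"], -10.0),
    Hypothesis(["wreck", "a", "nice", "beach"], -9.0),
]

best = rescore_nbest(nbest, toy_long_span_logprob)
print(" ".join(best.words))
```

In this toy setting the acoustic score alone prefers the second hypothesis, while the combined score prefers the first. The abstract's claim concerns how large N must be for such rescoring to reach a given accuracy, which depends on the quality of the first-pass model that produced the list.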
BibTeX:
@INPROCEEDINGS{
   author = {Anoop Deoras and Tomáš Mikolov and Stefan Kombrink and
	Martin Karafiát and Sanjeev Khudanpur},
   title = {Variational Approximation of Long-span Language Models for
	LVCSR},
   pages = {5532--5535},
   booktitle = {Proceedings of the 2011 IEEE International Conference on
	Acoustics, Speech, and Signal Processing, ICASSP 2011},
   year = {2011},
   location = {Praha, CZ},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-4577-0537-3},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php.en?id=9659}
}
