| Deoras, A., Mikolov, T., Church, K.: A Fast Re-scoring Strategy to Capture Long-Distance Dependencies, In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing July 2011 Edinburgh, Scotland, UK, Edinburgh, GB, ACL, 2011, p. 1116-1127, ISBN 978-1-937284-11-4 |
|---|
| Publication language: | English |
|---|
| Original title: | A Fast Re-scoring Strategy to Capture Long-Distance Dependencies |
|---|
| Title (cs): | Strategie pro rychlé reskórování se závislostmi přes dlouhé kontexty |
|---|
| Pages: | 1116-1127 |
|---|
| Proceedings: | Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing July 2011 Edinburgh, Scotland, UK |
|---|
| Conference: | Conference on Empirical Methods in Natural Language Processing |
|---|
| Place: | Edinburgh, GB |
|---|
| Year: | 2011 |
|---|
| ISBN: | 978-1-937284-11-4 |
|---|
| Publisher: | Association for Computational Linguistics |
|---|
| URL: | http://www.fit.vutbr.cz/research/groups/speech/publi/2011/deoras_emnlp2011_D11-1103.pdf [PDF] |
|---|
| URL: | http://www.aclweb.org/anthology-new/D/D11/D11-1103.pdf [PDF] |
|---|
| Keywords |
|---|
| language model, re-scoring strategy, recurrent neural network |
| Annotation |
|---|
| The paper describes a novel approach to lattice rescoring with complex language models that capture long-distance dependencies, such as recurrent neural network language models. |
| Abstract |
|---|
| A re-scoring strategy is proposed that makes it feasible to capture more long-distance dependencies in natural language. Two-pass strategies have become popular in a number of recognition tasks such as ASR (automatic speech recognition), MT (machine translation) and OCR (optical character recognition). The first pass typically applies a weak language model (n-grams) to a lattice, and the second pass applies a stronger language model to N-best lists. The stronger language model is intended to capture more long-distance dependencies. The proposed method uses an RNN-LM (recurrent neural network language model), which is a long-span LM, to rescore word lattices in the second pass. A hill-climbing method (iterative decoding) is proposed to search over islands of confusability in the word lattice. An evaluation based on Broadcast News shows speedups of 20 times over basic N-best re-scoring, and a word error rate reduction of 8% (relative) on a highly competitive setup. |
| BibTeX: |
|---|
@INPROCEEDINGS{
author = {Anoop Deoras and Tomáš Mikolov and Kenneth Church},
title = {A Fast Re-scoring Strategy to Capture Long-Distance
Dependencies},
pages = {1116--1127},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in
Natural Language Processing July 2011 Edinburgh, Scotland,
UK},
year = {2011},
location = {Edinburgh, GB},
publisher = {Association for Computational Linguistics},
ISBN = {978-1-937284-11-4},
language = {english},
url = {http://www.fit.vutbr.cz/research/view_pub.php?id=9687}
} |
|
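
The hill-climbing search mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the lattice has already been reduced to a flat sequence of "islands" (each a short list of competing words), and it uses a hypothetical `toy_score` function as a stand-in for a long-span model such as an RNN-LM. The names `iterative_decode`, `toy_score`, `ISLANDS` and `GOOD_PAIRS` are invented for illustration.

```python
from typing import Callable, List, Sequence


def iterative_decode(
    islands: Sequence[Sequence[str]],
    score: Callable[[List[str]], float],
    max_sweeps: int = 10,
) -> List[str]:
    """Hill climbing: change one island at a time while the others stay fixed."""
    # Assumption: the first alternative in each island is the first-pass choice.
    current = [alts[0] for alts in islands]
    best = score(current)
    for _ in range(max_sweeps):
        improved = False
        for i, alts in enumerate(islands):
            for word in alts:
                if word == current[i]:
                    continue
                candidate = current[:i] + [word] + current[i + 1:]
                s = score(candidate)
                if s > best:  # accept only strict improvements
                    current, best = candidate, s
                    improved = True
        if not improved:  # local optimum reached
            break
    return current


# Hypothetical stand-in for a long-span language model: it rewards word
# pairs at any distance, so the score depends on the whole sentence.
GOOD_PAIRS = {("cat", "sat"), ("sat", "down")}


def toy_score(words: List[str]) -> float:
    return float(
        sum(
            1
            for a in range(len(words))
            for b in range(a + 1, len(words))
            if (words[a], words[b]) in GOOD_PAIRS
        )
    )


ISLANDS = [["the"], ["cap", "cat"], ["sad", "sat"], ["down"]]
result = iterative_decode(ISLANDS, toy_score)
print(" ".join(result))
```

In this toy setting the scorer is called only a handful of times per sweep, rather than once per full hypothesis in a long N-best list. The search can stop at a local optimum, which is the usual trade-off of hill climbing.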