Publication Details

Confidence estimation, OOV detection and language ID using phone-to-word transduction and phone-level alignments

WHITE Christopher, ZWEIG Geoffrey, BURGET Lukáš, SCHWARZ Petr and HEŘMANSKÝ Hynek. Confidence estimation, OOV detection and language ID using phone-to-word transduction and phone-level alignments. In: Proc. 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing. Las Vegas: IEEE Signal Processing Society, 2008, p. 4. ISBN 1-4244-1484-9.
Czech title
Odhad spolehlivosti, detekce OOV a identifikace jazyka pomocí transducerů převádějících fonémy na slova a fonetických zarovnání
Type
conference paper
Language
English
Authors
White Christopher (JHU)
Zweig Geoffrey (MSR)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Schwarz Petr, Ing., Ph.D. (DCGM FIT BUT)
Heřmanský Hynek, prof. Ing., Dr.Eng. (DCGM FIT BUT)
URL
Keywords

speech recognition

Abstract

The paper addresses confidence estimation, OOV detection, and language identification using phone-to-word transduction and phone-level alignments.

Annotation

Automatic Speech Recognition (ASR) systems continue to make errors during search when handling various phenomena, including noise, pronunciation variation, and out-of-vocabulary (OOV) words. Predicting the probability that a word is incorrect can prevent the error from propagating and perhaps allow the system to recover. This paper addresses the problem of detecting errors and OOVs for read Wall Street Journal speech when the word error rate (WER) is very low. It augments a traditional confidence estimate with two novel methods: phone-level comparison using Multi-String Alignment (MSA) and word-level comparison using phone-to-word transduction. We show that features from phone and word string comparisons can be added to a standard maximum entropy framework, thereby substantially improving performance in detecting both errors and OOVs. Additionally, we show an extension to detecting English and accented English for the Language Identification (LID) task.
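
The idea of a phone-level comparison feature can be illustrated with a minimal sketch. It assumes two phone strings are available for the same speech segment: one derived from the word recognizer's output and one from an unconstrained phone recognizer. The sketch uses plain pairwise Levenshtein distance as a stand-in and is not the Multi-String Alignment procedure of the paper; the function names `phone_edit_distance` and `phone_mismatch_feature` and the example phone strings are hypothetical.

```python
def phone_edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences (lists of strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phone_mismatch_feature(word_phones, free_phones):
    """Normalized disagreement in [0, 1] between the two phone strings.

    Assumption: a larger value suggests the word hypothesis is less
    consistent with the acoustics, so it may be an error or an OOV region.
    """
    longest = max(len(word_phones), len(free_phones))
    if longest == 0:
        return 0.0
    return phone_edit_distance(word_phones, free_phones) / longest


if __name__ == "__main__":
    # Hypothetical phone strings for one hypothesized word.
    from_word_decoder = ["s", "p", "iy", "ch"]
    from_phone_decoder = ["s", "p", "iy"]
    print(phone_mismatch_feature(from_word_decoder, from_phone_decoder))
```

A value like this could then serve as one input among others to a classifier such as a maximum entropy model, alongside a conventional confidence score; how the paper actually derives and combines its features is not specified in this record.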

Published
2008
Pages
4
Proceedings
Proc. 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing
Conference
33rd International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, US
ISBN
1-4244-1484-9
Publisher
IEEE Signal Processing Society
Place
Las Vegas, US
BibTeX
@INPROCEEDINGS{FITPUB8722,
   author = "Christopher White and Geoffrey Zweig and Luk\'{a}\v{s} Burget and Petr Schwarz and Hynek He\v{r}mansk\'{y}",
   title = "Confidence estimation, OOV detection and language ID using phone-to-word transduction and phone-level alignments",
   pages = 4,
   booktitle = "Proc. 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing",
   year = 2008,
   location = "Las Vegas, US",
   publisher = "IEEE Signal Processing Society",
   ISBN = "1-4244-1484-9",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/8722"
}