Ing. Lukáš Burget, Ph.D.

Kombrink, S., Mikolov, T., Karafiát, M., Burget, L.: Improving Language Models for ASR Using Translated In-domain Data, In: Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, JP, IEEESP, 2012, pp. 4405-4408, ISBN 978-1-4673-0044-5
Publication language: English
Publication title: Improving Language Models for ASR Using Translated In-domain Data
Title (cs): Vylepšení jazykových modelů pro rozpoznávání řeči pomocí přeložených dat z cílové oblasti
Pages: 4405-4408
Proceedings: Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing
Conference: The 37th International Conference on Acoustics, Speech, and Signal Processing
Place of publication: Kyoto, JP
Year: 2012
ISBN: 978-1-4673-0044-5
Publisher: IEEE Signal Processing Society
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2012/kombrink_icassp2012_0004405.pdf [PDF]
Keywords
Low Resource ASR, Language Modeling, Machine Translation
Annotation
This paper deals with improving language models for speech recognition using translated in-domain data.
Abstract
Acquisition of in-domain training data to build speech recognition systems for under-resourced languages can be a costly, time-consuming and tedious process. In this work, we propose the use of machine translation to translate English transcripts of telephone speech into Czech in order to improve a Czech conversational telephone speech (CTS) recognition system. The translated transcripts are used as additional language model training data in a scenario where the baseline language model is trained on off- and close-domain data only. We report perplexities, out-of-vocabulary (OOV) rates and word error rates, and examine different data sets and translators for their suitability for the described task.
BibTeX:
@INPROCEEDINGS{
   author = {Stefan Kombrink and Tomáš Mikolov and Martin Karafiát and
	Lukáš Burget},
   title = {Improving Language Models for ASR Using Translated In-domain
	Data},
   pages = {4405--4408},
   booktitle = {Proceedings of 2012 IEEE International Conference on
	Acoustics, Speech and Signal Processing},
   year = {2012},
   location = {Kyoto, JP},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-4673-0044-5},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=9927}
}
