Conference paper

KARAFIÁT Martin, BURGET Lukáš, GRÉZL František, VESELÝ Karel and ČERNOCKÝ Jan. Multilingual Region-Dependent Transforms. In: Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016. Shanghai: IEEE Signal Processing Society, 2016, pp. 5430-5434. ISBN 978-1-4799-9988-0.
Publication language: English
Original title: Multilingual Region-Dependent Transforms
Title (cs): Multilingvální transformace závislé na regionech
Pages: 5430-5434
Proceedings: Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016
Conference: 41st IEEE International Conference on Acoustics, Speech and Signal Processing
Place: Shanghai, CN
Year: 2016
ISBN: 978-1-4799-9988-0
Publisher: IEEE Signal Processing Society
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2016/karafiat_icassp2016_0005430.pdf [PDF]
Files: karafiat_icassp2016_0005430.pdf, 136 KB, last modified 2017-03-01 18:23:18
Keywords
Automatic speech recognition, Region-Dependent Transforms, Multilingual speech recognition, Feedforward neural networks
Annotation
This paper presents further steps in the development of a feature extraction scheme that is easily transferable to a new language with severely limited training data.
Abstract
In recent years, trained feature extraction (FE) schemes based on neural networks have replaced or complemented traditional approaches in top-performing systems. This paper deals with FE in multilingual scenarios where the target language has only a small amount of transcribed data. Continuing our previous work on multilingual training of Stacked Bottle-Neck Neural Network FE schemes, we concentrate on improving the discriminatively trained Region-Dependent Transforms (RDT). We show that multilingual training of RDT can be implemented by merging statistics from several languages. In our case, we used up to 11 source languages to build an FE scheme that generalizes well to a new language. This allows us to build a strong bootstrapping model for the final ASR system. The results are produced on IARPA Babel data.
BibTeX:
@INPROCEEDINGS{
   author = {Martin Karafi{\'{a}}t and Luk{\'{a}}{\v{s}} Burget and
	Franti{\v{s}}ek Gr{\'{e}}zl and Karel Vesel{\'{y}} and Jan
	{\v{C}}ernock{\'{y}}},
   title = {Multilingual Region-Dependent Transforms},
   pages = {5430--5434},
   booktitle = {Proceedings of the 41st IEEE International Conference on
	Acoustics, Speech and Signal Processing (ICASSP 2016), 2016},
   year = {2016},
   location = {Shanghai, CN},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-4799-9988-0},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=11146}
}
