Conference paper

GRÉZL František and KARAFIÁT Martin. Boosting Performance on Low-resource Languages by Standard Corpora: AN ANALYSIS. In: Proceeding of SLT 2016. San Diego: IEEE Signal Processing Society, 2016, pp. 629-636. ISBN 978-1-5090-4903-5.
Publication language:English
Original title:Boosting Performance on Low-resource Languages by Standard Corpora: AN ANALYSIS
Title (cs):Zlepšení úspěšnosti na jazycích s omezenými zdroji pomocí standardních řečových databází: analýza
Pages:629-636
Proceedings:Proceeding of SLT 2016
Conference:2016 IEEE Workshop on Spoken Language Technology
Place:San Diego, US
Year:2016
ISBN:978-1-5090-4903-5
Publisher:IEEE Signal Processing Society
URL:http://www.fit.vutbr.cz/research/groups/speech/publi/2016/grezl_slt2016_0000629.pdf [PDF]
Keywords
DNN topology, Stacked Bottle-neck, feature extraction, multilingual training, system porting, low resource
Annotation
In this paper, we evaluate multilingual techniques in a single source-language scenario. Since it is hard to obtain coherent multilingual corpora usable for multilingual training, using a single, well-resourced language instead is quite attractive.
Abstract
In this paper, we analyze the feasibility of using a single well-resourced language, English, as the source language for multilingual techniques in the context of a Stacked Bottle-Neck tandem system. The effect of the amount of data and the number of tied states in the source language on the performance of the ported system is evaluated together with different porting strategies. Generally, increasing both the amount of data and the level of detail is beneficial, with the greater effect observed for increasing the number of tied states. The modified neural network structure, previously shown to be useful for multilingual porting, was also evaluated with its specific porting procedure. Using the original NN structure in combination with the modified adapt-adapt porting strategy was found to be best. It achieves a relative improvement of 3.5-8.8% on a variety of target languages. These results are comparable to those obtained with multilingual NNs pretrained on 7 languages.
BibTeX:
@INPROCEEDINGS{
   author = {Franti{\v{s}}ek Gr{\'{e}}zl and Martin Karafi{\'{a}}t},
   title = {Boosting Performance on Low-resource Languages by Standard Corpora: AN ANALYSIS},
   pages = {629--636},
   booktitle = {Proceeding of SLT 2016},
   year = {2016},
   location = {San Diego, US},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-5090-4903-5},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=11311}
}
