Conference paper

KARAFIÁT Martin, JANDA Miloš, ČERNOCKÝ Jan and BURGET Lukáš. Region Dependent Linear Transforms in Multilingual Speech Recognition. In: Proc. International Conference on Acoustics, Speech, and Signal Processing 2012. Kyoto: IEEE Signal Processing Society, 2012, pp. 4885-4888. ISBN 978-1-4673-0044-5.
Publication language:English
Original title:Region Dependent Linear Transforms in Multilingual Speech Recognition
Title (cs):Lineární transformace závislé na regionech v multilingválním rozpoznávání řeči
Pages:4885-4888
Proceedings:Proc. International Conference on Acoustics, Speech, and Signal Processing 2012
Conference:The 37th International Conference on Acoustics, Speech, and Signal Processing
Place:Kyoto, JP
Year:2012
ISBN:978-1-4673-0044-5
Publisher:IEEE Signal Processing Society
URL:http://www.fit.vutbr.cz/research/groups/speech/publi/2012/karafiat_icassp2012_0004885.pdf [PDF]
Keywords
HLDA, Region Dependent Transforms, Minimum Phone Error, fMPE, multilingual speech recognition
Annotation
In today's speech recognition systems, linear or nonlinear transformations are usually applied to post-process the speech features that form the input to HMM-based acoustic models. In this work, we experiment with three popular transforms: HLDA, MPE-HLDA and Region Dependent Linear Transforms (RDLT). These are trained jointly with the acoustic model to extract as much discriminative information as possible from the raw features and to represent it in a form suitable for the subsequent GMM-HMM based acoustic model. We focus on multilingual environments, where limited resources are available for training recognizers for many languages. Using data from the GlobalPhone database, we show that, under such restrictive conditions, the feature transformations can be advantageously shared across languages and robustly trained using data from several languages.
BibTeX:
@INPROCEEDINGS{
   author = {Martin Karafi{\'{a}}t and Milo{\v{s}} Janda and Jan
	{\v{C}}ernock{\'{y}} and Luk{\'{a}}{\v{s}} Burget},
   title = {Region Dependent Linear Transforms in Multilingual Speech
	Recognition},
   pages = {4885--4888},
   booktitle = {Proc. International Conference on Acoustics, Speech, and
	Signal Processing 2012},
   year = {2012},
   location = {Kyoto, JP},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-4673-0044-5},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=9935}
}
