Conference paper

SOUFIFAR Mehdi Mohammad, BURGET Lukáš, PLCHOT Oldřich, CUMANI Sandro and ČERNOCKÝ Jan. Regularized Subspace n-Gram Model for Phonotactic iVector Extraction. In: Proceedings of Interspeech 2013. Lyon: International Speech Communication Association, 2013, pp. 74-78. ISBN 978-1-62993-443-3. ISSN 2308-457X.
Publication language:English
Original title:Regularized Subspace n-Gram Model for Phonotactic iVector Extraction
Title (cs):Regularizovaný podprostorový n-gramový model pro extrakci fonotaktických iVektorů
Pages:74-78
Proceedings:Proceedings of Interspeech 2013
Conference:Interspeech 2013
Place:Lyon, FR
Year:2013
ISBN:978-1-62993-443-3
Journal:Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), No. 8, Lyon, FR
ISSN:2308-457X
Publisher:International Speech Communication Association
URL:http://www.fit.vutbr.cz/research/groups/speech/publi/2013/soufifar_interspeech2013_IS131171.pdf [PDF]
Keywords
Language identification, Subspace modeling, Subspace multinomial model
Annotation
This article describes an enhanced phonotactic iVector extraction model over the n-gram counts. In the first step, a subspace n-gram model is proposed to model conditional n-gram probabilities. Modeling different 3-gram histories with separate multinomial distributions shows promising results for the long-duration condition; however, we observed model over-fitting for the short-duration conditions.
Abstract
Phonotactic language identification (LID) by means of n-gram statistics and discriminative classifiers is a popular approach to the LID problem. A low-dimensional representation of the n-gram statistics allows more diverse and efficient machine learning techniques to be used in LID. Recently, we proposed the phonotactic iVector as a low-dimensional representation of the n-gram statistics. In this work, an enhanced modeling of the n-gram probabilities along with regularized parameter estimation is proposed. The proposed model consistently improves LID system performance over all conditions, by up to 15% relative to the previous state-of-the-art system. The new model also reduces the memory requirements of iVector extraction and helps to speed up subspace training. Results are presented in terms of Cavg on the NIST LRE2009 evaluation set.
BibTeX:
@INPROCEEDINGS{
   author = {Mohammad Mehdi Soufifar and Luk{\'{a}}{\v{s}} Burget and
	Old{\v{r}}ich Plchot and Sandro Cumani and Jan
	{\v{C}}ernock{\'{y}}},
   title = {Regularized Subspace n-Gram Model for Phonotactic iVector
	Extraction},
   pages = {74--78},
   booktitle = {Proceedings of Interspeech 2013},
   journal = {Proceedings of the 14th Annual Conference of the
	International Speech Communication Association (Interspeech
	2013).},
   number = {8},
   year = {2013},
   location = {Lyon, FR},
   publisher = {International Speech Communication Association},
   ISBN = {978-1-62993-443-3},
   ISSN = {2308-457X},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php.en.iso-8859-2?id=10449}
}
