Publication Details

Phonotactic Language Recognition using i-vectors and Phoneme Posteriogram Counts

D'HARO Luis Fernando, GLEMBEK Ondřej, PLCHOT Oldřich, MATĚJKA Pavel, SOUFIFAR Mehdi Mohammad, CORDOBA Ricardo and ČERNOCKÝ Jan. Phonotactic Language Recognition using i-vectors and Phoneme Posteriogram Counts. In: Proceedings of Interspeech 2012. Portland, Oregon: International Speech Communication Association, 2012, pp. 1-4. ISBN 978-1-62276-759-5. ISSN 1990-9772. Available from: http://www.isca-speech.org/archive/interspeech_2012/i12_0042.html
Czech title
Fonotaktické rozpoznávání jazyka využívající i-vektory a počty z fonémových posteriogramů
Type
conference paper
Language
English
Authors
D'Haro Luis Fernando (UPN)
Glembek Ondřej, Ing., Ph.D. (DCGM FIT BUT)
Plchot Oldřich, Ing., Ph.D. (DCGM FIT BUT)
Matějka Pavel, Ing., Ph.D. (DCGM FIT BUT)
Soufifar Mehdi Mohammad, Ing. (FIT BUT)
Cordoba Ricardo (UPN)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
Keywords

subspace modeling, multinomial distributions, LID

Abstract

The article describes phonotactic language recognition using i-vectors and phoneme posteriogram counts.

Annotation

This paper describes a novel approach to phonotactic LID in which n-gram counts are obtained from posteriograms instead of from soft-counts based on phoneme lattices. The high-dimensional vectors of counts are reduced to low-dimensional representations, for which we adopt the commonly used term i-vectors. The reduction is based on multinomial subspace modeling and is designed to work in the total-variability space. The proposed technique was tested on the NIST 2009 LRE set, where it gave better results than a system based on soft-counts (Cavg on 30s: 3.15% vs 3.43%) and very good results when fused with an acoustic i-vector LID system (Cavg on 30s: 2.4% for the acoustic system alone vs 1.25% for the fusion). The proposed technique is also compared with another low-dimensional projection system based on PCA. In comparison with the original soft-counts, the proposed technique provides better results, reduces the problems caused by sparse counts, and avoids the need for pruning when creating the lattices.
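
As a rough illustration of the idea of taking n-gram counts from a posteriogram rather than from a lattice, the following minimal sketch accumulates expected bigram counts from a toy posteriogram. It is not the paper's implementation: the function name `expected_bigram_counts`, the toy data, and the assumption that consecutive steps are independent (so the expected count of a bigram is the product of the two posteriors) are all illustrative choices made here.

```python
from itertools import product


def expected_bigram_counts(posteriogram):
    """Accumulate expected (soft) bigram counts from a posteriogram.

    posteriogram: list of posterior vectors, one per step, each summing to 1.
    Returns a dict mapping (i, j) phoneme-index pairs to soft counts.

    Assumption (illustrative only): consecutive steps are treated as
    independent, so the expected count of bigram (i, j) at a transition
    is p_prev[i] * p_curr[j].
    """
    n_phonemes = len(posteriogram[0])
    pairs = list(product(range(n_phonemes), repeat=2))
    counts = {pair: 0.0 for pair in pairs}
    # Walk over every pair of adjacent steps and add the outer product.
    for prev, curr in zip(posteriogram, posteriogram[1:]):
        for i, j in pairs:
            counts[(i, j)] += prev[i] * curr[j]
    return counts


# Toy posteriogram: 3 steps over 2 hypothetical phoneme classes.
toy = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
counts = expected_bigram_counts(toy)
total = sum(counts.values())
```

Because each posterior vector sums to one, the soft counts of every transition also sum to one, so `total` equals the number of transitions. Every bigram receives a non-zero count, which is one way to see why counts of this kind are less sparse than counts taken from pruned lattices.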

Published
2012
Pages
1-4
Journal
Proceedings of Interspeech - on-line, vol. 2012, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2012
Conference
Interspeech Conference, Portland, US
ISBN
978-1-62276-759-5
Publisher
International Speech Communication Association
Place
Portland, Oregon, US
BibTeX
@INPROCEEDINGS{FITPUB10093,
   author = "Luis Fernando D'Haro and Ond\v{r}ej Glembek and Old\v{r}ich Plchot and Pavel Mat\v{e}jka and Mohammad Mehdi Soufifar and Ricardo Cordoba and Jan \v{C}ernock\'{y}",
   title = "Phonotactic Language Recognition using i-vectors and Phoneme Posteriogram Counts",
   pages = "1--4",
   booktitle = "Proceedings of Interspeech 2012",
   journal = "Proceedings of Interspeech - on-line",
   volume = 2012,
   number = 9,
   year = 2012,
   location = "Portland, Oregon, US",
   publisher = "International Speech Communication Association",
   ISBN = "978-1-62276-759-5",
   ISSN = "1990-9772",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/10093"
}