Conference paper

ZEINALI Hossein, BURGET Lukáš, SAMETI Hossein, GLEMBEK Ondřej and PLCHOT Oldřich. Deep Neural Networks and Hidden Markov Models in i-vector-based Text-Dependent Speaker Verification. In: Proceedings of Odyssey 2016, The Speaker and Language Recognition Workshop. Bilbao: International Speech Communication Association, 2016, pp. 24-30. ISSN 2312-2846. Available from: http://www.odyssey2016.org/papers/pdfs_stamped/63.pdf
Publication language: English
Original title: Deep Neural Networks and Hidden Markov Models in i-vector-based Text-Dependent Speaker Verification
Title (cs): Hluboké neuronové sítě a skryté Markovovy modely v i-vektorovém systému pro ověřování mluvčího závislém na textu
Pages: 24-30
Proceedings: Proceedings of Odyssey 2016, The Speaker and Language Recognition Workshop
Conference: Odyssey 2016
Place: Bilbao, ES
Year: 2016
URL: http://www.odyssey2016.org/papers/pdfs_stamped/63.pdf
Journal: Proceedings of Odyssey: The Speaker and Language Recognition Workshop, Vol. 2016, No. 06, 4 Rue des Fauvettes - Lous Tourils, F-66390 BAIXAS, FR
ISSN: 2312-2846
Publisher: International Speech Communication Association
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2016/zeinali_odyssey2016_stamped_63.pdf [PDF]
Files: zeinali_odyssey2016_stamped_63.pdf (447 KB, last modified 2017-03-06 11:00:35)
Keywords
deep neural networks, hidden Markov models, i-vector-based, text-dependent, speaker verification
Annotation
This article is about deep neural networks and hidden Markov models in i-vector-based text-dependent speaker verification.
Abstract
Techniques making use of Deep Neural Networks (DNN) have recently been seen to bring large improvements in text-independent speaker recognition. In this paper, we verify that the DNN-based methods result in excellent performance in the context of text-dependent speaker verification as well. We build our system on the previously introduced HMM-based i-vector approach, where phone models are used to obtain frame-level alignment in order to collect sufficient statistics for i-vector extraction. For comparison, we experiment with an alternative alignment obtained directly from the output of a DNN trained for phone classification. We also experiment with DNN-based bottleneck features and their combinations with standard cepstral features. Although the i-vector approach is generally considered not suitable for text-dependent speaker verification, we show that our HMM-based approach combined with bottleneck features provides truly state-of-the-art performance on RSR2015 data.
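As an illustration of the statistics collection mentioned in the abstract, the following is a minimal sketch of accumulating zero- and first-order sufficient statistics from a frame-level alignment. It is not the authors' implementation: the function name `collect_stats`, the input layout, and the toy data are assumptions made for this example, and the per-frame posteriors stand in for whatever alignment (HMM states or DNN outputs) a real system would supply.

```python
def collect_stats(features, posteriors):
    """Accumulate zero- and first-order statistics per alignment class.

    features:   list of T frames, each a list of D floats
    posteriors: list of T frames, each a list of C class posteriors
                (hypothetical input; could come from an HMM or a DNN)
    """
    num_classes = len(posteriors[0])
    dim = len(features[0])
    zero = [0.0] * num_classes                         # soft frame counts per class
    first = [[0.0] * dim for _ in range(num_classes)]  # posterior-weighted feature sums
    for x, gamma in zip(features, posteriors):
        for c, g in enumerate(gamma):
            zero[c] += g
            for d in range(dim):
                first[c][d] += g * x[d]
    return zero, first


# Toy data (assumed): 3 frames, 2-dimensional features, 2 alignment classes.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
posteriors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
zero, first = collect_stats(features, posteriors)
```

In this sketch, switching between the two alignment sources described in the abstract would change only where `posteriors` comes from; the accumulation itself stays the same.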
BibTeX:
@INPROCEEDINGS{Zeinali2016Odyssey,
   author = {Hossein Zeinali and Luk{\'{a}}{\v{s}} Burget and Hossein
	Sameti and Ond{\v{r}}ej Glembek and Old{\v{r}}ich Plchot},
   title = {Deep Neural Networks and Hidden Markov Models in
	i-vector-based Text-Dependent Speaker Verification},
   pages = {24--30},
   booktitle = {Proceedings of Odyssey 2016, The Speaker and Language
	Recognition Workshop},
   journal = {Proceedings of Odyssey: The Speaker and Language Recognition
	Workshop},
   volume = {2016},
   number = {06},
   year = {2016},
   location = {Bilbao, ES},
   publisher = {International Speech Communication Association},
   ISSN = {2312-2846},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=11220}
}
