Conference paper

NOVOTNÝ Ondřej, PLCHOT Oldřich, MATĚJKA Pavel, MOŠNER Ladislav and GLEMBEK Ondřej. On the use of X-vectors for Robust Speaker Recognition. In: Proceedings of Odyssey 2018. Les Sables d'Olonne: International Speech Communication Association, 2018, pp. 168-175. ISSN 2312-2846.
Publication language: English
Original title: On the use of X-vectors for Robust Speaker Recognition
Title (cs): K použití x-vektorů pro robustní rozpoznávání mluvčího
Pages: 168-175
Proceedings: Proceedings of Odyssey 2018
Conference: Odyssey 2018
Place: Les Sables d'Olonne, FR
Year: 2018
Journal: Proceedings of Odyssey: The Speaker and Language Recognition Workshop, Vol. 2018, No. 6, 4 Rue des Fauvettes - Lous Tourils, F-66390 BAIXAS, FR
ISSN: 2312-2846
DOI: 10.21437/Odyssey.2018-24
Publisher: International Speech Communication Association
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2018/novotny_odyssey2018_54.pdf [PDF]
Keywords
Speaker Recognition, Embedding, X-vectors, DNN
Annotation
Text-independent speaker verification (SV) is currently in the process of embracing DNN modeling in every stage of the SV system. Slowly, DNN-based approaches such as end-to-end modeling and systems based on DNN embeddings are becoming competitive even in the challenging and diverse channel conditions of recent NIST SREs. Domain adaptation and the need for a large amount of training data are still a challenge for current discriminative systems, and (unlike with generative models) we see significant gains from data augmentation, simulation, and other techniques designed to overcome the lack of training data. We present an analysis of an SV system based on DNN embeddings (x-vectors) and focus on robustness across diverse data domains such as standard telephone and microphone conversations, in clean, noisy, and reverberant environments. We also evaluate the system on challenging far-field data created by re-transmitting a subset of NIST SRE 2008 and 2010 microphone interviews. We compare our results with the state-of-the-art i-vector system. In general, we were able to achieve better performance with the DNN-based systems, but most importantly, we have confirmed the robustness of such systems across multiple data domains.
BibTeX:
@INPROCEEDINGS{
   author = {Ond{\v{r}}ej Novotn{\'{y}} and Old{\v{r}}ich Plchot and
	Pavel Mat{\v{e}}jka and Ladislav Mo{\v{s}}ner and
	Ond{\v{r}}ej Glembek},
   title = {On the use of X-vectors for Robust Speaker Recognition},
   pages = {168--175},
   booktitle = {Proceedings of Odyssey 2018},
   journal = {Proceedings of Odyssey: The Speaker and Language Recognition
	Workshop},
   volume = {2018},
   number = {6},
   year = {2018},
   location = {Les Sables d'Olonne, FR},
   publisher = {International Speech Communication Association},
   ISSN = {2312-2846},
   doi = {10.21437/Odyssey.2018-24},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=11787}
}
