Publication Details

On the use of X-vectors for Robust Speaker Recognition

NOVOTNÝ Ondřej, PLCHOT Oldřich, MATĚJKA Pavel, MOŠNER Ladislav and GLEMBEK Ondřej. On the use of X-vectors for Robust Speaker Recognition. In: Proceedings of Odyssey 2018. Les Sables d'Olonne: International Speech Communication Association, 2018, pp. 168-175. ISSN 2312-2846.

Czech title

K použití x-vektorů pro robustní rozpoznávání mluvčího

Type

conference paper

Language

English

Authors

Novotný Ondřej, Ing., Ph.D. (DCGM FIT BUT)
Plchot Oldřich, Ing., Ph.D. (DCGM FIT BUT)
Matějka Pavel, Ing., Ph.D. (DCGM FIT BUT)
Mošner Ladislav, Ing. (DCGM FIT BUT)
Glembek Ondřej, Ing., Ph.D. (DCGM FIT BUT)

URL

http://www.fit.vutbr.cz/research/groups/speech/publi/2018/novotny_odyssey2018_54.pdf (PDF)

Keywords

Speaker Recognition, Embedding, X-vectors, DNN

Abstract

Text-independent speaker verification (SV) is currently in the process of embracing DNN modeling in every stage of SV system. Slowly, the DNN-based approaches such as end-to-end modelling and systems based on DNN embeddings start to be competitive even in challenging and diverse channel conditions of recent NIST SREs. Domain adaptation and the need for a large amount of training data are still a challenge for current discriminative systems and (unlike with generative models), we see significant gains from data augmentation, simulation and other techniques designed to overcome lack of training data. We present an analysis of a SV system based on DNN embeddings (x-vectors) and focus on robustness across diverse data domains such as standard telephone and microphone conversations, both in clean, noisy and reverberant environments. We also evaluate the system on challenging far-field data created by re-transmitting a subset of NIST SRE 2008 and 2010 microphone interviews. We compare our results with the state-of-the-art i-vector system. In general, we were able to achieve better performance with the DNN-based systems, but most importantly, we have confirmed the robustness of such systems across multiple data domains.

Published

2018

Pages

168-175

Journal

Proceedings of Odyssey: The Speaker and Language Recognition Workshop, vol. 2018, no. 6, ISSN 2312-2846

Proceedings

Proceedings of Odyssey 2018

Conference

Odyssey 2018, Les Sables d'Olonne, France, FR

Publisher

International Speech Communication Association

Place

Les Sables d'Olonne, FR

DOI

10.21437/Odyssey.2018-24

BibTeX

@INPROCEEDINGS{FITPUB11787,
   author = "Ond\v{r}ej Novotn\'{y} and Old\v{r}ich Plchot and Pavel Mat\v{e}jka and Ladislav Mo\v{s}ner and Ond\v{r}ej Glembek",
   title = "On the use of X-vectors for Robust Speaker Recognition",
   pages = "168--175",
   booktitle = "Proceedings of Odyssey 2018",
   journal = "Proceedings of Odyssey: The Speaker and Language Recognition Workshop",
   volume = 2018,
   number = 6,
   year = 2018,
   location = "Les Sables d'Olonne, FR",
   publisher = "International Speech Communication Association",
   ISSN = "2312-2846",
   doi = "10.21437/Odyssey.2018-24",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11787"
}