Conference paper

LOZANO-DIEZ Alicia, PLCHOT Oldřich, MATĚJKA Pavel and GONZALEZ-RODRIGUEZ Joaquin. DNN Based Embeddings for Language Recognition. In: Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018, pp. 5184-5188. ISBN 978-1-5386-4658-8.
Publication language:English
Original title:DNN Based Embeddings for Language Recognition
Title (cs):DNN Embeddings pro rozpoznávání jazyka
Pages:5184-5188
Proceedings:Proceedings of ICASSP 2018
Conference:2018 IEEE International Conference on Acoustics, Speech and Signal Processing
Place:Calgary, CA
Year:2018
ISBN:978-1-5386-4658-8
Publisher:IEEE Signal Processing Society
URL:http://www.fit.vutbr.cz/research/groups/speech/publi/2018/lozano_icassp2018_0005184.pdf [PDF]
Keywords
Embeddings, language recognition, LID, DNN
Annotation
In this work, we present a language identification (LID) system based on embeddings. In our case, an embedding is a fixed-length vector (similar to an i-vector) that represents the whole utterance but, unlike an i-vector, is designed to contain mostly information relevant to the target task (LID). To obtain these embeddings, we train a deep neural network (DNN) with a sequence summarization layer to classify languages. In particular, we train a DNN based on bidirectional long short-term memory (BLSTM) recurrent neural network (RNN) layers, whose frame-by-frame outputs are summarized into mean and standard deviation statistics. After this pooling layer, we add two fully connected layers whose outputs serve as the embeddings. Finally, we add a softmax output layer and train the whole network with a multi-class cross-entropy objective to discriminate between languages. We report results on NIST LRE 2015 and compare the performance of embeddings and the corresponding i-vectors, both modeled by a Gaussian Linear Classifier (GLC). Using only embeddings resulted in performance comparable to i-vectors, and by performing score-level fusion we achieved a 7.3% relative improvement over the baseline.
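The sequence summarization step described in the annotation can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `stats_pooling`, the use of plain Python lists in place of BLSTM outputs, and the choice of the population standard deviation (dividing by the number of frames) are all assumptions made for illustration. The sketch only shows how frame-by-frame vectors of any length collapse into one fixed-length vector of means followed by standard deviations.

```python
import math


def stats_pooling(frames):
    """Summarize a variable-length list of frame vectors into one
    fixed-length vector: per-dimension means followed by per-dimension
    standard deviations (hypothetical helper, for illustration only)."""
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    dim = len(frames[0])
    # Per-dimension mean over all frames.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Population standard deviation (divide by n); this normalization
    # is an assumption, the annotation does not specify it.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return means + stds


# Two toy "utterances" of different lengths (2 and 4 frames, 2 dims each).
short_utt = [[1.0, 2.0], [3.0, 6.0]]
long_utt = [[1.0, 2.0], [3.0, 6.0], [1.0, 2.0], [3.0, 6.0]]

pooled_short = stats_pooling(short_utt)
pooled_long = stats_pooling(long_utt)
```

Both toy utterances yield a vector of length 4 (twice the frame dimension) regardless of how many frames they contain, which is the property that lets the fully connected layers after the pooling layer operate on utterances of arbitrary duration.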
BibTeX:
@INPROCEEDINGS{
   author = {Alicia Lozano-Diez and Old{\v{r}}ich Plchot and Pavel
	Mat{\v{e}}jka and Joaquin Gonzalez-Rodriguez},
   title = {DNN Based Embeddings for Language Recognition},
   pages = {5184--5188},
   booktitle = {Proceedings of ICASSP 2018},
   year = {2018},
   location = {Calgary, CA},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-5386-4658-8},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=11723}
}
