Conference proceedings paper

PEŠÁN Jan, BURGET Lukáš and ČERNOCKÝ Jan. Sequence Summarizing Neural Networks for Spoken Language Recognition. In: Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016, pp. 3285-3289. ISBN 978-1-5108-3313-5. Available from: https://www.researchgate.net/publication/307889421_Sequence_Summarizing_Neural_Networks_for_Spoken_Language_Recognition
Publication language: English
Original title: Sequence Summarizing Neural Networks for Spoken Language Recognition
Title (cs): Sekvenční sumarizační neuronové sítě pro rozpoznávání mluveného jazyka
Pages: 3285-3289
Proceedings: Proceedings of Interspeech 2016
Conference: Interspeech 2016
Place of publication: San Francisco, US
Year: 2016
URL: https://www.researchgate.net/publication/307889421_Sequence_Summarizing_Neural_Networks_for_Spoken_Language_Recognition
ISBN: 978-1-5108-3313-5
DOI: 10.21437/Interspeech.2016-764
Publisher: International Speech Communication Association
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2016/pesan_interspeech2016_IS160764.pdf [PDF]
Files: pesan_interspeech2016_IS160764.pdf (234 KB, last modified 2016-09-29 18:43:13)
Keywords
Sequence Summarizing Neural Network, DNN, i-vectors
Annotation
This paper deals with sequence summarizing neural networks for spoken language recognition.
Abstract
This paper explores the use of Sequence Summarizing Neural Networks (SSNNs), a variant of deep neural networks (DNNs) for classifying sequences, and applies them to the task of spoken language recognition. Unlike other classification tasks in speech processing, where the DNN needs to produce a per-frame output, language is considered constant during an utterance. We introduce a summarization component into the DNN structure that produces one set of language posteriors per utterance. The DNN is trained with an appropriately modified gradient-descent algorithm. In our initial experiments, the SSNN results are compared to a single state-of-the-art i-vector based baseline system of similar complexity (i.e. no system fusion, etc.). For some conditions, SSNNs are able to provide performance comparable to the baseline system. A relative improvement of up to 30% is obtained with score-level fusion of the baseline and SSNN systems.
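The abstract does not say how the summarization component is built, so the following is only a minimal sketch under stated assumptions: the summarization is taken to be a mean over the per-frame hidden activations, followed by a softmax layer that yields one set of language posteriors per utterance. All names, layer sizes and the random weights (`utterance_posteriors`, `summarize`, `FEAT_DIM`, and so on) are hypothetical and are not taken from the paper.

```python
import math
import random


def affine(weights, bias, vector):
    """Return weights @ vector + bias for a list-of-rows weight matrix."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def relu(vector):
    return [max(0.0, x) for x in vector]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(frames):
    """Assumed summarization: average T frame vectors into one vector."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def utterance_posteriors(frames, w1, b1, w2, b2):
    """Map a variable-length utterance to one posterior vector."""
    hidden = [relu(affine(w1, b1, frame)) for frame in frames]  # per frame
    summary = summarize(hidden)                                 # per utterance
    return softmax(affine(w2, b2, summary))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical toy dimensions and untrained random weights.
rng = random.Random(0)
FEAT_DIM, HIDDEN_DIM, NUM_LANGS = 4, 8, 3
W1, B1 = random_matrix(HIDDEN_DIM, FEAT_DIM, rng), [0.0] * HIDDEN_DIM
W2, B2 = random_matrix(NUM_LANGS, HIDDEN_DIM, rng), [0.0] * NUM_LANGS

# Two utterances of different length yield outputs of the same size.
short_utt = [[rng.gauss(0, 1) for _ in range(FEAT_DIM)] for _ in range(5)]
long_utt = [[rng.gauss(0, 1) for _ in range(FEAT_DIM)] for _ in range(50)]
post_short = utterance_posteriors(short_utt, W1, B1, W2, B2)
post_long = utterance_posteriors(long_utt, W1, B1, W2, B2)
```

With mean pooling, the output size depends only on the number of languages, not on the utterance length. Since the mean is differentiable, an utterance-level error can be propagated back to every frame, which is one plausible reading of the "appropriately modified" gradient descent mentioned in the abstract.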
BibTeX:
@INPROCEEDINGS{
   author = {Jan Pe{\v{s}}{\'{a}}n and Luk{\'{a}}{\v{s}} Burget
	and Jan {\v{C}}ernock{\'{y}}},
   title = {Sequence Summarizing Neural Networks for Spoken
	Language Recognition},
   pages = {3285--3289},
   booktitle = {Proceedings of Interspeech 2016},
   year = {2016},
   location = {San Francisco, US},
   publisher = {International Speech Communication Association},
   ISBN = {978-1-5108-3313-5},
   doi = {10.21437/Interspeech.2016-764},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php.cs?id=11273}
}
