Conference proceedings paper

VESELÝ Karel, WATANABE Shinji, ŽMOLÍKOVÁ Kateřina, KARAFIÁT Martin, BURGET Lukáš and ČERNOCKÝ Jan. Sequence Summarizing Neural Network for Speaker Adaptation. In: Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016. Shanghai: IEEE Signal Processing Society, 2016, pp. 5315-5319. ISBN 978-1-4799-9988-0.
Publication language: English
Original title: Sequence Summarizing Neural Network for Speaker Adaptation
Title (cs): Neuronové sítě shrnující sekvence pro adaptaci na mluvčího
Pages: 5315-5319
Proceedings: Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016
Conference: 41st IEEE International Conference on Acoustics, Speech and Signal Processing
Place of publication: Shanghai, CN
Year: 2016
ISBN: 978-1-4799-9988-0
DOI: 10.1109/ICASSP.2016.7472692
Publisher: IEEE Signal Processing Society
URL:http://www.fit.vutbr.cz/research/groups/speech/publi/2016/vesely_icassp2016_0005315.pdf [PDF]
Files: vesely_icassp2016_0005315.pdf (178 KB, last modified 2016-09-29 18:16:17)
Keywords
DNN, adaptation, i-vector, sequence summary, SSNN
Annotation
The paper deals with neural networks that summarize sequences for speaker adaptation. It proposes an alternative method for obtaining DNN adaptation vectors similar to i-vectors.
Abstract
In this paper, we propose a DNN adaptation technique in which the i-vector extractor is replaced by a Sequence Summarizing Neural Network (SSNN). Like an i-vector extractor, the SSNN produces a "summary vector" representing an acoustic summary of an utterance. This vector is then appended to the input of the main network, and both networks are trained together, optimizing a single loss function. The i-vector and SSNN speaker adaptation methods are compared on AMI meeting data. The results show comparable performance of both techniques on an FBANK system with frame-classification training. Moreover, appending both the i-vector and the "summary vector" to the FBANK features leads to an additional improvement, comparable to the performance of an FMLLR-adapted DNN system.
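The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the tanh nonlinearity, the random weights, and all function and variable names are hypothetical, and averaging the per-frame SSNN outputs over time is an assumption about how the utterance-level summary is formed. Only the forward pass and the single frame-level loss are shown; in joint training, the gradient of that one loss would update both parameter sets.

```python
import numpy as np

rng = np.random.default_rng(0)

def init(n_in, n_hid, n_out):
    # Random weights for a one-hidden-layer network (sizes are illustrative).
    return {
        "w1": rng.normal(0.0, 0.1, (n_in, n_hid)), "b1": np.zeros(n_hid),
        "w2": rng.normal(0.0, 0.1, (n_hid, n_out)), "b2": np.zeros(n_out),
    }

def forward(x, p):
    # Hidden tanh layer followed by a linear output layer.
    return np.tanh(x @ p["w1"] + p["b1"]) @ p["w2"] + p["b2"]

def summary_vector(frames, ssnn):
    # SSNN: transform every frame, then average over time (assumed pooling)
    # to obtain one fixed-size vector per utterance.
    return forward(frames, ssnn).mean(axis=0)

def main_network(frames, summary, main):
    # Append the same summary vector to every input frame,
    # then compute per-frame class posteriors with a softmax.
    tiled = np.tile(summary, (frames.shape[0], 1))
    logits = forward(np.concatenate([frames, tiled], axis=1), main)
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

FEAT_DIM, SUMMARY_DIM, HIDDEN, N_CLASSES, N_FRAMES = 40, 8, 32, 10, 50
ssnn = init(FEAT_DIM, HIDDEN, SUMMARY_DIM)
main = init(FEAT_DIM + SUMMARY_DIM, HIDDEN, N_CLASSES)

utterance = rng.normal(size=(N_FRAMES, FEAT_DIM))      # stand-in for FBANK frames
targets = rng.integers(0, N_CLASSES, size=N_FRAMES)    # stand-in frame labels

summary = summary_vector(utterance, ssnn)
posteriors = main_network(utterance, summary, main)

# Single frame-classification loss (cross-entropy) shared by both networks.
loss = -np.mean(np.log(posteriors[np.arange(N_FRAMES), targets]))
```

Combining both adaptation vectors, as in the last experiment mentioned in the abstract, would amount to concatenating an i-vector alongside `summary` before tiling; the i-vector extractor itself is outside the scope of this sketch.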
BibTeX:
@INPROCEEDINGS{
   author = {Karel Vesel{\'{y}} and Shinji Watanabe and
	Kate{\v{r}}ina {\v{Z}}mol{\'{i}}kov{\'{a}} and
	Martin Karafi{\'{a}}t and Luk{\'{a}}{\v{s}} Burget
	and Jan {\v{C}}ernock{\'{y}}},
   title = {Sequence Summarizing Neural Network for Speaker
	Adaptation},
   pages = {5315--5319},
   booktitle = {Proceedings of the 41st IEEE International Conference on
	Acoustics, Speech and Signal Processing (ICASSP 2016), 2016},
   year = 2016,
   location = {Shanghai, CN},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-4799-9988-0},
   doi = {10.1109/ICASSP.2016.7472692},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php.cs?id=11145}
}
