Conference paper

HSIAO Roger, MA Jeff, HARTMANN William, KARAFIÁT Martin, GRÉZL František, BURGET Lukáš, SZŐKE Igor, ČERNOCKÝ Jan, WATANABE Shinji, CHEN Zhuo, MALLIDI Sri Harish, HEŘMANSKÝ Hynek, TSAKALIDIS Stavros and SCHWARTZ Richard. Robust Speech Recognition in Unknown Reverberant and Noisy Conditions. In: Proceedings of 2015 IEEE Automatic Speech Recognition and Understanding Workshop. Scottsdale, Arizona: IEEE Signal Processing Society, 2015, pp. 533-538. ISBN 978-1-4799-7291-3.
Publication language: English
Original title: Robust Speech Recognition in Unknown Reverberant and Noisy Conditions
Title (cs): Robustní rozpoznávání řeči v neznámých podmínkách s reverberací a šumem
Pages: 533-538
Proceedings: Proceedings of 2015 IEEE Automatic Speech Recognition and Understanding Workshop
Conference: The 2015 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015)
Place: Scottsdale, Arizona, US
Year: 2015
ISBN: 978-1-4799-7291-3
Publisher: IEEE Signal Processing Society
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2015/hsiao_asru2015_0000533.pdf [PDF]
Files: hsiao_asru2015_0000533.pdf (139 KB, last modified 2017-03-01 18:39:18)
Keywords
ASpIRE challenge, robust speech recognition
Annotation
In this paper, we describe our work on the ASpIRE challenge. We experiment with and evaluate different approaches to tackling the performance degradation caused by noise and data mismatch. Our approaches include audio enhancement, data augmentation, unsupervised DNN adaptation, and system combination.
Abstract
In this paper, we describe our work on the ASpIRE (Automatic Speech recognition In Reverberant Environments) challenge, which aims to assess the robustness of automatic speech recognition (ASR) systems. The main characteristic of the challenge is developing a high-performance system without access to matched training and development data. While the evaluation data are recorded with far-field microphones in noisy and reverberant rooms, the training data consist of close-talking telephone speech. Our approach to this challenge includes speech enhancement, neural network methods and acoustic model adaptation. We show that these techniques can successfully alleviate the performance degradation caused by noisy audio and data mismatch.
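As a purely illustrative sketch, not taken from the paper: the data augmentation idea of simulating far-field, noisy training data from clean close-talking speech can be approximated by convolving a clean waveform with a room impulse response and mixing in noise at a target SNR. The file names, the 10 dB SNR, and the augment() helper below are assumptions for illustration only (Python with NumPy/SciPy).

# Illustrative sketch only: reverberate clean close-talking speech with a
# room impulse response (RIR) and add noise at a target SNR. All file
# names and the 10 dB SNR are hypothetical examples.
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

def augment(clean, rir, noise, snr_db=10.0):
    # Convolve with the RIR and truncate to the original length.
    reverberant = fftconvolve(clean, rir)[:len(clean)]
    # Tile or trim the noise to match, then scale it to the requested SNR.
    noise = np.resize(noise, len(reverberant))
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10.0)))
    return reverberant + scale * noise

sr, clean = wavfile.read("clean_utt.wav")     # hypothetical input files
_, rir = wavfile.read("room_impulse.wav")
_, noise = wavfile.read("babble_noise.wav")
noisy = augment(clean.astype(np.float64), rir.astype(np.float64),
                noise.astype(np.float64), snr_db=10.0)
noisy *= 32000.0 / max(np.max(np.abs(noisy)), 1e-12)  # keep within int16 range
wavfile.write("augmented_utt.wav", sr, noisy.astype(np.int16))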
BibTeX:
@INPROCEEDINGS{hsiao2015robust,
   author = {Roger Hsiao and Jeff Ma and William Hartmann and Martin
	Karafi{\'{a}}t and Franti{\v{s}}ek Gr{\'{e}}zl and
	Luk{\'{a}}{\v{s}} Burget and Igor Sz{\H{o}}ke and Jan
	{\v{C}}ernock{\'{y}} and Shinji Watanabe and Zhuo Chen and
	Sri Harish Mallidi and Hynek He{\v{r}}mansk{\'{y}} and
	Stavros Tsakalidis and Richard Schwartz},
   title = {Robust Speech Recognition in Unknown Reverberant and Noisy
	Conditions},
   pages = {533--538},
   booktitle = {Proceedings of 2015 IEEE Automatic Speech Recognition and
	Understanding Workshop},
   year = {2015},
   location = {Scottsdale, Arizona, US},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-4799-7291-3},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=11067}
}
