Conference paper

BOŘIL Hynek, GRÉZL František and HANSEN John H. Front-End Compensation Methods for LVCSR Under Lombard Effect. In: Proceedings of Interspeech 2011. Florence: International Speech Communication Association, 2011, pp. 1257-1260. ISBN 978-1-61839-270-1. ISSN 1990-9772.
Publication language:English
Original title:Front-End Compensation Methods for LVCSR Under Lombard Effect
Title (cs):Kompenzační techniky Front-Endu pro LVCSR řeči ovlivněné Lombardovým efektem
Pages:1257-1260
Proceedings:Proceedings of Interspeech 2011
Conference:Interspeech 2011
Place:Florence, IT
Year:2011
ISBN:978-1-61839-270-1
Journal:Proceedings of Interspeech, Vol. 2011, No. 8, FR
ISSN:1990-9772
Publisher:International Speech Communication Association
URL:http://www.fit.vutbr.cz/research/groups/speech/publi/2011/boril_interspeech2011_221.pdf [PDF]
Keywords
speech recognition, Lombard effect, UT-Scope database, bottleneck features, quantile-based cepstral distribution normalization, histogram equalization
Annotation
This paper describes front-end compensation methods for large vocabulary continuous speech recognition (LVCSR) under Lombard effect.
Abstract
This study analyzes the impact of noisy background variations and Lombard effect (LE) on large vocabulary continuous speech recognition (LVCSR). The robustness of several front-end feature extraction strategies, combined with state-of-the-art feature distribution normalizations, is tested on neutral and Lombard speech from the UT-Scope database presented in two types of background noise at various SNR levels. An extension of a bottleneck (BN) front-end that normalizes both critical band energies (CRBE) and BN outputs is proposed and shown to provide competitive performance compared to the best MFCC-based system. A novel MFCC-based BN front-end is introduced and shown to outperform all other systems in all conditions considered (average 4.1% absolute WER reduction over the second best system). Additionally, two phenomena are observed: (i) the combination of cepstral mean subtraction and the recently established RASTALP filtering significantly reduces transient effects of RASTA band-pass filtering and increases ASR robustness to noise and LE; (ii) histogram equalization may benefit from utilizing reference distributions derived from pre-normalized rather than raw training features, and also from adopting distributions from different front-ends.
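The two normalizations named in observation (ii) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the rank-based CDF estimate, and the toy data are assumptions made for illustration only. It shows per-utterance cepstral mean subtraction and a histogram equalization whose reference distribution is taken from pre-normalized (mean-subtracted) training features rather than raw ones.

```python
from statistics import fmean


def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean computed over one utterance.

    `frames` is a list of feature vectors (one list of floats per frame).
    """
    dims = len(frames[0])
    means = [fmean(f[d] for f in frames) for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]


def histogram_equalize(values, reference):
    """Map each value onto the reference quantile matching its empirical CDF.

    `values` holds one feature dimension of the utterance to normalize;
    `reference` holds samples of the same dimension from the reference
    (training) distribution. The CDF is estimated from ranks (an assumption).
    """
    n = len(values)
    ref = sorted(reference)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        p = (rank + 0.5) / n  # empirical CDF value in (0, 1)
        out[i] = ref[min(int(p * len(ref)), len(ref) - 1)]
    return out


# Hypothetical toy data: two coefficients per frame.
train_frames = [[1.0, 10.0], [3.0, 14.0], [5.0, 12.0]]
test_frames = [[7.0, 2.0], [9.0, 6.0], [8.0, 4.0]]

# Reference distribution taken from pre-normalized training features.
train_norm = cepstral_mean_subtraction(train_frames)
test_norm = cepstral_mean_subtraction(test_frames)

# Equalize the first coefficient of the test utterance.
equalized_c0 = histogram_equalize(
    [f[0] for f in test_norm], [f[0] for f in train_norm]
)
```

Swapping `train_norm` for `train_frames` in the last call would give the raw-reference variant that the abstract contrasts with the pre-normalized one.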
BibTeX:
@INPROCEEDINGS{
   author = {Hynek Bo{\v{r}}il and Franti{\v{s}}ek Gr{\'{e}}zl and John H. Hansen},
   title = {Front-End Compensation Methods for LVCSR Under Lombard Effect},
   pages = {1257--1260},
   booktitle = {Proceedings of Interspeech 2011},
   journal = {Proceedings of Interspeech},
   volume = {2011},
   number = {8},
   year = {2011},
   location = {Florence, IT},
   publisher = {International Speech Communication Association},
   ISBN = {978-1-61839-270-1},
   ISSN = {1990-9772},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=9756}
}
