Conference paper

LEI Yun, BURGET Lukáš, FERRER Luciana, GRACIARENA Martin and SCHEFFER Nicolas. Towards Noise-Robust Speaker Recognition Using Probabilistic Linear Discriminant Analysis. In: Proc. International Conference on Acoustics, Speech, and Signal P. Kyoto: IEEE Signal Processing Society, 2012, pp. 4253-4256. ISBN 978-1-4673-0044-5.
Publication language: English
Original title: Towards Noise-Robust Speaker Recognition Using Probabilistic Linear Discriminant Analysis
Title (cs): Posun k rozpoznávání mluvčího robustnímu vůči šumu pomocí pravděpodobnostní lineární diskriminační analýzy
Pages: 4253-4256
Proceedings: Proc. International Conference on Acoustics, Speech, and Signal P
Conference: The 37th International Conference on Acoustics, Speech, and Signal Processing
Place: Kyoto, JP
Year: 2012
ISBN: 978-1-4673-0044-5
Publisher: IEEE Signal Processing Society
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2012/lei_icassp2012_0004253.pdf [PDF]
Keywords
Speaker Recognition, noise, robustness, i-vector, PLDA
Annotation
We show results on a newly designed noisy corpus for speaker recognition where real recordings of babble noise were added to original NIST SRE clean speech data.
Abstract
This work addresses the problem of speaker verification where additive noise is present in the enrollment and testing utterances. We show how the current state-of-the-art framework can be effectively used to mitigate this effect. We first look at the degradation a standard speaker verification system is subjected to when presented with noisy speech waveforms. We designed and generated a corpus with noisy conditions, based on the NIST SRE 2008 and 2010 data, built using open-source tools and freely available noise samples. We then show how adding noisy training data in the current i-vector-based approach followed by probabilistic linear discriminant analysis (PLDA) can bring significant gains in accuracy at various signal-to-noise ratio (SNR) levels. We demonstrate that this improvement is not feature-specific, as we present positive results for three disparate sets of features: standard mel-frequency cepstral coefficients, prosodic polynomial coefficients and maximum likelihood linear regression (MLLR) transforms.
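The abstract describes adding recorded noise to clean speech at various SNR levels. The record does not say which tools or which SNR definition the authors used, so the following is only a minimal sketch of one common way to do it, assuming a plain full-band power ratio with no speech-activity detection or perceptual weighting. The function names `mix_at_snr` and `measured_snr_db` are hypothetical and do not come from the paper.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mix has the requested SNR in dB.

    Assumption (not from the paper): SNR is the ratio of mean signal power
    to mean noise power over the whole utterance.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise recording so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("speech and noise must both have non-zero power")

    # SNR_dB = 10 * log10(P_speech / P_noise), solved for the noise gain.
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    gain = np.sqrt(target_noise_power / noise_power)
    return speech + gain * noise


def measured_snr_db(speech, noisy):
    """Recover the SNR of a mix when the clean signal is known."""
    speech = np.asarray(speech, dtype=np.float64)
    residual = np.asarray(noisy, dtype=np.float64) - speech
    return 10.0 * np.log10(np.mean(speech ** 2) / np.mean(residual ** 2))
```

Calling `mix_at_snr(speech, noise, snr_db)` once per target level would produce one noisy copy of an utterance for each SNR condition; a real corpus design would also need to decide how noise segments are selected and whether silence is excluded from the power estimate.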
BibTeX:
@INPROCEEDINGS{
   author = {Yun Lei and Luk{\'{a}}{\v{s}} Burget and Luciana Ferrer and
	Martin Graciarena and Nicolas Scheffer},
   title = {Towards Noise-Robust Speaker Recognition Using Probabilistic
	Linear Discriminant Analysis},
   pages = {4253--4256},
   booktitle = {Proc. International Conference on Acoustics, Speech, and
	Signal P},
   year = {2012},
   location = {Kyoto, JP},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-4673-0044-5},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=9996}
}
