Conference proceedings paper

 
Lei, Y., Burget, L., Ferrer, L., Graciarena, M., Scheffer, N.: Towards Noise-Robust Speaker Recognition Using Probabilistic Linear Discriminant Analysis, In: Proc. International Conference on Acoustics, Speech, and Signal P, Kyoto, JP, IEEESP, 2012, pp. 4253-4256, ISBN 978-1-4673-0044-5
Publication language: English
Publication title: Towards Noise-Robust Speaker Recognition Using Probabilistic Linear Discriminant Analysis
Title (cs): Posun k rozpoznávání mluvčího robustnímu vůči šumu pomocí pravděpodobnostní lineární diskriminační analýzy
Pages: 4253-4256
Proceedings: Proc. International Conference on Acoustics, Speech, and Signal P
Conference: The 37th International Conference on Acoustics, Speech, and Signal Processing
Place of publication: Kyoto, JP
Year: 2012
ISBN: 978-1-4673-0044-5
Publisher: IEEE Signal Processing Society
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2012/lei_icassp2012_0004253.pdf [PDF]
Keywords
Speaker Recognition, noise, robustness, i-vector, PLDA
Annotation
This paper discusses progress toward noise-robust speaker recognition using probabilistic linear discriminant analysis.
Abstract
This work addresses the problem of speaker verification where additive noise is present in the enrollment and testing utterances. We show how the current state-of-the-art framework can be effectively used to mitigate this effect. We first look at the degradation a standard speaker verification system is subjected to when presented with noisy speech waveforms. We designed and generated a corpus with noisy conditions, based on the NIST SRE 2008 and 2010 data, built using open-source tools and freely available noise samples. We then show how adding noisy training data in the current i-vector-based approach followed by probabilistic linear discriminant analysis (PLDA) can bring significant gains in accuracy at various signal-to-noise ratio (SNR) levels. We demonstrate that this improvement is not feature-specific as we present positive results for three disparate sets of features: standard mel frequency cepstral coefficients, prosodic polynomial coefficients and maximum likelihood linear regression (MLLR) transforms.
BibTeX:
@INPROCEEDINGS{
   author = {Yun Lei and Lukáš Burget and Luciana Ferrer and Martin
	Graciarena and Nicolas Scheffer},
   title = {Towards Noise-Robust Speaker Recognition Using Probabilistic
	Linear Discriminant Analysis},
   pages = {4253--4256},
   booktitle = {Proc. International Conference on Acoustics, Speech, and
	Signal P},
   year = {2012},
   location = {Kyoto, JP},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-4673-0044-5},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=9996}
}