Conference proceedings paper | |
| Lei, Y., Burget, L., Ferrer, L., Graciarena, M., Scheffer, N.: Towards Noise-Robust Speaker Recognition Using Probabilistic Linear Discriminant Analysis, In: Proc. International Conference on Acoustics, Speech, and Signal P, Kyoto, JP, IEEESP, 2012, pp. 4253-4256, ISBN 978-1-4673-0044-5 | | Publication language: | English |
|---|
| Publication title: | Towards Noise-Robust Speaker Recognition Using Probabilistic Linear Discriminant Analysis |
|---|
| Title (cs): | Posun k rozpoznávání mluvčího robustnímu vůči šumu pomocí pravděpodobnostní lineární diskriminační analýzy |
|---|
| Pages: | 4253-4256 |
|---|
| Proceedings: | Proc. International Conference on Acoustics, Speech, and Signal P |
|---|
| Conference: | The 37th International Conference on Acoustics, Speech, and Signal Processing |
|---|
| Place of publication: | Kyoto, JP |
|---|
| Year: | 2012 |
|---|
| ISBN: | 978-1-4673-0044-5 |
|---|
| Publisher: | IEEE Signal Processing Society |
|---|
| URL: | http://www.fit.vutbr.cz/research/groups/speech/publi/2012/lei_icassp2012_0004253.pdf [PDF] |
|---|
| Keywords |
|---|
| Speaker Recognition, noise, robustness,
i-vector, PLDA |
| Annotation |
|---|
| This paper discusses progress toward noise-robust speaker recognition using probabilistic linear discriminant analysis. |
| Abstract |
|---|
| This work addresses the problem of speaker verification
where additive noise is present in the enrollment and testing
utterances. We show how the current state-of-the-art framework
can be effectively used to mitigate this effect. We first
look at the degradation a standard speaker verification system
is subjected to when presented with noisy speech waveforms.
We designed and generated a corpus with noisy conditions,
based on the NIST SRE 2008 and 2010 data, built using
open-source tools and freely available noise samples. We then
show how adding noisy training data in the current i-vector-based
approach followed by probabilistic linear discriminant
analysis (PLDA) can bring significant gains in accuracy at
various signal-to-noise ratio (SNR) levels. We demonstrate
that this improvement is not feature-specific as we present
positive results for three disparate sets of features: standard
mel frequency cepstral coefficients, prosodic polynomial coefficients
and maximum likelihood linear regression (MLLR)
transforms. |
| BibTeX: |
|---|
@INPROCEEDINGS{
author = {Yun Lei and Lukáš Burget and Luciana Ferrer and Martin
Graciarena and Nicolas Scheffer},
title = {Towards Noise-Robust Speaker Recognition Using Probabilistic
Linear Discriminant Analysis},
pages = {4253--4256},
booktitle = {Proc. International Conference on Acoustics, Speech, and
Signal P},
year = {2012},
location = {Kyoto, JP},
publisher = {IEEE Signal Processing Society},
ISBN = {978-1-4673-0044-5},
language = {english},
url = {http://www.fit.vutbr.cz/research/view_pub.php?id=9996}
} |
|