Papers
| Ferrer, L., Burget, L., Plchot, O., Scheffer, N.: A Unified Approach for Audio Characterization and its Application to Speaker Recognition. In: Proceedings of Odyssey 2012, The Speaker and Language Recognition Workshop, Singapore, SG, ISCA, 2012, pp. 317-323, ISBN 978-981-07-3093-2 | | Publication language: | English |
|---|
| Original title: | A Unified Approach for Audio Characterization and its Application to Speaker Recognition |
|---|
| Title (cs): | Unifikovaný přístup k charakterizaci audio nahrávek a jeho aplikace pro rozpoznávání řečníka |
|---|
| Pages: | 317-323 |
|---|
| Proceedings: | Proceedings of Odyssey 2012, The Speaker and Language Recognition Workshop |
|---|
| Conference: | Odyssey 2012: The Speaker and Language Recognition Workshop |
|---|
| Place: | Singapore, SG |
|---|
| Year: | 2012 |
|---|
| ISBN: | 978-981-07-3093-2 |
|---|
| Publisher: | International Speech Communication Association |
|---|
| URL: | http://www.fit.vutbr.cz/research/groups/speech/publi/2012/ferrer_odyssey2012_317-323-59.pdf [PDF] |
|---|
| Keywords |
|---|
| audio characterization, speaker recognition, i-vector, calibration metadata |
| Annotation |
|---|
| The technique proposed in this work extracts a very low-dimensional vector that encodes information about chosen characteristics of an audio signal, such as the type and level of background noise, reverberation, and transmission channel. The reported experiments show that such information can be very useful for calibration and fusion of speaker verification systems. |
| Abstract |
|---|
| Systems designed to solve speech processing tasks like speech
or speaker recognition, language identification, or emotion detection
are known to be affected by the recording conditions of
the acoustic signal, like the channel, background noise, reverberation,
and so on. Knowledge of the nuisance characteristics
present in the signal can be used to improve performance of the
system. In some cases, the nature of these nuisance characteristics
is known a priori, but in most practical cases it is not. Most
approaches used to automatically detect the characteristics of a
signal are designed for a specific type of effect: noise, reverberation,
language, type of channel, and so on. We propose
a method for detecting the audio characteristics of a signal in a
unified way, based on iVectors. We show results for the detector
itself and for its use as metadata during calibration of a state-of-the-art
speaker recognition system based on iVectors extracted
from Mel frequency cepstral coefficients. Results show relative
gains in equal error rate of up to 15% in a variety of recording
conditions. |
| BibTeX: |
|---|
@INPROCEEDINGS{ferrer2012unified,
author = {Luciana Ferrer and Lukáš Burget and Oldřich Plchot and
Nicolas Scheffer},
title = {A Unified Approach for Audio Characterization and its
Application to Speaker Recognition},
pages = {317--323},
booktitle = {Proceedings of Odyssey 2012, The Speaker and Language
Recognition Workshop},
year = {2012},
location = {Singapore, SG},
publisher = {International Speech Communication Association},
ISBN = {978-981-07-3093-2},
language = {english},
url = {http://www.fit.vutbr.cz/research/view_pub.php?id=10053}
} |