Conference paper
MOTLÍČEK Petr, DEY Subhadeep, MADIKERI Srikanth and BURGET Lukáš. Employment of Subspace Gaussian Mixture Models in Speaker Recognition. In: Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. South Brisbane, Queensland: IEEE Signal Processing Society, 2015, pp. 4445-4449. ISBN 978-1-4673-6997-8.
Publication language: | English |
---|
Original title: | Employment of Subspace Gaussian Mixture Models in Speaker Recognition |
---|
Title (cs): | Využití podprostorových modelů Gaussovských směsí pro rozpoznávání mluvčího |
---|
Pages: | 4445-4449 |
---|
Proceedings: | Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing |
---|
Conference: | 40th International Conference on Acoustics, Speech and Signal Processing |
---|
Place: | South Brisbane, Queensland, AU |
---|
Year: | 2015 |
---|
ISBN: | 978-1-4673-6997-8 |
---|
Publisher: | IEEE Signal Processing Society |
---|
URL: | http://www.fit.vutbr.cz/research/groups/speech/publi/2015/motlicek_icassp2015_0004445.pdf [PDF] |
---|
| Keywords |
---|
speaker recognition, i-vectors, subspace Gaussian mixture models, automatic speech recognition |
Annotation |
---|
In this paper, we propose an alternative approach to speaker recognition
based on speaker vectors estimated using the
SGMM framework. |
Abstract |
---|
This paper presents a Subspace Gaussian Mixture Model (SGMM)
approach, employed as a probabilistic generative model to estimate
speaker vector representations that are subsequently used in the speaker
verification task. SGMMs have already been shown to significantly
outperform traditional HMM/GMMs in Automatic Speech Recognition
(ASR) applications. An extension to the basic SGMM framework
allows low-dimensional speaker vectors to be estimated robustly
and exploited for speaker adaptation. We propose a speaker verification
framework based on low-dimensional speaker vectors estimated
using SGMMs, trained in an ASR manner using manual transcriptions.
To test the robustness of the system, we evaluate the
proposed approach against the state-of-the-art i-vector extractor
on the NIST SRE 2010 evaluation set under four utterance-length
conditions: 3-10 s, 10-30 s, 30-60 s,
and full (untruncated) utterances. Experimental results reveal that
while the i-vector system performs better on utterances truncated to
3-10 s and 10-30 s, noticeable improvements are observed
with SGMMs, especially on full-length utterances. Finally,
the proposed SGMM approach exhibits complementary properties
and can thus be efficiently fused with an i-vector-based speaker
verification system. |
BibTeX: |
---|
@INPROCEEDINGS{
author = {Petr Motl{\'{i}}{\v{c}}ek and Subhadeep Dey and Srikanth
Madikeri and Luk{\'{a}}{\v{s}} Burget},
title = {Employment of Subspace Gaussian Mixture Models in Speaker
Recognition},
pages = {4445--4449},
booktitle = {Proceedings of 2015 IEEE International Conference on
Acoustics, Speech and Signal Processing},
year = {2015},
location = {South Brisbane, Queensland, AU},
publisher = {IEEE Signal Processing Society},
ISBN = {978-1-4673-6997-8},
language = {english},
url = {http://www.fit.vutbr.cz/research/view_pub.php.en.iso-8859-2?id=10952}
} |