Thesis Details

Optimization of Gaussian Mixture Subspace Models and Related Scoring Algorithms in Speaker Verification

Ph.D. Thesis
Student: Glembek Ondřej
Academic Year: 2012/2013
Supervisor: Burget Lukáš, doc. Ing., Ph.D.
Czech title
Optimalizace modelování gaussovských směsí v podprostorech a jejich skórování v rozpoznávání mluvčího
Language
English
Abstract

This thesis deals with Gaussian Mixture Subspace Modeling in automatic speaker recognition. The thesis consists of three parts. In the first part, Joint Factor Analysis (JFA) scoring methods are studied. The methods differ mainly in how they deal with the channel of the tested utterance. The general JFA likelihood function is investigated and the methods are compared in terms of both accuracy and speed. It was found that a linear approximation of the log-likelihood function gives results comparable to the full log-likelihood evaluation while simplifying the formula and dramatically reducing the computation time.

In the second part, i-vector extraction is studied and two simplification methods are proposed. The motivation for this part was to allow the state-of-the-art technique to be used on small-scale devices and to set up a simple discriminative-training system. It is shown that, for long utterances, very fast and compact i-vector systems can be obtained at the cost of accuracy. On a short-utterance (5-second) task, the results of the simplified systems are comparable to those of the full i-vector extraction.

The third part deals with discriminative training in automatic speaker recognition. Previous work in the field is summarized and, based on the knowledge from the earlier chapters of this work, discriminative training of the i-vector extractor parameters is proposed. It is shown that discriminative re-training of the i-vector extractor can improve the system if the initial estimation is computed using the generative approach.
Keywords

Speaker Recognition, Gaussian Mixture Model, Subspace Modeling, i-vector, Joint Factor Analysis, Discriminative Training

Degree Programme
Computer Science and Engineering, Field of Study Computer Science and Engineering
Status
defended
Date
13 November 2012
Citation
GLEMBEK, Ondřej. Optimization of Gaussian Mixture Subspace Models and Related Scoring Algorithms in Speaker Verification. Brno, 2012. Ph.D. Thesis. Brno University of Technology, Faculty of Information Technology. 2012-11-13. Supervised by Burget Lukáš. Available from: https://www.fit.vut.cz/study/phd-thesis/209/
BibTeX
@phdthesis{FITPT209,
    author = "Ond\v{r}ej Glembek",
    type = "Ph.D. thesis",
    title = "Optimization of Gaussian Mixture Subspace Models and Related Scoring Algorithms in Speaker Verification",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2012,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/phd-thesis/209/"
}