Journal article

ZEINALI Hossein, SAMETI Hossein and BURGET Lukáš. HMM-Based Phrase-Independent i-Vector Extractor for Text-Dependent Speaker Verification. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING. New York City: IEEE Signal Processing Society, 2017, vol. 25, no. 7, pp. 1421-1435. ISSN 2329-9290. Available from: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7902120
Publication language:english
Original title:HMM-Based Phrase-Independent i-Vector Extractor for Text-Dependent Speaker Verification
Title (cs):Extraktor i-vektorů pro ověřování mluvčího závislé na textu založený na HMM a nezávislý na promluvě
Pages:1421-1435
Place:US
Year:2017
URL:http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7902120
Journal:IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, Vol. 25, No. 7, New York City, US
ISSN:2329-9290
URL:http://www.fit.vutbr.cz/research/groups/speech/publi/2017/zeinali_ieee_acm%20transactions2017_07902120.pdf [PDF]
Files:
zeinali_ieee_acm transactions2017_07902120.pdf (795 KB, last modified 2017-06-09 13:20:02)
Keywords
Bottleneck features, DNN, hidden Markov model (HMM), i-vector, text-dependent speaker verification.
Annotation
This article describes a new HMM structure for text-dependent speaker verification, which enables the authors to combine the HMM's ability to model time sequences with the established i-vector technique.
Abstract
The low-dimensional i-vector representation of speech segments is used in the state-of-the-art text-independent speaker verification systems. However, i-vectors were deemed unsuitable for the text-dependent task, where simpler and older speaker recognition approaches were found more effective. In this work, we propose a straightforward hidden Markov model (HMM) based extension of the i-vector approach, which allows i-vectors to be successfully applied to text-dependent speaker verification. In our approach, the Universal Background Model (UBM) for training phrase-independent i-vector extractor is based on a set of monophone HMMs instead of the standard Gaussian Mixture Model (GMM). To compensate for the channel variability, we propose to precondition i-vectors using a regularized variant of within-class covariance normalization, which can be robustly estimated in a phrase-dependent fashion on the small datasets available for the text-dependent task. The verification scores are cosine similarities between the i-vectors normalized using phrase-dependent s-norm. The experimental results on RSR2015 and RedDots databases confirm the effectiveness of the proposed approach, especially in rejecting test utterances with a wrong phrase. A simple MFCC based i-vector/HMM system performs competitively when compared to very computationally expensive DNN-based approaches or the conventional relevance MAP GMM-UBM, which does not allow for compact speaker representations. To our knowledge, this paper presents the best published results obtained with a single system on both RSR2015 and RedDots dataset.
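The scoring back end named in the abstract (regularized within-class covariance normalization, cosine similarity, s-norm) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the regularizer form `(1 - alpha) * W + alpha * I`, the value of `alpha`, and the random toy data are all assumptions made for illustration; the exact regularization used in the article is not given in this record. In a phrase-dependent setup, the training i-vectors and the cohort would both come from utterances of the same phrase.

```python
import numpy as np

def length_normalize(x):
    """Scale vectors (rows, or a single vector) to unit Euclidean length."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def regularized_wccn(ivectors, labels, alpha=0.1):
    """Return B such that B @ B.T equals the inverse of the regularized
    within-class covariance. The regularizer (1 - alpha) * W + alpha * I
    is an assumed form, not taken from the article."""
    ivectors = np.asarray(ivectors, dtype=float)
    labels = np.asarray(labels)
    dim = ivectors.shape[1]
    classes = np.unique(labels)
    within = np.zeros((dim, dim))
    for c in classes:
        x = ivectors[labels == c]
        centered = x - x.mean(axis=0)
        within += centered.T @ centered / len(x)
    within /= len(classes)
    within_reg = (1.0 - alpha) * within + alpha * np.eye(dim)
    return np.linalg.cholesky(np.linalg.inv(within_reg))

def cosine_score(a, b):
    """Cosine similarity between two vectors."""
    return float(length_normalize(a) @ length_normalize(b))

def s_norm(raw, enroll, test, cohort):
    """Symmetric normalization: average the raw score's z-scores against
    the enrollment-vs-cohort and test-vs-cohort score distributions."""
    cohort_n = length_normalize(cohort)
    e = cohort_n @ length_normalize(enroll)
    t = cohort_n @ length_normalize(test)
    return 0.5 * ((raw - e.mean()) / e.std() + (raw - t.mean()) / t.std())

# Toy data standing in for i-vectors of a single phrase (hypothetical).
rng = np.random.default_rng(0)
train = rng.normal(size=(60, 8))          # 12 speakers x 5 sessions
speakers = np.repeat(np.arange(12), 5)
B = regularized_wccn(train, speakers, alpha=0.1)

# Row vectors are projected with X @ B (equivalent to B.T @ x per vector).
cohort = train @ B
enroll = rng.normal(size=8) @ B
test = rng.normal(size=8) @ B

raw = cosine_score(enroll, test)
print("raw cosine:", raw, "s-norm:", s_norm(raw, enroll, test, cohort))
```

With `alpha = 1.0` the assumed regularizer reduces to the identity matrix, so the projection leaves the i-vectors unchanged. Smaller values let the estimated within-class covariance have more influence, which is the trade-off that matters when only a small phrase-specific dataset is available.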
BibTeX:
@ARTICLE{
   author = {Hossein Zeinali and Hossein Sameti and Luk{\'{a}}{\v{s}}
	Burget},
   title = {HMM-Based Phrase-Independent i-Vector Extractor for
	Text-Dependent Speaker Verification},
   pages = {1421--1435},
   journal = {IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE
	PROCESSING},
   volume = {25},
   number = {7},
   year = {2017},
   ISSN = {2329-9290},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=11466}
}
