Conference paper

ONDEL Lucas, BURGET Lukáš, ČERNOCKÝ Jan and KESIRAJU Santosh. Bayesian phonotactic language model for acoustic unit discovery. In: Proceedings of ICASSP 2017. New Orleans: IEEE Signal Processing Society, 2017, pp. 5750-5754. ISBN 978-1-5090-4117-6.
Publication language: English
Original title: Bayesian phonotactic language model for Acoustic Unit Discovery
Title (cs): Bayesovský fonotaktický jazykový model pro automatické hledání řečových jednotek
Pages: 5750-5754
Proceedings: Proceedings of ICASSP 2017
Conference: 42nd IEEE International Conference on Acoustics, Speech and Signal Processing
Place: New Orleans, US
Year: 2017
ISBN: 978-1-5090-4117-6
Publisher: IEEE Signal Processing Society
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2017/ondel_icassp2017_0005750.pdf [PDF]
Files: ondel_icassp2017_0005750.pdf (357 KB, last modified 2017-06-09 17:14:08)
Keywords
Bayesian non-parametric, Variational Bayes, acoustic unit discovery
Annotation
This article presents a Bayesian phonotactic language model for acoustic unit discovery (AUD). It builds on recent AUD work that led to the development of a non-parametric Bayesian phone-loop model.
Abstract
Recent work on Acoustic Unit Discovery (AUD) has led to the development of a non-parametric Bayesian phone-loop model where the prior over the probability of the phone-like units is assumed to be sampled from a Dirichlet Process (DP). In this work, we propose to improve this model by incorporating a Hierarchical Pitman-Yor based bigram Language Model on top of the unit transitions. This new model makes use of the phonotactic context information but assumes a fixed number of units. To remedy this limitation, we first train a DP phone-loop model to infer the number of units; the bigram phone-loop is then initialized from the DP phone-loop and trained until convergence of its parameters. Results show an absolute improvement of 1-2% on the Normalized Mutual Information (NMI) metric. Furthermore, we show that, combined with Multilingual Bottleneck (MBN) features, the model yields the same or higher NMI than an English phone recogniser trained on TIMIT.
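The NMI metric mentioned in the abstract compares discovered unit labels against reference phone labels. The following is a minimal sketch only, not the authors' evaluation code. It assumes two frame-aligned label sequences and a symmetric normalization by the two label entropies; this record does not state the paper's exact normalization, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in nats) of a distribution given as raw counts."""
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def normalized_mutual_information(reference, discovered):
    """NMI between two aligned label sequences, in the range [0, 1].

    Assumption: NMI = 2 * I(X; Y) / (H(X) + H(Y)). Other normalizations
    (for example dividing by H(X) only) are also in use.
    """
    if len(reference) != len(discovered):
        raise ValueError("label sequences must have the same length")
    n = len(reference)
    if n == 0:
        raise ValueError("label sequences must not be empty")

    ref_counts = Counter(reference)
    dis_counts = Counter(discovered)
    joint_counts = Counter(zip(reference, discovered))

    # I(X; Y) = sum over (x, y) of p(x, y) * log(p(x, y) / (p(x) * p(y)))
    mutual_info = sum(
        (c / n) * math.log(c * n / (ref_counts[r] * dis_counts[d]))
        for (r, d), c in joint_counts.items()
    )

    denom = entropy(ref_counts.values(), n) + entropy(dis_counts.values(), n)
    # Both labelings constant: treated here as perfect agreement by convention.
    return 2.0 * mutual_info / denom if denom > 0 else 1.0


# Hypothetical toy data: reference phones vs. discovered unit indices.
phones = ["a", "a", "b", "b"]
units = [1, 1, 0, 0]
score = normalized_mutual_information(phones, units)
```

Because the score depends only on how the two labelings co-occur, a discovered inventory that is a pure renaming of the reference phones scores close to 1, while statistically independent labelings score close to 0.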
BibTeX:
@INPROCEEDINGS{
   author = {Lucas Ondel and Luk{\'{a}}{\v{s}} Burget and Jan
	{\v{C}}ernock{\'{y}} and Santosh Kesiraju},
   title = {Bayesian phonotactic language model for Acoustic Unit
	Discovery},
   pages = {5750--5754},
   booktitle = {Proceedings of ICASSP 2017},
   year = {2017},
   location = {New Orleans, US},
   publisher = {IEEE Signal Processing Society},
   ISBN = {978-1-5090-4117-6},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php.en.iso-8859-2?id=11472}
}
