Department of Computer Graphics and Multimedia

Multilingual recognition and search in speech for electronic dictionaries

Czech title: Multilingvální rozpoznávání a vyhledávání v řeči pro elektronické slovníky
Research leader: Černocký Jan
Team leaders: Burget Lukáš, Grézl František, Karafiát Martin, Matějka Pavel, Schwarz Petr, Žižka Josef
Team members: Kubalík Jakub (FIT VUT), Tomášek Pavel (FIT VUT), Veselý Karel (FIT VUT)
Agency: Ministry of Industry and Trade of the Czech Republic
Code: FR-TI1/034
Start: 2009-09-01
End: 2013-08-31
Keywords: multilinguality, speech recognition, keyword spotting, electronic dictionaries
Annotation:
The project aims to research, develop, and assess technologies for prototyping speech recognition and search systems from only a few hours of transcribed training data, without the need for phonetic or linguistic expertise. These technologies will be tested in the domain of electronic dictionaries.

Products

2013 Multilingual models for speech recognition, software, 2013
Authors: Karafiát Martin, Grézl František, Egorova Ekaterina, Janda Miloš, Černocký Jan
 Prototyping of speech recognizers for new languages, technology, 2013
Authors: Karafiát Martin, Grézl František, Egorova Ekaterina, Janda Miloš, Černocký Jan, Kašpar Michal

Preceding projects

2006 Research and development of corpus and speech technologies in new generation of electronic dictionaries, MPO CR, FT-TA3/006, 2006-2009, completed
Research leader: Černocký Jan
Team leaders: Fapšo Michal, Grézl František, Pešán Jan, Schwarz Petr, Szőke Igor

Publications

2013 EGOROVA Ekaterina, VESELÝ Karel, KARAFIÁT Martin, JANDA Miloš and ČERNOCKÝ Jan. Manual and Semi-Automatic Approaches to Building a Multilingual Phoneme Set. In: Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013, pp. 7324-7328. ISBN 978-1-4799-0355-9.
 JANDA Miloš. Automatic Generation Of Pronunciation Dictionaries Based On Diarization. In: Proceedings of the 19th Conference Student EEICT 2013. Brno: Brno University of Technology, 2013, pp. 228-232. ISBN 978-80-214-4695-3.
 SOUFIFAR Mehdi Mohammad, BURGET Lukáš, PLCHOT Oldřich, CUMANI Sandro and ČERNOCKÝ Jan. Regularized Subspace n-Gram Model for Phonotactic iVector Extraction. In: Proceedings of Interspeech 2013. Lyon: International Speech Communication Association, 2013, pp. 74-78. ISBN 978-1-62993-443-3. ISSN 2308-457X.
2012 BRUMMER Niko, CUMANI Sandro, GLEMBEK Ondřej, KARAFIÁT Martin, MATĚJKA Pavel, PEŠÁN Jan, PLCHOT Oldřich, SOUFIFAR Mehdi Mohammad, DE VILLIERS Edward and ČERNOCKÝ Jan. Description and analysis of the Brno276 system for LRE2011. In: Proceedings of Odyssey 2012: The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012, pp. 216-223. ISBN 978-981-07-3093-2.
 JANDA Miloš, KARAFIÁT Martin and ČERNOCKÝ Jan. Dealing with Numbers in Grapheme-Based Speech Recognition. In: Proceedings of 15th International Conference on Text, Speech and Dialogue. Springer-Verlag Berlin Heidelberg 2012: Springer Verlag, 2012, pp. 438-445. ISBN 978-3-642-32789-6. ISSN 0302-9743.
 JANDA Miloš. Grapheme Based Speech Recognition. In: Proceedings of the 18th Conference STUDENT EEICT 2012. Brno: Brno University of Technology, 2012, pp. 441-445. ISBN 978-80-214-4460-7.
 KARAFIÁT Martin, JANDA Miloš, ČERNOCKÝ Jan and BURGET Lukáš. Region Dependent Linear Transforms in Multilingual Speech Recognition. In: Proc. International Conference on Acoustics, Speech, and Signal Processing 2012. Kyoto: IEEE Signal Processing Society, 2012, pp. 4885-4888. ISBN 978-1-4673-0044-5.
 KOMBRINK Stefan, MIKOLOV Tomáš, KARAFIÁT Martin and BURGET Lukáš. Improving Language Models for ASR Using Translated In-domain Data. In: Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012, pp. 4405-4408. ISBN 978-1-4673-0044-5.
 PLCHOT Oldřich, KARAFIÁT Martin, BRUMMER Niko, GLEMBEK Ondřej, MATĚJKA Pavel, DE VILLIERS Edward and ČERNOCKÝ Jan. Speaker vectors from Subspace Gaussian Mixture Model as complementary features for Language Identification. In: Proceedings of Odyssey 2012, The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012, pp. 330-333. ISBN 978-981-07-3093-2.
 SZŐKE Igor, FAPŠO Michal and VESELÝ Karel. BUT2012 Approaches for Spoken Web Search - MediaEval 2012. In: Working Notes Proceedings of the MediaEval 2012 Workshop. Pisa: CEUR-WS.org, 2012, pp. 1-2. ISSN 1613-0073.
 TEJEDOR Javier, FAPŠO Michal, SZŐKE Igor, ČERNOCKÝ Jan and GRÉZL František. Comparison of methods for language-dependent and language-independent query-by-example spoken term detection. ACM Transactions on Information Systems (TOIS). New York: Association for Computing Machinery, 2012, vol. 2012, no. 30, pp. 1-34. ISSN 1046-8188.
 VESELÝ Karel, KARAFIÁT Martin, GRÉZL František, JANDA Miloš and EGOROVA Ekaterina. The Language-Independent Bottleneck Features. In: Proceedings of IEEE 2012 Workshop on Spoken Language Technology. Miami: IEEE Signal Processing Society, 2012, pp. 336-341. ISBN 978-1-4673-5124-9.
2011 GRÉZL František, KARAFIÁT Martin and JANDA Miloš. Study of Probabilistic and Bottle-Neck Features in Multilingual Environment. In: Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011, pp. 359-364. ISBN 978-1-4673-0366-8.
 KARAFIÁT Martin, BURGET Lukáš, MATĚJKA Pavel, GLEMBEK Ondřej and ČERNOCKÝ Jan. iVector-Based Discriminative Adaptation for Automatic Speech Recognition. In: Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011, pp. 152-157. ISBN 978-1-4673-0366-8.
 MIKOLOV Tomáš, DEORAS Anoop, POVEY Daniel, BURGET Lukáš and ČERNOCKÝ Jan. Strategies for Training Large Scale Neural Network Language Models. In: Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011, pp. 196-201. ISBN 978-1-4673-0366-8.
 MIKOLOV Tomáš, KOMBRINK Stefan, DEORAS Anoop, BURGET Lukáš and ČERNOCKÝ Jan. RNNLM - Recurrent Neural Network Language Modeling Toolkit. In: Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011, pp. 1-4. ISBN 978-1-4673-0366-8.
 POVEY Daniel, BURGET Lukáš, AGARWAL Mohit, AKYAZI Pinar, GHOSHAL Arnab, GLEMBEK Ondřej, GOEL Nagendra K., KARAFIÁT Martin, RASTROW Ariya, ROSE Richard, SCHWARZ Petr and THOMAS Samuel et al. The subspace Gaussian mixture model-A structured model for speech recognition. Computer Speech and Language. Amsterdam: Elsevier Science, 2011, vol. 25, no. 2, pp. 404-439. ISSN 0885-2308.
 POVEY Daniel, GHOSHAL Arnab, BOULIANNE Gilles, BURGET Lukáš, GLEMBEK Ondřej, GOEL Nagendra K., HANNEMANN Mirko, MOTLÍČEK Petr, QIAN Yanmin, SCHWARZ Petr, SILOVSKÝ Jan, STEMMER Georg and VESELÝ Karel. The Kaldi Speech Recognition Toolkit. In: Proceedings of ASRU 2011. Hilton Waikoloa Village Resort, Hawaii: IEEE Signal Processing Society, 2011, pp. 1-4. ISBN 978-1-4673-0366-8.
 POVEY Daniel, KARAFIÁT Martin, GHOSHAL Arnab and SCHWARZ Petr. A Symmetrization of the Subspace Gaussian Mixture Model. In: Proceedings of 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing. Praha: IEEE Signal Processing Society, 2011, pp. 4504-4507. ISBN 978-1-4577-0537-3.
 VESELÝ Karel, KARAFIÁT Martin and GRÉZL František. Convolutive Bottleneck Network Features for LVCSR. In: Proceedings of ASRU 2011. Big Island, Hawaii: IEEE Signal Processing Society, 2011, pp. 42-47. ISBN 978-1-4673-0366-8.
2010 BURGET Lukáš, SCHWARZ Petr, AGARWAL Mohit, AKYAZI Pinar, FENG Kai, GHOSHAL Arnab, GLEMBEK Ondřej, GOEL Nagendra K., KARAFIÁT Martin, POVEY Daniel, RASTROW Ariya, ROSE Richard and THOMAS Samuel. Multilingual acoustic modeling for speech recognition based on Subspace Gaussian Mixture Models. In: Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010, pp. 4334-4337. ISBN 978-1-4244-4296-6. ISSN 1520-6149.
 GHOSHAL Arnab, POVEY Daniel, AGARWAL Mohit, AKYAZI Pinar, BURGET Lukáš, FENG Kai, GLEMBEK Ondřej, GOEL Nagendra K., KARAFIÁT Martin, RASTROW Ariya, ROSE Richard, SCHWARZ Petr and THOMAS Samuel. A novel estimation of feature-space MLLR for full-covariance models. In: Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010, pp. 4310-4313. ISBN 978-1-4244-4296-6. ISSN 1520-6149.
 GOEL Nagendra K., THOMAS Samuel, AGARWAL Mohit, AKYAZI Pinar, BURGET Lukáš, FENG Kai, GHOSHAL Arnab, GLEMBEK Ondřej, KARAFIÁT Martin, POVEY Daniel, RASTROW Ariya, ROSE Richard and SCHWARZ Petr. Approaches to automatic lexicon learning with limited training examples. In: Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010, pp. 5094-5097. ISBN 978-1-4244-4296-6. ISSN 1520-6149.
 POVEY Daniel, BURGET Lukáš, AGARWAL Mohit, AKYAZI Pinar, FENG Kai, GHOSHAL Arnab, GLEMBEK Ondřej, GOEL Nagendra K., KARAFIÁT Martin, RASTROW Ariya, ROSE Richard, SCHWARZ Petr and THOMAS Samuel. Subspace Gaussian mixture models for speech recognition. In: Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010, pp. 4330-4333. ISBN 978-1-4244-4296-6. ISSN 1520-6149.
