Multilingual recognition and search in speech for electronic dictionaries

Research leader: Černocký Jan
Team leaders: Burget Lukáš, Grézl František, Karafiát Martin, Matějka Pavel, Schwarz Petr, Žižka Josef
Team members: Kubalík Jakub, Tomášek Pavel, Veselý Karel
Agency: MPO ČR
Code: FR-TI1/034
Start: 2009
End: 2013
Keywords: multilinguality, speech recognition, keyword spotting, electronic dictionaries
Annotation:
The proposed project aims at research, development and assessment of technologies for prototyping of speech recognition and search systems with only a few hours of transcribed training data, without the need for phonetic or linguistic expertise. These technologies will be tested in the domain of electronic dictionaries.

Preceding projects

2006 Research and development of corpus and speech technologies in new generation of electronic dictionaries, MPO ČR, FT-TA3/006, 2006-2009, completed
Research leader: Černocký Jan
Team leaders: Fapšo Michal, Grézl František, Pešán Jan, Schwarz Petr, Szőke Igor

Publications

2013 Janda, M.: Automatic Generation Of Pronunciation Dictionaries Based On Diarization, In: Proceedings of the 19th Conference Student EEICT 2013, Brno, CZ, VUT v Brně, 2013, p. 228-232, ISBN 978-80-214-4695-3
2012 Brummer, N., Cumani, S., Glembek, O., Karafiát, M., Matějka, P., Pešán, J., Plchot, O., Soufifar, M., de Villiers, E., Černocký, J.: Description and analysis of the Brno276 system for LRE2011, In: Proceedings of Odyssey 2012: The Speaker and Language Recognition Workshop, Singapore, SG, ISCA, 2012, p. 216-223, ISBN 978-981-07-3093-2
 Janda, M., Karafiát, M., Černocký, J.: Dealing with Numbers in Grapheme-Based Speech Recognition, In: Proceedings of 15th International Conference on Text, Speech and Dialogue, Springer-Verlag Berlin Heidelberg 2012, DE, Springer, 2012, p. 438-445, ISBN 978-3-642-32789-6, ISSN 0302-9743
 Janda, M.: Grapheme Based Speech Recognition, In: Proceedings of the 18th Conference STUDENT EEICT 2012, Brno, CZ, VUT v Brně, 2012, p. 441-445, ISBN 978-80-214-4460-7
 Karafiát, M., Janda, M., Černocký, J., Burget, L.: Region Dependent Linear Transforms in Multilingual Speech Recognition, In: Proc. International Conference on Acoustics, Speech, and Signal Processing 2012, Kyoto, JP, IEEESP, 2012, p. 4885-4888, ISBN 978-1-4673-0044-5
 Kombrink, S., Mikolov, T., Karafiát, M., Burget, L.: Improving Language Models for ASR Using Translated In-domain Data, In: Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, JP, IEEESP, 2012, p. 4405-4408, ISBN 978-1-4673-0044-5
 Plchot, O., Karafiát, M., Brummer, N., Glembek, O., Matějka, P., de Villiers, E., Černocký, J.: Speaker vectors from Subspace Gaussian Mixture Model as complementary features for Language Identification, In: Proceedings of Odyssey 2012: The Speaker and Language Recognition Workshop, Singapore, SG, ISCA, 2012, p. 330-333, ISBN 978-981-07-3093-2
 Szőke, I., Fapšo, M., Veselý, K.: BUT2012 Approaches for Spoken Web Search - MediaEval 2012, In: CEUR Workshop Proceedings, Vol. 2012, No. 927, DE, p. 1-2, ISSN 1613-0073
 Tejedor, J., Fapšo, M., Szőke, I., Černocký, J., Grézl, F.: Comparison of methods for language-dependent and language-independent query-by-example spoken term detection, In: ACM Transactions on Information Systems (TOIS), Vol. 2012, No. 30, New York, US, p. 1-34, ISSN 1046-8188
 Veselý, K., Karafiát, M., Grézl, F., Janda, M., Egorova, E.: The Language-Independent Bottleneck Features, In: Proceedings of IEEE 2012 Workshop on Spoken Language Technology, Miami, US, IEEESP, 2012, p. 336-341, ISBN 978-1-4673-5124-9
2011 Grézl, F., Karafiát, M., Janda, M.: Study of Probabilistic and Bottle-Neck Features in Multilingual Environment, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 359-364, ISBN 978-1-4673-0366-8
 Karafiát, M., Burget, L., Matějka, P., Glembek, O., Černocký, J.: iVector-Based Discriminative Adaptation for Automatic Speech Recognition, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 152-157, ISBN 978-1-4673-0366-8
 Mikolov, T., Deoras, A., Povey, D., Burget, L., Černocký, J.: Strategies for Training Large Scale Neural Network Language Models, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 196-201, ISBN 978-1-4673-0366-8
 Mikolov, T., Kombrink, S., Deoras, A., Burget, L., Černocký, J.: RNNLM - Recurrent Neural Network Language Modeling Toolkit, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 1-4, ISBN 978-1-4673-0366-8
 Povey, D., Burget, L., Agarwal, M., Akyazi, P., Ghoshal, A., Glembek, O., Goel, N. K., Karafiát, M., Rastrow, A., Rose, R., Schwarz, P., Thomas, S. et al.: The subspace Gaussian mixture model - A structured model for speech recognition, In: Computer Speech and Language, Vol. 25, No. 2, 2011, Amsterdam, NL, p. 404-439, ISSN 0885-2308
 Povey, D., Karafiát, M., Ghoshal, A., Schwarz, P.: A Symmetrization of the Subspace Gaussian Mixture Model, In: Proceedings of 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, Prague, CZ, IEEESP, 2011, p. 4504-4507, ISBN 978-1-4577-0537-3
 Veselý, K., Karafiát, M., Grézl, F.: Convolutive Bottleneck Network Features for LVCSR, In: Proceedings of ASRU 2011, Big Island, Hawaii, US, IEEESP, 2011, p. 42-47, ISBN 978-1-4673-0366-8
2010 Burget, L., Schwarz, P., Agarwal, M., Akyazi, P., Feng, K., Ghoshal, A., Glembek, O., Goel, N. K., Karafiát, M., Povey, D., Rastrow, A., Rose, R., Thomas, S.: Multilingual acoustic modeling for speech recognition based on Subspace Gaussian Mixture Models, In: Proc. International Conference on Acoustics, Speech, and Signal Processing, Dallas, US, IEEESP, 2010, p. 4334-4337, ISBN 978-1-4244-4296-6, ISSN 1520-6149
 Ghoshal, A., Povey, D., Agarwal, M., Akyazi, P., Burget, L., Feng, K., Glembek, O., Goel, N. K., Karafiát, M., Rastrow, A., Rose, R., Schwarz, P., Thomas, S.: A novel estimation of feature-space MLLR for full-covariance models, In: Proc. International Conference on Acoustics, Speech, and Signal Processing, Dallas, US, IEEESP, 2010, p. 4310-4313, ISBN 978-1-4244-4296-6, ISSN 1520-6149
 Goel, N. K., Thomas, S., Agarwal, M., Akyazi, P., Burget, L., Feng, K., Ghoshal, A., Glembek, O., Karafiát, M., Povey, D., Rastrow, A., Rose, R., Schwarz, P.: Approaches to automatic lexicon learning with limited training examples, In: Proc. International Conference on Acoustics, Speech, and Signal Processing, Dallas, US, IEEESP, 2010, p. 5094-5097, ISBN 978-1-4244-4296-6, ISSN 1520-6149
 Povey, D., Burget, L., Agarwal, M., Akyazi, P., Feng, K., Ghoshal, A., Glembek, O., Goel, N. K., Karafiát, M., Rastrow, A., Rose, R., Schwarz, P., Thomas, S.: Subspace Gaussian mixture models for speech recognition, In: Proc. International Conference on Acoustics, Speech, and Signal Processing, Dallas, US, IEEESP, 2010, p. 4330-4333, ISBN 978-1-4244-4296-6, ISSN 1520-6149
