Multilingual recognition and search in speech for electronic dictionaries
| Research leader: | Černocký Jan |
| Team leaders: | Burget Lukáš, Grézl František, Karafiát Martin, Matějka Pavel, Schwarz Petr, Žižka Josef |
| Team members: | Kubalík Jakub, Tomášek Pavel, Veselý Karel |
| Agency: | MPO ČR |
| Code: | FR-TI1/034 |
| Start: | 2009 |
| End: | 2013 |
| Keywords: | multilinguality, speech recognition, keyword spotting, electronic dictionaries |
| Annotation: | The proposed project aims at research, development and assessment of technologies for prototyping speech recognition and search systems with only a few hours of transcribed training data, without the need for phonetic or linguistic expertise. These technologies will be tested in the domain of electronic dictionaries. |
Publications
| 2013 | Janda, M.: Automatic Generation Of Pronunciation Dictionaries Based On Diarization, In: Proceedings of the 19th Conference Student EEICT 2013, Brno, CZ, VUT v Brně, 2013, p. 228-232, ISBN 978-80-214-4695-3 |
| 2012 | Brummer, N., Cumani, S., Glembek, O., Karafiát, M., Matějka, P., Pešán, J., Plchot, O., Soufifar, M., de Villiers, E., Černocký, J.: Description and analysis of the Brno276 system for LRE2011, In: Proceedings of Odyssey 2012: The Speaker and Language Recognition Workshop, Singapore, SG, ISCA, 2012, p. 216-223, ISBN 978-981-07-3093-2 |
| | Janda, M., Karafiát, M., Černocký, J.: Dealing with Numbers in Grapheme-Based Speech Recognition, In: Proceedings of 15th International Conference on Text, Speech and Dialogue, Springer-Verlag Berlin Heidelberg 2012, DE, Springer, 2012, p. 438-445, ISBN 978-3-642-32789-6, ISSN 0302-9743 |
| | Janda, M.: Grapheme Based Speech Recognition, In: Proceedings of the 18th Conference STUDENT EEICT 2012, Brno, CZ, VUT v Brně, 2012, p. 441-445, ISBN 978-80-214-4460-7 |
| | Karafiát, M., Janda, M., Černocký, J., Burget, L.: Region Dependent Linear Transforms in Multilingual Speech Recognition, In: Proc. International Conference on Acoustics, Speech, and Signal Processing 2012, Kyoto, JP, IEEESP, 2012, p. 4885-4888, ISBN 978-1-4673-0044-5 |
| | Kombrink, S., Mikolov, T., Karafiát, M., Burget, L.: Improving Language Models for ASR Using Translated In-domain Data, In: Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, JP, IEEESP, 2012, p. 4405-4408, ISBN 978-1-4673-0044-5 |
| | Plchot, O., Karafiát, M., Brummer, N., Glembek, O., Matějka, P., de Villiers, E., Černocký, J.: Speaker vectors from Subspace Gaussian Mixture Model as complementary features for Language Identification, In: Proceedings of Odyssey 2012: The Speaker and Language Recognition Workshop, Singapore, SG, ISCA, 2012, p. 330-333, ISBN 978-981-07-3093-2 |
| | Szőke, I., Fapšo, M., Veselý, K.: BUT2012 Approaches for Spoken Web Search - MediaEval 2012, In: CEUR Workshop Proceedings, Vol. 2012, No. 927, DE, p. 1-2, ISSN 1613-0073 |
| | Tejedor, J., Fapšo, M., Szőke, I., Černocký, J., Grézl, F.: Comparison of methods for language-dependent and language-independent query-by-example spoken term detection, In: ACM Transactions on Information Systems (TOIS), Vol. 2012, No. 30, New York, US, p. 1-34, ISSN 1046-8188 |
| | Veselý, K., Karafiát, M., Grézl, F., Janda, M., Egorova, E.: The Language-Independent Bottleneck Features, In: Proceedings of IEEE 2012 Workshop on Spoken Language Technology, Miami, US, IEEESP, 2012, p. 336-341, ISBN 978-1-4673-5124-9 |
| 2011 | Grézl, F., Karafiát, M., Janda, M.: Study of Probabilistic and Bottle-Neck Features in Multilingual Environment, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 359-364, ISBN 978-1-4673-0366-8 |
| | Karafiát, M., Burget, L., Matějka, P., Glembek, O., Černocký, J.: iVector-Based Discriminative Adaptation for Automatic Speech Recognition, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 152-157, ISBN 978-1-4673-0366-8 |
| | Mikolov, T., Deoras, A., Povey, D., Burget, L., Černocký, J.: Strategies for Training Large Scale Neural Network Language Models, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 196-201, ISBN 978-1-4673-0366-8 |
| | Mikolov, T., Kombrink, S., Deoras, A., Burget, L., Černocký, J.: RNNLM - Recurrent Neural Network Language Modeling Toolkit, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 1-4, ISBN 978-1-4673-0366-8 |
| | Povey, D., Burget, L., Agarwal, M., Akyazi, P., Ghoshal, A., Glembek, O., Goel, N., K., Karafiát, M., Rastrow, A., Rose, R., Schwarz, P., Thomas, S. et al: The subspace Gaussian mixture model-A structured model for speech recognition, In: Computer Speech and Language, Vol. 25, No. 2, 2011, Amsterdam, NL, p. 404-439, ISSN 0885-2308 |
| | Povey, D., Karafiát, M., Ghoshal, A., Schwarz, P.: A Symmetrization of the Subspace Gaussian Mixture Model, In: Proceedings of 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, Praha, CZ, IEEESP, 2011, p. 4504-4507, ISBN 978-1-4577-0537-3 |
| | Veselý, K., Karafiát, M., Grézl, F.: Convolutive Bottleneck Network Features for LVCSR, In: Proceedings of ASRU 2011, Big Island, Hawaii, US, IEEESP, 2011, p. 42-47, ISBN 978-1-4673-0366-8 |
| 2010 | Burget, L., Schwarz, P., Agarwal, M., Akyazi, P., Feng, K., Ghoshal, A., Glembek, O., Goel, N., K., Karafiát, M., Povey, D., Rastrow, A., Rose, R., Thomas, S.: Multilingual acoustic modeling for speech recognition based on Subspace Gaussian Mixture Models, In: Proc. International Conference on Acoustics, Speech, and Signal Processing, Dallas, US, IEEESP, 2010, p. 4334-4337, ISBN 978-1-4244-4296-6, ISSN 1520-6149 |
| | Ghoshal, A., Povey, D., Agarwal, M., Akyazi, P., Burget, L., Feng, K., Glembek, O., Goel, N., K., Karafiát, M., Rastrow, A., Rose, R., Schwarz, P., Thomas, S.: A novel estimation of feature-space MLLR for full-covariance models, In: Proc. International Conference on Acoustics, Speech, and Signal Processing, Dallas, US, IEEESP, 2010, p. 4310-4313, ISBN 978-1-4244-4296-6, ISSN 1520-6149 |
| | Goel, N., K., Thomas, S., Agarwal, M., Akyazi, P., Burget, L., Feng, K., Ghoshal, A., Glembek, O., Karafiát, M., Povey, D., Rastrow, A., Rose, R., Schwarz, P.: Approaches to automatic lexicon learning with limited training examples, In: Proc. International Conference on Acoustics, Speech, and Signal Processing, Dallas, US, IEEESP, 2010, p. 5094-5097, ISBN 978-1-4244-4296-6, ISSN 1520-6149 |
| | Povey, D., Burget, L., Agarwal, M., Akyazi, P., Feng, K., Ghoshal, A., Glembek, O., Goel, N., K., Karafiát, M., Rastrow, A., Rose, R., Schwarz, P., Thomas, S.: Subspace Gaussian mixture models for speech recognition, In: Proc. International Conference on Acoustics, Speech, and Signal Processing, Dallas, US, IEEESP, 2010, p. 4330-4333, ISBN 978-1-4244-4296-6, ISSN 1520-6149 |