Department of Computer Graphics and Multimedia


Technologies of speech processing for efficient human-machine communication

Research leader: Cernocký Jan
Team leaders: Hannemann Mirko, Hermanský Hynek, Szoke Igor, Zizka Josef
Agency: TACR
Code: TA01011328
Start: 2011
End: 2014
Keywords: speech recognition, electronic dictionaries, defense and security, mobile devices, dialogue systems, CRM, eLearning
Annotation:
The project aims to develop advanced speech recognition techniques and deploy them in functional applications: search in electronic dictionaries on mobile devices, dictation of translations, defense and security, dialogue systems, client-care systems (CRM, helpdesk, etc.), and audio-visual access to teaching materials.

Products

2012 KALDI speech recognition toolkit, software, 2012
Authors: Povey Daniel, Ghoshal Arnab, Boulianne Gilles, Burget Lukás, Glembek Ondrej, Goel Nagendra K., Hannemann Mirko, Motlícek Petr, Qian Yanmin, Schwarz Petr, Silovský Jan, Stemmer Georg, Veselý Karel

Publications

2012 Cumani Sandro, Plchot Oldrich, Karafiát Martin: Independent Component Analysis and MLLR Transforms for Speaker Identification, In: Proc. International Conference on Acoustics, Speech, and Signal P, Kyoto, JP, IEEESP, 2012, p. 4365-4368, ISBN 978-1-4673-0044-5
 Deoras Anoop, Mikolov Tomás, Kombrink Stefan, Church Kenneth: Approximate inference: A sampling based modeling technique to capture complex dependencies in a language model, In: Speech Communication, Vol. 2012, No. 8, Amsterdam, NL, p. 1-16, ISSN 0167-6393
 Karafiát Martin, Janda Milos, Cernocký Jan, Burget Lukás: Region Dependent Linear Transforms in Multilingual Speech Recognition, In: Proc. International Conference on Acoustics, Speech, and Signal Processing 2012, Kyoto, JP, IEEESP, 2012, p. 4885-4888, ISBN 978-1-4673-0044-5
 Kombrink Stefan, Mikolov Tomás, Karafiát Martin, Burget Lukás: Improving Language Models for ASR Using Translated In-domain Data, In: Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, JP, IEEESP, 2012, p. 4405-4408, ISBN 978-1-4673-0044-5
 Povey Daniel, Hannemann Mirko, Boulianne Gilles, Burget Lukás, Ghoshal Arnab, Janda Milos, Karafiát Martin, Kombrink Stefan, Motlícek Petr, Qian Yanmin, Riedhammer Korbinian, Veselý Karel, Vu Ngoc Thang: Generating Exact Lattices in The WFST Framework, In: Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, JP, IEEESP, 2012, p. 4213-4216, ISBN 978-1-4673-0044-5
 Rath Shakti P., Karafiát Martin, Glembek Ondrej, Cernocký Jan: A factorized representation of FMLLR transform based on QR-decomposition, In: Proceedings of Interspeech 2012, Portland, Oregon, US, ISCA, 2012, p. 1-4, ISBN 978-1-62276-759-5, ISSN 1990-9772
 Soufifar Mehdi, Cumani Sandro, Burget Lukás, Cernocký Jan: Discriminative Classifiers for Phonotactic Language Recognition with iVectors, In: Proc. International Conference on Acoustics, Speech, and Signal Processing 2012, Kyoto, JP, IEEESP, 2012, p. 4853-4856, ISBN 978-1-4673-0044-5
 Szoke Igor, Fapso Michal, Veselý Karel: BUT2012 Approaches for Spoken Web Search - MediaEval 2012, In: CEUR Workshop Proceedings, Vol. 2012, No. 927, DE, p. 1-2, ISSN 1613-0073
 Szoke Igor, Fapso Michal, Zizka Josef, Beran Vítezslav, Cernocký Jan: Efektivní prístup ke znalostem v audio-vizuálních záznamech, In: Proceedings of the Annual Database Conference, Praha, CZ, TU v Kosiciach, 2012, p. 57-74, ISBN 978-80-553-1049-7
 Veselý Karel, Karafiát Martin, Grézl Frantisek, Janda Milos, Egorova Ekaterina: The Language-Independent Bottleneck Features, In: Proceedings of IEEE 2012 Workshop on Spoken Language Technology, Miami, US, IEEESP, 2012, p. 336-341, ISBN 978-1-4673-5124-9
2011 Deoras Anoop, Mikolov Tomás, Church Kenneth: A Fast Re-scoring Strategy to Capture Long-Distance Dependencies, In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing July 2011 Edinburgh, Scotland, UK, Edinburgh, GB, ACL, 2011, p. 1116-1127, ISBN 978-1-937284-11-4
 Grézl Frantisek, Karafiát Martin: Integrating recent MLP feature extraction techniques into TRAP architecture, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 1229-1232, ISBN 978-1-61839-270-1, ISSN 1990-9772
 Grézl Frantisek: The Role of Neural Network Size in TRAP/HATS Feature Extraction, In: Proceedings Text, Speech and Dialogue 2011, Plzen, CZ, Springer, 2011, p. 315-322, ISBN 978-3-642-23537-5, ISSN 0302-9743
 Karafiát Martin, Burget Lukás, Matejka Pavel, Glembek Ondrej, Cernocký Jan: iVector-Based Discriminative Adaptation for Automatic Speech Recognition, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 152-157, ISBN 978-1-4673-0366-8
 Kombrink Stefan, Mikolov Tomás, Karafiát Martin, Burget Lukás: Recurrent Neural Network based Language Modeling in Meeting Recognition, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 2877-2880, ISBN 978-1-61839-270-1, ISSN 1990-9772
 Mikolov Tomás, Deoras Anoop, Kombrink Stefan, Burget Lukás, Cernocký Jan: Empirical Evaluation and Combination of Advanced Language Modeling Techniques, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 605-608, ISBN 978-1-61839-270-1, ISSN 1990-9772
 Mikolov Tomás, Deoras Anoop, Povey Daniel, Burget Lukás, Cernocký Jan: Strategies for Training Large Scale Neural Network Language Models, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 196-201, ISBN 978-1-4673-0366-8
 Mikolov Tomás, Kombrink Stefan, Deoras Anoop, Burget Lukás, Cernocký Jan: RNNLM - Recurrent Neural Network Language Modeling Toolkit, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 1-4, ISBN 978-1-4673-0366-8
 Veselý Karel, Karafiát Martin, Grézl Frantisek: Convolutive Bottleneck Network Features for LVCSR, In: Proceedings of ASRU 2011, Big Island, Hawaii, US, IEEESP, 2011, p. 42-47, ISBN 978-1-4673-0366-8