Department of Computer Graphics and Multimedia


Technologies of speech processing for efficient human-machine communication

Research leader: Černocký Jan
Team leaders: Hannemann Mirko, Heřmanský Hynek, Szőke Igor, Žižka Josef
Agency: TAČR
Code: TA01011328
Start: 2011
End: 2014
Keywords: speech recognition, electronic dictionaries, defense and security, mobile devices, dialogue systems, CRM, eLearning
Annotation:
The project aims to develop advanced speech recognition techniques and to deploy them in working applications: search in electronic dictionaries on mobile devices, dictation of translations, defense and security, dialogue systems, client-care systems (CRM, helpdesk, etc.), and audio-visual access to teaching materials.

Products

2012 KALDI speech recognition toolkit, software, 2012
Authors: Povey Daniel, Ghoshal Arnab, Boulianne Gilles, Burget Lukáš, Glembek Ondřej, Goel Nagendra K., Hannemann Mirko, Motlíček Petr, Qian Yanmin, Schwarz Petr, Silovský Jan, Stemmer Georg, Veselý Karel

Publications

2013 Egorova Ekaterina, Veselý Karel, Karafiát Martin, Janda Miloš, Černocký Jan: Manual and Semi-Automatic Approaches to Building a Multilingual Phoneme Set, In: Proceedings of ICASSP 2013, Vancouver, CA, IEEESP, 2013, p. 7324-7328, ISBN 978-1-4799-0355-9
 Lei Yun, Burget Lukáš, Scheffer Nicolas: A Noise Robust I-Vector Extractor Using Vector Taylor Series For Speaker Recognition, In: Proceedings of ICASSP 2013, Vancouver, CA, IEEESP, 2013, p. 6788-6791, ISBN 978-1-4799-0355-9
 Plchot Oldřich, Matsoukas Spyros, Matějka Pavel, Dehak Najim, Ma Jeff, Cumani Sandro, Glembek Ondřej, Heřmanský Hynek, Mesgarani Nima, Soufifar Mehdi, Thomas Samuel, Zhang Bing, Zhou Xinhui: Developing A Speaker Identification System For The DARPA RATS Project, In: Proceedings of ICASSP 2013, Vancouver, CA, IEEESP, 2013, p. 6768-6772, ISBN 978-1-4799-0355-9
2012 Cumani Sandro, Plchot Oldřich, Karafiát Martin: Independent Component Analysis and MLLR Transforms for Speaker Identification, In: Proc. International Conference on Acoustics, Speech, and Signal P, Kyoto, JP, IEEESP, 2012, p. 4365-4368, ISBN 978-1-4673-0044-5
 Deoras Anoop, Mikolov Tomáš, Kombrink Stefan, Church Kenneth: Approximate inference: A sampling based modeling technique to capture complex dependencies in a language model, In: Speech Communication, Vol. 2012, No. 8, Amsterdam, NL, p. 1-16, ISSN 0167-6393
 Karafiát Martin, Janda Miloš, Černocký Jan, Burget Lukáš: Region Dependent Linear Transforms in Multilingual Speech Recognition, In: Proc. International Conference on Acoustics, Speech, and Signal Processing 2012, Kyoto, JP, IEEESP, 2012, p. 4885-4888, ISBN 978-1-4673-0044-5
 Kombrink Stefan, Mikolov Tomáš, Karafiát Martin, Burget Lukáš: Improving Language Models for ASR Using Translated In-domain Data, In: Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, JP, IEEESP, 2012, p. 4405-4408, ISBN 978-1-4673-0044-5
 Povey Daniel, Hannemann Mirko, Boulianne Gilles, Burget Lukáš, Ghoshal Arnab, Janda Miloš, Karafiát Martin, Kombrink Stefan, Motlíček Petr, Qian Yanmin, Riedhammer Korbinian, Veselý Karel, Vu Ngoc Thang: Generating Exact Lattices in The WFST Framework, In: Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, JP, IEEESP, 2012, p. 4213-4216, ISBN 978-1-4673-0044-5
 Rath Shakti P., Karafiát Martin, Glembek Ondřej, Černocký Jan: A factorized representation of FMLLR transform based on QR-decomposition, In: Proceedings of Interspeech 2012, Portland, Oregon, US, ISCA, 2012, p. 1-4, ISBN 978-1-62276-759-5, ISSN 1990-9772
 Soufifar Mehdi, Cumani Sandro, Burget Lukáš, Černocký Jan: Discriminative Classifiers for Phonotactic Language Recognition with iVectors, In: Proc. International Conference on Acoustics, Speech, and Signal Processing 2012, Kyoto, JP, IEEESP, 2012, p. 4853-4856, ISBN 978-1-4673-0044-5
 Szőke Igor, Fapšo Michal, Veselý Karel: BUT2012 Approaches for Spoken Web Search - MediaEval 2012, In: CEUR Workshop Proceedings, Vol. 2012, No. 927, DE, p. 1-2, ISSN 1613-0073
 Szőke Igor, Fapšo Michal, Žižka Josef, Beran Vítězslav, Černocký Jan: Efektivní přístup ke znalostem v audio-vizuálních záznamech, In: Proceedings of the Annual Database Conference, Praha, CZ, TU v Košiciach, 2012, p. 57-74, ISBN 978-80-553-1049-7
 Veselý Karel, Karafiát Martin, Grézl František, Janda Miloš, Egorova Ekaterina: The Language-Independent Bottleneck Features, In: Proceedings of IEEE 2012 Workshop on Spoken Language Technology, Miami, US, IEEESP, 2012, p. 336-341, ISBN 978-1-4673-5124-9
2011 Deoras Anoop, Mikolov Tomáš, Church Kenneth: A Fast Re-scoring Strategy to Capture Long-Distance Dependencies, In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, July 2011, Edinburgh, Scotland, UK, Edinburgh, GB, ACL, 2011, p. 1116-1127, ISBN 978-1-937284-11-4
 Grézl František, Karafiát Martin: Integrating recent MLP feature extraction techniques into TRAP architecture, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 1229-1232, ISBN 978-1-61839-270-1, ISSN 1990-9772
 Grézl František: The Role of Neural Network Size in TRAP/HATS Feature Extraction, In: Proceedings Text, Speech and Dialogue 2011, Plzeň, CZ, Springer, 2011, p. 315-322, ISBN 978-3-642-23537-5, ISSN 0302-9743
 Karafiát Martin, Burget Lukáš, Matějka Pavel, Glembek Ondřej, Černocký Jan: iVector-Based Discriminative Adaptation for Automatic Speech Recognition, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 152-157, ISBN 978-1-4673-0366-8
 Kombrink Stefan, Mikolov Tomáš, Karafiát Martin, Burget Lukáš: Recurrent Neural Network based Language Modeling in Meeting Recognition, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 2877-2880, ISBN 978-1-61839-270-1, ISSN 1990-9772
 Mikolov Tomáš, Deoras Anoop, Kombrink Stefan, Burget Lukáš, Černocký Jan: Empirical Evaluation and Combination of Advanced Language Modeling Techniques, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 605-608, ISBN 978-1-61839-270-1, ISSN 1990-9772
 Mikolov Tomáš, Deoras Anoop, Povey Daniel, Burget Lukáš, Černocký Jan: Strategies for Training Large Scale Neural Network Language Models, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 196-201, ISBN 978-1-4673-0366-8
 Mikolov Tomáš, Kombrink Stefan, Deoras Anoop, Burget Lukáš, Černocký Jan: RNNLM - Recurrent Neural Network Language Modeling Toolkit, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 1-4, ISBN 978-1-4673-0366-8
 Veselý Karel, Karafiát Martin, Grézl František: Convolutive Bottleneck Network Features for LVCSR, In: Proceedings of ASRU 2011, Big Island, Hawaii, US, IEEESP, 2011, p. 42-47, ISBN 978-1-4673-0366-8
