Department of Computer Graphics and Multimedia
Theory and applications of phoneme posterior estimation in speech processing
| Research leader: | Grézl František |
| Team leaders: | Kopecký Jiří, Plchot Oldřich |
| Agency: | GAČR |
| Code: | GP102/09/P635 |
| Start: | 2009 |
| End: | 2011 |
| Keywords: | speech processing, speech recognition, phoneme recognition, probabilistic features |
| Annotation: |
| Estimating the posterior probabilities of discrete speech units (phonemes) is of significant importance in basic speech processing research. The estimates are used in feature extraction (posterior features), phonotactic models for language recognition, generation of phoneme lattices for keyword spotting, and other applications. The goal of this project is to create a fast and reliable system for estimating these posterior probabilities, so that the error rates of the target systems can be reduced. The project will deal with feature extraction, discriminative transforms, classifier architectures, and training techniques. Quality will be assessed mainly in international evaluations organized by the US National Institute of Standards and Technology (NIST). |
Publications
| 2012 | Hain Thomas, Burget Lukáš, Dines John, Garner Phillip N., Grézl František, El Hannani Asmaa, Huijbregts Marijn, Karafiát Martin, Lincoln Mike, Wan Vincent: Transcribing Meetings with the AMIDA System, In: IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, No. 2, 2012, US, p. 486-498, ISSN 1558-7916 |
| 2011 | Bořil Hynek, Grézl František, Hansen John H.: Front-End Compensation Methods for LVCSR Under Lombard Effect, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 1257-1260, ISBN 978-1-61839-270-1, ISSN 1990-9772 |
| | Grézl František, Karafiát Martin, Janda Miloš: Study of Probabilistic and Bottle-Neck Features in Multilingual Environment, In: Proceedings of ASRU 2011, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEESP, 2011, p. 359-364, ISBN 978-1-4673-0366-8 |
| | Grézl František, Karafiát Martin: Integrating recent MLP feature extraction techniques into TRAP architecture, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 1229-1232, ISBN 978-1-61839-270-1, ISSN 1990-9772 |
| | Grézl František: The Role of Neural Network Size in TRAP/HATS Feature Extraction, In: Proceedings of Text, Speech and Dialogue 2011, Plzeň, CZ, Springer, 2011, p. 315-322, ISBN 978-3-642-23537-5, ISSN 0302-9743 |
| | Kockmann Marcel, Ferrer Luciana, Burget Lukáš, Černocký Jan: iVector Fusion of Prosodic and Cepstral Features for Speaker Verification, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 265-268, ISBN 978-1-61839-270-1, ISSN 1990-9772 |
| | Kombrink Stefan, Mikolov Tomáš, Karafiát Martin, Burget Lukáš: Recurrent Neural Network based Language Modeling in Meeting Recognition, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 2877-2880, ISBN 978-1-61839-270-1, ISSN 1990-9772 |
| | Mikolov Tomáš, Deoras Anoop, Kombrink Stefan, Burget Lukáš, Černocký Jan: Empirical Evaluation and Combination of Advanced Language Modeling Techniques, In: Proceedings of Interspeech 2011, Florence, IT, ISCA, 2011, p. 605-608, ISBN 978-1-61839-270-1, ISSN 1990-9772 |
| | Veselý Karel, Karafiát Martin, Grézl František: Convolutive Bottleneck Network Features for LVCSR, In: Proceedings of ASRU 2011, Big Island, Hawaii, US, IEEESP, 2011, p. 42-47, ISBN 978-1-4673-0366-8 |
| 2010 | Grézl František, Karafiát Martin: Hierarchical Neural Net Architectures for Feature Extraction in ASR, In: Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), Makuhari, Chiba, JP, ISCA, 2010, p. 1201-1204, ISBN 978-1-61782-123-3, ISSN 1990-9772 |
| | Hain Thomas, Burget Lukáš, Dines John, Garner Phillip N., El Hannani Asmaa, Huijbregts Marijn, Karafiát Martin, Lincoln Mike, Wan Vincent: The AMIDA 2009 Meeting Transcription System, In: Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), Makuhari, Chiba, JP, ISCA, 2010, p. 358-361, ISBN 978-1-61782-123-3, ISSN 1990-9772 |
| | Szőke Igor, Grézl František, Černocký Jan, Fapšo Michal: Acoustic keyword spotter - optimization from end-user perspective, In: Proceedings of the 2010 IEEE Spoken Language Technology Workshop, Berkeley, California, US, IEEESP, 2010, p. 177-181, ISBN 978-1-4244-7902-3 |
| 2009 | Grézl František, Černocký Jan: Audio Surveillance through Known Event Classification, In: Radioengineering, Vol. 18, No. 4, 2009, CZ, p. 671-675, ISSN 1210-2512 |
| | Grézl František, Karafiát Martin, Burget Lukáš: Investigation into bottle-neck features for meeting speech recognition, In: Proc. Interspeech 2009, Brighton, GB, ISCA, 2009, p. 2947-2950, ISBN 978-1-61567-692-7, ISSN 1990-9772 |