|
| Research leader: | Zemčík Pavel |
| Team leaders: | Burget Lukáš, Černocký Jan |
| Agency: | EU-6FP-IST |
| Code: | IST-033812-AMIDA |
| Start: | 2006 |
| End: | 2009 |
| Keywords: | speech recognition, video processing, teleconference |
|
| Annotation: |
AMIDA will develop and expand the research vision that we initiated in the previous (still ongoing)
EU-IST AMI Integrated Project: to better understand human communication and to build new support for it.
The ground-breaking research that we shall undertake in AMIDA will span several traditionally
separate disciplines, including:
- Qualitative human analysis and human factors;
- Audio-video processing, including unconstrained speech recognition and natural scene analysis;
- Multimodal structure and content analysis, including the modelling of individuals and groups,
through the joint processing of multiple (multimodal) information channels (audio, visual,
slides, handwriting, and white board activity);
- HCI, application prototyping, evaluation, and system integration.
The AMIDA research work will directly build upon the recognized achievements and large multimodal
corpora (becoming a standard reference in the area of multimodal processing) resulting from
AMI. However, there will also be a very challenging shift in emphasis to live meetings with remote
participants, using affordable commodity sensors (such as webcams and cheaper microphones), and
targeting the development of advanced videoconferencing systems featuring new functionalities such
as (1) filtering, searching and browsing; (2) remote monitoring; (3) interactive accelerated playback;
(4) meeting support; and (5) shared context and presence.
While addressing additional scientific challenges (such as real-time processing and processing of
lower quality audio and visual signals), AMIDA will also raise the potential for exploitation and
transfer through genuine integration of the AMIDA industrial partners, who collaborate on common
prototypes and applications. Finally, through its "Community of Interest" (CoI), AMIDA will also
actively engage beyond the consortium to spread awareness and knowledge. |
Products
|
Preceding projects
| 2004 | Augmented Multi-party Interaction, EU-6FP-IST, 506811-AMI, 2004-2006, completed; Research leader: Heřmanský Hynek; Team leaders: Burget Lukáš, Černocký Jan, Grézl František, Kadlec Jaroslav, Karafiát Martin, Matějka Pavel, Motlíček Petr, Pečiva Jan, Potúček Igor, Schwarz Petr, Sumec Stanislav, Španěl Michal, Zemčík Pavel |
Publications
| 2010 | Beran, V., Herout, A., Zemčík, P.: On-line Video Synchronization Based on Visual Vocabularies, In: Proceedings of WSCG'10, Plzeň, CZ, ZČU v Plzni, 2010, p. 7, ISBN 978-80-86943-86-2 |
| | Hain, T., Burget, L., Dines, J., Garner, P. N., El Hannani, A., Huijbregts, M., Karafiát, M., Lincoln, M., Wan, V.: The AMIDA 2009 Meeting Transcription System, In: Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), Makuhari, Chiba, JP, ISCA, 2010, p. 358-361, ISBN 978-1-61782-123-3, ISSN 1990-9772 |
| | Rose, R., Norouzian, A., Reddy, A., Coy, A., Gupta, V., Karafiát, M.: Subword-based spoken term detection in audio course lectures, In: Proc. International Conference on Acoustics, Speech, and Signal Processing, Dallas, US, IEEESP, 2010, p. 5282-5285, ISBN 978-1-4244-4296-6, ISSN 1520-6149 |
| | Santhosh, K., C., P., Li, H., Tong, R., Matějka, P., Burget, L., Černocký, J.: Tuning phone decoders for language identification, In: Proc. International Conference on Acoustics, Speech, and Signal Processing 2010, Dallas, US, IEEESP, 2010, p. 5010-5013, ISBN 978-1-4244-4296-6, ISSN 1520-6149 |
| 2009 | Beran, V., Juránek, R., Mlích, J., Žák, P., Herout, A., Zemčík, P.: On-Line Object Behaviour Analysis for Surveillance Systems, In: 10th Annual ICT Conference, Nairobi, 2009, p. 5 |
| | Brümmer, N., Burget, L., Glembek, O., Hubeika, V., Jančík, Z., Karafiát, M., Matějka, P., Mikolov, T., Plchot, O., Strasheim, A.: BUT-AGNITIO System Description for NIST Language Recognition Evaluation 2009, In: Proceedings NIST 2009 Language Recognition Evaluation Workshop, Baltimore, Maryland, US, NIST, 2009, p. 1-7 |
| | Burget, L., Fapšo, M., Hubeika, V., Glembek, O., Karafiát, M., Kockmann, M., Matějka, P., Schwarz, P., Černocký, J.: BUT system for NIST 2008 speaker recognition evaluation, In: Proc. Interspeech 2009, Brighton, GB, ISCA, 2009, p. 2335-2338, ISSN 1990-9772 |
| | Burget, L., Matějka, P., Hubeika, V., Černocký, J.: Investigation into variants of Joint Factor Analysis for speaker recognition, In: Proc. Interspeech 2009, Brighton, GB, ISCA, 2009, p. 1263-1266, ISSN 1990-9772 |
| | Garner, P. N., Dines, J., Hain, T., El Hannani, A., Karafiát, M., Korchagin, D., Lincoln, M., Wan, V., Zhang, L.: Real-Time ASR from Meetings, In: Proc. Interspeech 2009, Brighton, GB, ISCA, 2009, p. 2119-2122, ISSN 1990-9772 |
| | Glembek, O., Burget, L., Dehak, N., Brümmer, N., Kenny, P.: Comparison of Scoring Methods used in Speaker Recognition with Joint Factor Analysis, In: Proc. ICASSP 2009, Taipei, TW, IEEESP, 2009, p. 4, ISBN 978-1-4244-2354-5 |
| | Grézl, F., Karafiát, M., Burget, L.: Investigation into bottle-neck features for meeting speech recognition, In: Proc. Interspeech 2009, Brighton, GB, ISCA, 2009, p. 2947-2950, ISBN 978-1-61567-692-7, ISSN 1990-9772 |
| | Chmelař, P., Beran, V., Herout, A., Hradiš, M., Řezníček, I., Zemčík, P.: Brno University of Technology at TRECVid 2009, In: TRECVID 2009: Participant Notebook Papers and Slides, Gaithersburg, MD, US, NIST, 2009, p. 11 |
| | Karafiát, M.: Study of linear transformations applied to training of cross-domain adapted large vocabulary continuous speech recognition systems, Brno, CZ, 2009, p. 73 |
| | Kockmann, M., Burget, L., Černocký, J.: Brno University of Technology System for Interspeech 2009 Emotion Challenge, In: Proc. Interspeech 2009, Brighton, GB, ISCA, 2009, p. 348-351, ISSN 1990-9772 |
| | Kombrink, S., Burget, L., Matějka, P., Karafiát, M., Heřmanský, H.: Posterior-based Out of Vocabulary Word Detection in Telephone Speech, In: Proc. Interspeech 2009, Brighton, GB, ISCA, 2009, p. 80-83, ISSN 1990-9772 |
| | Mlích, J., Zemčík, P., Jiřík, L.: Trajectory classification using HMMs, In: WSCG 2009 Communication Papers, Plzeň, CZ, ZČU v Plzni, 2009, p. 67-72, ISBN 978-80-86943-94-7 |
| | Mlích, J.: Wiimote Gesture Recognition, In: Proceedings of the 15th Conference and Competition STUDENT EEICT 2009 Volume 4, Brno, CZ, FEKT VUT, 2009, p. 344-349, ISBN 978-80-214-3870-5 |
| | Nijholt, A., Zwiers, J., Pečiva, J.: Mixed reality participants in smart meeting rooms and smart home environments, In: Personal and Ubiquitous Computing, Vol. 2009, No. 1, London, GB, p. 85-94, ISSN 1617-4909 |
| 2008 | Burget, L., Fapšo, M., Hubeika, V., Glembek, O., Karafiát, M., Kockmann, M., Matějka, P., Schwarz, P., Černocký, J.: BUT system description: NIST SRE 2008, In: Proc. 2008 NIST Speaker Recognition Evaluation Workshop, Montreal, CA, NIST, 2008, p. 1-4 |
| | Burget, L., Schwarz, P., Matějka, P., Hannemann, M., Rastrow, A., White, C., Khudanpur, S., Heřmanský, H., Černocký, J.: Combination of strongly and weakly constrained recognizers for reliable detection of OOVs, In: Proc. International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, US, IEEESP, 2008, p. 4, ISBN 1-4244-1484-9 |
| | Glembek, O., Matějka, P., Burget, L., Mikolov, T.: Advances in Phonotactic Language Recognition, In: Proc. Interspeech 2008, Brisbane, AU, ISCA, 2008, p. 4, ISSN 1990-9772 |
| | Herout, A., Beran, V., Hradiš, M., Potúček, I., Zemčík, P., Chmelař, P.: TRECVID 2007 by the Brno Group, In: Proceedings of TRECVID 2007, Gaithersburg, US, NIST, 2008, p. 1-6, ISBN 978-1-59593-780-3 |
| | Herout, A., Kubíček, R., Zemčík, P., Žák, P.: Automatic Video Editing for Multimodal Meetings, In: Proceedings of International Conference on Computer Vision and Graphics 2008, Heidelberg, DE, Springer, 2008, p. 1-12, ISSN 0302-9743 |
| | Hubeika, V., Burget, L., Matějka, P., Schwarz, P.: Discriminative Training and Channel Compensation for Acoustic Language Recognition, In: Proc. Interspeech 2008, Brisbane, AU, ISCA, 2008, p. 4, ISSN 1990-9772 |
| | Chmelař, P., Beran, V., Herout, A., Hradiš, M., Juránek, R., Láník, A., Mlích, J., Navrátil, J., Řezníček, I., Žák, P., Zemčík, P.: Brno University of Technology at TRECVid 2008, In: Proceedings of TRECVID 2008, Gaithersburg, US, NIST, 2008, p. 1-16 |
| | Karafiát, M., Burget, L., Hain, T., Černocký, J.: Discriminative training of narrow band - wide band adapted systems for meeting recognition, In: Proc. Interspeech 2008, Brisbane, AU, ISCA, 2008, p. 4, ISSN 1990-9772 |
| | Kockmann, M., Burget, L.: Contour modeling of prosodic and acoustic features for speaker recognition, In: Proc. 2008 IEEE Workshop on Spoken Language Technology, Goa, IN, IEEESP, 2008, p. 4, ISBN 978-1-4244-3472-5 |
| | Kockmann, M., Burget, L.: Syllable based Feature-Contours for Speaker Recognition, In: Proc. 14th International Workshop on Advances in Speech Technology, Maribor, SI, 2008, p. 4 |
| | Matějka, P., Burget, L., Glembek, O., Schwarz, P., Hubeika, V., Fapšo, M., Mikolov, T., Plchot, O., Černocký, J.: BUT language recognition system for NIST 2007 evaluations, In: Proc. Interspeech 2008, Brisbane, AU, ISCA, 2008, p. 4, ISSN 1990-9772 |
| | Plchot, O., Hubeika, V., Burget, L., Schwarz, P., Matějka, P.: Acquisition of Telephone Data from Radio Broadcasts with Applications to Language Recognition, In: Proc. 11th International Conference on Text, Speech and Dialogue, Berlin, DE, Springer, 2008, p. 477-483, ISBN 978-3-540-87390-7 |
| | Szőke, I., Burget, L., Černocký, J., Fapšo, M.: Sub-word modeling of out of vocabulary words in spoken term detection, In: Proc. 2008 IEEE Workshop on Spoken Language Technology, Goa, IN, IEEESP, 2008, p. 4, ISBN 978-1-4244-3472-5 |
| | Szőke, I., Fapšo, M., Burget, L., Černocký, J.: Hybrid word-subword decoding for spoken term detection, In: Proc. SSCS 2008: Speech search workshop at SIGIR, Singapore, SG, ACM, 2008, p. 4, ISBN 978-90-365-2697-5 |
| 2007 | Brümmer, N., Burget, L., Černocký, J., Glembek, O., Grézl, F., Karafiát, M., van Leeuwen, D., Matějka, P., Schwarz, P., Strasheim, A.: Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006, In: IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, No. 7, 2007, US, p. 2072-2084, ISSN 1558-7916 |
| | Burget, L., Matějka, P., Schwarz, P., Glembek, O., Černocký, J.: Analysis of feature extraction and channel compensation in GMM speaker recognition system, In: IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, No. 7, 2007, US, p. 1979-1986, ISSN 1558-7916 |
| | Černocký, J., Burget, L., Schwarz, P., Matějka, P., Karafiát, M., Glembek, O., Kopecký, J., Szőke, I., Fapšo, M., Grézl, F., Hubeika, V., Oparin, I.: Search in speech, language identification and speaker recognition in Speech@FIT, In: Proc. 17th International Conference Radioelektronika, 2007, Brno, CZ, UREL FEKT VUT, 2007, p. 1-6, ISBN 978-80-214-3390-8 |
| | Černocký, J., Szőke, I., Fapšo, M., Karafiát, M., Burget, L., Kopecký, J., Grézl, F., Schwarz, P., Glembek, O., Oparin, I., Smrž, P., Matějka, P.: Search in speech for public security and defense, In: Proc. IEEE Workshop on Signal Processing Applications for Public Security and Forensics, 2007 (SAFE '07), Washington D.C., US, IEEESP, 2007, p. 1-7, ISBN 1-4244-1226-9 |
| | Fapšo, M.: Search in speech records, In: Proc. 13th Conference STUDENT EEICT 2007, Brno, CZ, FEKT VUT, 2007, p. 1-3, ISBN 978-80-214-3410-3 |
| | Granát, J., Herout, A., Hradiš, M., Zemčík, P.: Hardware Acceleration of AdaBoost Classifier, In: Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), Brno, CZ, 2007, p. 1-12 |
| | Grézl, F., Karafiát, M., Černocký, J.: Neural network topologies and bottle neck features in speech recognition, Brno, CZ, 2007, p. 5 |
| | Grézl, F., Karafiát, M., Kontár, S., Černocký, J.: Probabilistic and bottle-neck features for LVCSR of meetings, In: Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), Honolulu, US, IEEESP, 2007, p. 757-760, ISBN 1-4244-0728-1 |
| | Hain, T., Wan, V., Burget, L., Karafiát, M., Dines, J., Vepa, J., Garau, G., Lincoln, M.: The AMI System for the Transcription of Speech in Meetings, In: Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), Honolulu, US, IEEESP, 2007, p. 357-360, ISBN 1-4244-0728-1 |
| | Hubeika, V., Burget, L., Matějka, P., Černocký, J.: Channel Compensation for Speaker Recognition, Brno, CZ, 2007, p. 1-1 |
| | Hubeika, V., Szőke, I., Burget, L., Černocký, J.: Maximum Likelihood and Maximum Mutual Information Training in Gender and Age Recognition System, In: Proc. 10th International Conference on Text Speech and Dialogue (TSD 2007), Pilsen, CZ, Springer, 2007, p. 1-6, ISBN 978-3-540-74627-0 |
| | Karafiát, M., Burget, L., Černocký, J., Hain, T.: Real-Time ASR from Meetings, In: Proc. INTERSPEECH 2007, Antwerpen, BE, ISCA, 2007, p. 4, ISSN 1990-9772 |
| | Matějka, P., Burget, L., Glembek, O., Schwarz, P., Hubeika, V., Fapšo, M., Mikolov, T., Plchot, O.: BUT system description for NIST LRE 2007, In: Proc. 2007 NIST Language Recognition Evaluation Workshop, Orlando, US, NIST, 2007, p. 1-5 |
| | Matějka, P., Burget, L., Schwarz, P., Glembek, O., Karafiát, M., Grézl, F., Černocký, J., van Leeuwen, D., Brümmer, N., Strasheim, A.: STBU system for the NIST 2006 speaker recognition evaluation, In: Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), Honolulu, US, IEEESP, 2007, p. 221-224, ISBN 1-4244-0728-1 |
| | Potúček, I., Beran, V., Sumec, S., Zemčík, P.: Evaluation and comparison of tracking methods using meeting omnidirectional images, In: Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), Brno, CZ, 2007, p. 12 |
| | Szőke, I., Burget, L., Karafiát, M.: Combination of Word and Phoneme Approach for Spoken Term Detection, Brno, CZ, 2007, p. 1-1 |
| | Szőke, I., Fapšo, M., Karafiát, M., Burget, L., Grézl, F., Schwarz, P., Glembek, O., Matějka, P., Kopecký, J., Černocký, J.: Spoken Term Detection System Based on a Combination of LVCSR and Phonetic Search, Brno, CZ, 2007, p. 1-1 |
|
|