Publication Details

Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers

AL-HAMES Marc, HAIN Thomas, ČERNOCKÝ Jan, SCHREIBER Sascha, POEL Mannes, MÜLLER Ronald, MARCEL Sebastien, VAN LEEUWEN David, ODOBEZ Jean-Marc, BA Sileye, BOURLARD Herve, CARDINAUX Fabien, GATICA-PEREZ Daniel, JANIN Adam, MOTLÍČEK Petr, REITER Stephan, RENALS Steve, VAN REST Jeroen, RIENKS Rutger, RIGOLL Gerhard, SMITH Kevin, THEAN Andrew and ZEMČÍK Pavel. Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers. In: Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006). Washington D.C., 2006, p. 12.
Czech title
Audiovizuální zpracování meetingů - sedm otázek a odpovědí projektu AMI
Type
conference paper
Language
english
Authors
Al-Hames Marc (TUM)
Hain Thomas (USF)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
Schreiber Sascha (TUM)
Poel Mannes (UTWENTE)
Müller Ronald (TUM)
Marcel Sebastien (IDIAP)
van Leeuwen David (TNO TPD)
Odobez Jean-Marc (IDIAP)
Ba Sileye (IDIAP)
Bourlard Herve (IDIAP)
Cardinaux Fabien (IDIAP)
Gatica-Perez Daniel (IDIAP)
Janin Adam (ICSI Berkeley)
Motlíček Petr, doc. Ing., Ph.D. (DCGM FIT BUT)
Reiter Stephan (TUM)
Renals Steve (UEDIN)
van Rest Jeroen (TNO TPD)
Rienks Rutger (UTWENTE)
Rigoll Gerhard, Prof. Dr.-Ing. (TUM)
Smith Kevin (IDIAP)
Thean Andrew (TNO TPD)
Zemčík Pavel, prof. Dr. Ing. (DCGM FIT BUT)
Keywords

speech processing, video processing, multi-modal interaction

Abstract

The paper addresses audio-visual processing in meetings: it poses seven questions and presents the current answers developed within the AMI project.

Annotation

The project Augmented Multi-party Interaction (AMI) is concerned with the development of meeting browsers and remote meeting assistants for instrumented meeting rooms, together with the required component technologies. Its R&D themes include group dynamics; audio, visual, and multimodal processing; content abstraction; and human-computer interaction. The audio-visual processing work package within AMI addresses automatic recognition from audio, video, and combined audio-video streams recorded during meetings. In this article we describe the progress made in the first two years of the project. We show how the large problem of audio-visual processing in meetings can be split into seven questions, such as "Who is acting during the meeting?". We then show which algorithms and methods have been developed and evaluated to answer these questions automatically.

Published
2006
Pages
12
Proceedings
Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006)
Conference
3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms, Washington D.C., US
Place
Washington D.C., US
BibTeX
@INPROCEEDINGS{FITPUB8237,
   author = "Marc Al-Hames and Thomas Hain and Jan \v{C}ernock\'{y} and Sascha Schreiber and Mannes Poel and Ronald M{\"{u}}ller and Sebastien Marcel and David Leeuwen van and Jean-Marc Odobez and Sileye Ba and Herve Bourlard and Fabien Cardinaux and Daniel Gatica-Perez and Adam Janin and Petr Motl\'{i}\v{c}ek and Stephan Reiter and Steve Renals and Jeroen Rest van and Rutger Rienks and Gerhard Rigoll and Kevin Smith and Andrew Thean and Pavel Zem\v{c}\'{i}k",
   title = "Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers",
   pages = 12,
   booktitle = "Proc. 3nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006)",
   year = 2006,
   location = "Washington D.C., US",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/8237"
}