Publication Details

The Development of the AMI System for the Transcription of Speech in Meetings

HAIN Thomas, KARAFIÁT Martin, DINES John, MCCOWAN Iain, LINCOLN Mike, GARAU Giulia, WAN Vincent, ORDELMAN Roeland and RENALS Steve. The Development of the AMI System for the Transcription of Speech in Meetings. In: Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers. Lecture Notes in Computer Science Volume 3869, Springer 2006. Edinburgh: University of Edinburgh, 2005, pp. 344-356. ISBN 978-3-540-32549-9.
Czech title
Vývoj AMI systému pro transkripci řečových meetingů
Type
conference paper
Language
English
Authors
Hain Thomas (USF)
Karafiát Martin, Ing., Ph.D. (DCGM FIT BUT)
Dines John (IDIAP)
McCowan Iain (IDIAP)
Lincoln Mike (IDIAP)
Garau Giulia (UEDIN)
Wan Vincent (USF)
Ordelman Roeland (UTWENTE)
Renals Steve (UEDIN)
Keywords

speech recognition, LVCSR, speech processing, signal processing, HMM, language modeling, meeting transcriptions

Abstract

This paper describes the development of a baseline automatic speech transcription system for meetings in the context of the AMI (Augmented Multiparty Interaction) project.

Annotation

The automatic processing of speech collected in conference-style meetings has attracted considerable interest, with several large-scale projects devoted to this area. This paper describes the development of a baseline automatic speech transcription system for meetings in the context of the AMI (Augmented Multiparty Interaction) project. We present several techniques important to the processing of this data and show the performance in terms of word error rates (WERs). An important aspect of transcription of this data is the necessary flexibility in terms of audio pre-processing. Real-world systems have to deal with flexible input, for example by using microphone arrays or randomly placed microphones in a room. Automatic segmentation and microphone array processing techniques are described and the effect on WERs is discussed. The system and its components presented in this paper yield competitive performance and form a baseline for future research in this domain.

Published
2005
Pages
344-356
Proceedings
Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers
Series
Lecture Notes in Computer Science Volume 3869, Springer 2006
Conference
2nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms, Edinburgh, GB
ISBN
978-3-540-32549-9
Publisher
University of Edinburgh
Place
Edinburgh, GB
BibTeX
@INPROCEEDINGS{FITPUB7930,
   author = "Thomas Hain and Martin Karafi\'{a}t and John Dines and Iain McCowan and Mike Lincoln and Giulia Garau and Vincent Wan and Roeland Ordelman and Steve Renals",
   title = "The Development of the AMI System for the Transcription of Speech in Meetings",
   pages = "344--356",
   booktitle = "Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers",
   series = "Lecture Notes in Computer Science Volume 3869, Springer 2006",
   year = 2005,
   location = "Edinburgh, GB",
   publisher = "University of Edinburgh",
   ISBN = "978-3-540-32549-9",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/7930"
}