Doc. Ing. Lukáš Burget, Ph.D.

MOTLÍČEK Petr, BURGET Lukáš and ČERNOCKÝ Jan. Non-parametric Speaker Turn Segmentation of Meeting Data. In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology. Lisbon: International Speech Communication Association, 2005, pp. 657-660. ISSN 1018-4074.
Publication language:English
Original title:Non-parametric Speaker Turn Segmentation of Meeting Data
Title (cs):Neparametrická detekce mluvčího meetingových dat
Pages:657-660
Proceedings:Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology
Conference:Eurospeech 2005 - Lisboa 9th European conference on speech communication and technology
Place:Lisbon, PT
Year:2005
Journal:European Speech Communication, Vol. 2005, No. 9, CZ
ISSN:1018-4074
Publisher:International Speech Communication Association
URL:http://www.fit.vutbr.cz/~motlicek/publi/2005/eurospeech_2005.pdf [PDF]
Keywords
speech processing, feature extraction, speaker detection, meeting data
Annotation
This paper describes non-parametric speaker turn segmentation of meeting data.
Abstract
An extension of the conventional speaker segmentation framework is presented for a scenario in which a number of microphones record the activity of speakers present at a meeting (one microphone per speaker). Although each microphone can receive speech from both the participant wearing the microphone (local speech) and other participants (cross-talk), the recorded audio can be broadly classified into three classes: local speech, cross-talk, and silence. This paper proposes a technique that uses cross-correlations, the values of their maxima, and energy differences as features to identify and segment speaker turns. In particular, we have used classical cross-correlation functions, time smoothing and, in part, temporal constraints to sharpen and disambiguate timing differences between microphone channels that may be dominated by noise and reverberation. Experimental results show that the proposed technique can be successfully used for speaker segmentation of data collected from a number of different setups.
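As a rough illustration of the features named in the abstract, the Python sketch below computes, for one frame from two microphone channels, the maximum of the normalized cross-correlation, the lag at which it occurs, and the log energy difference. The function name frame_features, the max_lag parameter, the normalization, and the synthetic signals are assumptions made for illustration and are not taken from the paper. The time smoothing and temporal constraints mentioned in the abstract are omitted.

```python
import math
import random


def frame_features(local, other, max_lag):
    """Illustrative per-frame features for two equal-length channel frames.

    Returns (peak, lag, log_energy_diff):
      peak            - maximum of the normalized cross-correlation
      lag             - lag (in samples) at which that maximum occurs
      log_energy_diff - energy of `local` relative to `other`, in dB
    """
    n = len(local)
    e_local = sum(x * x for x in local)
    e_other = sum(x * x for x in other)
    norm = math.sqrt(e_local * e_other)

    best_peak, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Correlate local[t] with other[t + lag] over the overlapping samples.
        acc = 0.0
        for t in range(max(0, -lag), min(n, n - lag)):
            acc += local[t] * other[t + lag]
        value = acc / norm if norm > 0 else 0.0
        if value > best_peak:
            best_peak, best_lag = value, lag

    eps = 1e-12  # guards against log of zero on silent frames
    log_energy_diff = 10.0 * math.log10((e_local + eps) / (e_other + eps))
    return best_peak, best_lag, log_energy_diff


# Synthetic example (assumption): the second channel picks up the first
# speaker as cross-talk, attenuated by half and delayed by 3 samples.
rng = random.Random(0)
local = [rng.gauss(0.0, 1.0) for _ in range(200)]
delay = 3
other = [0.0] * delay + [0.5 * x for x in local[:-delay]]

peak, lag, diff_db = frame_features(local, other, max_lag=10)
```

In this toy setup a high correlation peak at a positive lag together with a positive energy difference would suggest that the first channel carries the local speech and the second carries cross-talk. How such features are actually combined into a segmentation decision is described in the paper itself, not here.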
BibTeX:
@INPROCEEDINGS{
   author = {Petr Motl{\'{i}}{\v{c}}ek and Luk{\'{a}}{\v{s}} Burget and
	Jan {\v{C}}ernock{\'{y}}},
   title = {Non-parametric Speaker Turn Segmentation of Meeting Data},
   pages = {657--660},
   booktitle = {Interspeech'2005 - Eurospeech - 9th European Conference on
	Speech Communication and Technology},
   journal = {European Speech Communication},
   volume = {2005},
   number = {9},
   year = {2005},
   location = {Lisbon, PT},
   publisher = {International Speech Communication Association},
   ISSN = {1018-4074},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=7978}
}
