Ing. Igor Szőke, Ph.D.

ANGUERA Xavier, RODRIGUEZ-FUENTES Luis J., SZŐKE Igor, BUZO Andi and METZE Florian et al. Query-by-example Spoken Term Detection Evaluation on Low-resource Languages. In: Proceedings of the 4th International Workshop on Spoken Language Technologies for Under-resourced Languages SLTU-2014. – St. Petersburg, Russia. St. Petersburg: International Speech Communication Association, 2014, pp. 24-31. ISBN 978-5-8088-0908-6.
Publication language: English
Original title: Query-by-example Spoken Term Detection Evaluation on Low-resource Languages
Title (cs): Evaluace vyhledávání v řeči pomocí zadávání vyslovených příkladů na jazycích s omezenými zdroji
Pages: 24-31
Proceedings: Proceedings of the 4th International Workshop on Spoken Language Technologies for Under-resourced Languages SLTU-2014. – St. Petersburg, Russia
Conference: The 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU'14)
Place: St. Petersburg, RU
Year: 2014
ISBN: 978-5-8088-0908-6
Publisher: International Speech Communication Association
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2014/anguera_sltu2014_p24_31.pdf [PDF]
Keywords
benchmark evaluation, low-resource languages, query-by-example spoken term detection
Annotation
In this paper, besides presenting the setup, datasets and performance measures of the SWS 2013 evaluation, we have analyzed the results obtained by the submitted systems and presented a post-evaluation study in which the 10 best-performing systems were incrementally fused (at the score level), obtaining a 30% relative improvement over the best-performing individual system and demonstrating the benefits of combining independent or complementary sources of information or different modeling approaches. Given the increasing interest in this task in the community, we are already planning a new edition of the SWS evaluation, renamed QUESST (Query by Example Spoken Search Task), within the MediaEval 2014 benchmark campaign. This year, we will continue tackling the problem of low-resource settings and will introduce a component of variability between queries and references, allowing a limited amount of acoustic insertions to still be considered matches.
Abstract
As part of the MediaEval 2013 benchmark evaluation campaign, the objective of the Spoken Web Search (SWS) task was to perform Query-by-Example Spoken Term Detection (QbE-STD), using spoken queries to retrieve matching segments in a set of audio files. As in previous editions, the SWS 2013 evaluation focused on the development of technology specifically designed to perform speech search in a low-resource setting. In this paper, we first describe the main features of past SWS evaluations and then focus on the 2013 SWS task, in which a special effort was made to prepare a challenging database, including speech in 9 different languages with diverse environment and channel conditions. The main novelties of the submitted systems are reviewed and performance figures are then presented and discussed, demonstrating the feasibility of the proposed task, even under such challenging conditions. Finally, the fusion of the 10 top-performing systems is analyzed. The best fusion provides a 30% relative improvement over the best single system in the evaluation, which proves that a variety of approaches can be effectively combined to bring complementary information in the search for queries.
BibTeX:
@INPROCEEDINGS{
   author = {Xavier Anguera and Luis J. Rodriguez-Fuentes and Igor
	Sz{\H{o}}ke and Andi Buzo and Florian Metze},
   title = {Query-by-example Spoken Term Detection Evaluation on
	Low-resource Languages},
   pages = {24--31},
   booktitle = {Proceedings of the 4th International Workshop on Spoken
	Language Technologies for Under-resourced Languages
	SLTU-2014. – St. Petersburg, Russia},
   year = {2014},
   location = {St. Petersburg, RU},
   publisher = {International Speech Communication Association},
   ISBN = {978-5-8088-0908-6},
   language = {English},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=10628}
}
