Conference paper

SZŐKE Igor, FAPŠO Michal, ŽIŽKA Josef, BERAN Vítězslav and ČERNOCKÝ Jan. Efektivní přístup ke znalostem v audio-vizuálních záznamech. In: Proceedings of the Annual Database Conference. Praha: The University of Technology Košice, 2012, pp. 57-74. ISBN 978-80-553-1049-7.
Publication language:Czech
Original title:Efektivní přístup ke znalostem v audio-vizuálních záznamech
Title (en):Effective access for information in audio-visual recordings
Pages:57-74
Proceedings:Proceedings of the Annual Database Conference
Conference:DATAKON 2012
Place:Praha, CZ
Year:2012
ISBN:978-80-553-1049-7
Publisher:The University of Technology Košice
URL:http://www.fit.vutbr.cz/~szoke/papers/datakon2012.pdf [PDF]
URL:http://www.fit.vutbr.cz/research/groups/speech/publi/2012/szoke_datakon2012_18%20pages.pdf [PDF]
Keywords
audiovisual recording, speech-to-text, image-to-text, indexing and search, web
Annotation
The amount of audiovisual data is growing. Part of this data, such as lecture and conference recordings, contains important information. However, this information is hidden and unreachable for standard web crawlers such as Google. This paper deals with a system that makes the information available to standard text-based indexers and search engines. It does so by converting speech and video into text. The first part of the paper describes the audiovisual indexing and search system. We briefly describe the speech-to-text and slide synchronization components. Next, we describe the indexing engine, which can index not only text but also the timing and probability of recognized speech. The second part addresses practical issues such as the user interface and customer feedback.
BibTeX:
@INPROCEEDINGS{
   author = {Igor Sz{\H{o}}ke and Michal Fap{\v{s}}o and Josef
	{\v{Z}}i{\v{z}}ka and V{\'{i}}t{\v{e}}zslav Beran and Jan
	{\v{C}}ernock{\'{y}}},
   title = {Efektivn{\'{i}} p{\v{r}}{\'{i}}stup ke znalostem v
	audio-vizu{\'{a}}ln{\'{i}}ch z{\'{a}}znamech},
   pages = {57--74},
   booktitle = {Proceedings of the Annual Database Conference},
   year = {2012},
   location = {Praha, CZ},
   publisher = {The University of Technology Ko{\v{s}}ice},
   ISBN = {978-80-553-1049-7},
   language = {czech},
   url = {http://www.fit.vutbr.cz/research/view_pub.php.en.iso-8859-2?id=10172}
}
