Conference proceedings paper

BERAN Vítězslav, HRADIŠ Michal, OTRUSINA Lubomír and ŘEZNÍČEK Ivo. Brno University of Technology at TRECVid 2011. In: TRECVID 2011: Participant Notebook Papers and Slides. Gaithersburg, MD: United States Department of Commerce, National Institute of Standards and Technology, 2011, p. 10.
Publication language: English
Publication title: Brno University of Technology at TRECVid 2011
Pages: 10
Proceedings: TRECVID 2011: Participant Notebook Papers and Slides
Conference: 2011 TRECVID Workshop
Place of publication: Gaithersburg, MD, US
Year: 2011
Publisher: United States Department of Commerce, National Institute of Standards and Technology
URL: http://www-nlpir.nist.gov/projects/tvpubs/tv11.papers/brno.pdf [PDF]
Keywords
TRECVID, semantic indexing, Content-based Copy Detection, image classification
Abstract
This paper describes our approach to semantic indexing and content-based copy detection, which was used for the TRECVID 2011 evaluation.

Semantic indexing
1. The runs differ in the types of features used. All runs use several bag-of-words representations, each fed to a separate linear SVM; the SVM outputs were fused by logistic regression. Visual and audio features were used, as well as metadata. We also added contextual features extracted from the video from which a shot originated.
  • F_A_brno.run1 (run1) - Visual information only: dense sampling and Harris-Laplace detector with SIFT and RGB-SIFT descriptors.
  • F_A_brno.run2 (run2) - The same as run1, with added audio and metadata features.
  • F_A_brno.run3 (run3) - The same as in run2 with added contextual features extracted from the whole video.

2. Audio and metadata significantly improve the results. An even greater improvement was achieved by using the contextual features.
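
The late-fusion step described above (one linear SVM per feature type, with the SVM outputs combined by logistic regression) can be illustrated with a minimal sketch. This is not the authors' implementation: the SVM decision values are made-up toy numbers, the column order (visual, audio, metadata) is an assumption, and the names `fit_fusion` and `fuse` are hypothetical. Only the fusion stage is shown, fitted with plain batch gradient descent.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_fusion(scores, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression fusion weights by batch gradient descent.

    scores: one vector per shot, holding one SVM decision value per feature type.
    labels: 0/1 concept labels for the same shots.
    """
    n, d = len(scores), len(scores[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(scores, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fuse(w, b, x):
    # Fused probability that the concept is present in the shot.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy SVM decision values (invented for illustration).
# Assumed column order: visual, audio, metadata.
train_scores = [
    [1.2, 0.3, 0.5],
    [0.8, 0.9, -0.1],
    [-0.7, -0.2, 0.1],
    [-1.1, 0.4, -0.6],
    [0.5, 0.6, 0.7],
    [-0.3, -0.8, -0.4],
]
train_labels = [1, 1, 0, 0, 1, 0]

w, b = fit_fusion(train_scores, train_labels)
p_pos = fuse(w, b, [1.0, 0.5, 0.4])
p_neg = fuse(w, b, [-1.0, -0.5, -0.4])
```

Adding a feature type, as run2 and run3 do, corresponds in this sketch to adding one more column of SVM scores before the fusion weights are fitted.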

 

Content-based Copy Detection
1. One run was submitted in two versions that differ only in the relevance threshold setting.
  • brnoccd: SIFT and SURF combination, bag-of-words (visual codebook: 100k size, 4 nearest neighbors used in soft-assignment), inverted file index, geometry (homography) based image similarity metric

2. What if any significant differences (in terms of what measures) did you find among the runs?

  • Only one setting was used, so there are no differences.

3. Based on the results, can you estimate the relative contribution of each component of your system/approach to its effectiveness?

  • Search in the reference dataset was slow due to poor indexing effectiveness.

4. Overall, what did you learn about runs/approaches and the research question(s) that motivated them?

  • The way video content is described needs to change: a frame-based (or key-frame-based) approach is not sufficient.
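
The bag-of-words stage of the brnoccd run (soft assignment of each descriptor to its 4 nearest codewords, followed by lookup in an inverted file index) can be sketched as follows. This is an illustrative toy, not the system itself: the 2-D descriptors, the 8-word codebook, the Gaussian weighting with `sigma`, and all function names are assumptions, and the homography-based geometric similarity is not modelled.

```python
import math
from collections import defaultdict


def soft_assign(descriptor, codebook, k=4, sigma=1.0):
    """Return {codeword_id: weight} for the k nearest codewords (weights sum to 1)."""
    nearest = sorted(
        (math.dist(descriptor, word), idx) for idx, word in enumerate(codebook)
    )[:k]
    # Gaussian weighting of distances is an assumption made for this sketch.
    weights = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d, _ in nearest]
    total = sum(weights)
    return {idx: wt / total for (_, idx), wt in zip(nearest, weights)}


def bow_vector(descriptors, codebook, k=4):
    # Accumulate soft-assignment weights into a sparse histogram.
    hist = defaultdict(float)
    for desc in descriptors:
        for idx, wt in soft_assign(desc, codebook, k).items():
            hist[idx] += wt
    return dict(hist)


def build_inverted_index(frames):
    """Map each codeword id to the list of (frame_id, weight) pairs that contain it."""
    index = defaultdict(list)
    for frame_id, hist in frames.items():
        for idx, wt in hist.items():
            index[idx].append((frame_id, wt))
    return index


def query(index, hist):
    # Score only frames that share at least one codeword with the query.
    scores = defaultdict(float)
    for idx, q_wt in hist.items():
        for frame_id, wt in index.get(idx, []):
            scores[frame_id] += q_wt * wt
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy codebook with two well-separated clusters of codewords.
codebook = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5), (6, 5), (5, 6), (6, 6)]
frames = {
    "ref_a": bow_vector([(0.1, 0.0), (0.9, 0.1)], codebook),
    "ref_b": bow_vector([(5.1, 5.0), (6.0, 5.9)], codebook),
}
index = build_inverted_index(frames)
ranked = query(index, bow_vector([(0.0, 0.1), (1.0, 0.2)], codebook))
assignment = soft_assign((0.1, 0.0), codebook)
```

In this sketch only frames sharing a codeword with the query are ever touched, which is the purpose of an inverted file; a geometric check such as homography estimation would then be applied to the top-ranked candidates.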
BibTeX:
@INPROCEEDINGS{
   author = {V{\'{i}}t{\v{e}}zslav Beran and Michal
	Hradi{\v{s}} and Lubom{\'{i}}r Otrusina and Ivo
	{\v{R}}ezn{\'{i}}{\v{c}}ek},
   title = {Brno University of Technology at TRECVid 2011},
   pages = {10},
   booktitle = {TRECVID 2011: Participant Notebook Papers and Slides},
   year = {2011},
   location = {Gaithersburg, MD, US},
   publisher = {National Institute of Standards and Technology},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php.cs?id=9841}
}
