Conference paper

BERAN Vítězslav, HRADIŠ Michal, OTRUSINA Lubomír and ŘEZNÍČEK Ivo. Brno University of Technology at TRECVid 2011. In: TRECVID 2011: Participant Notebook Papers and Slides. Gaithersburg, MD: National Institute of Standards and Technology, 2011, p. 10.
Publication language: English
Original title: Brno University of Technology at TRECVid 2011
Pages: 10
Proceedings: TRECVID 2011: Participant Notebook Papers and Slides
Conference: 2011 TRECVID Workshop
Place: Gaithersburg, MD, US
Year: 2011
Publisher: National Institute of Standards and Technology
URL: http://www-nlpir.nist.gov/projects/tvpubs/tv11.papers/brno.pdf [PDF]
Keywords
TRECVID, semantic indexing, Content-based Copy Detection, image classification
Annotation
This paper describes our approach to semantic indexing and content-based copy detection, which was used in the TRECVID 2011 evaluation.

Semantic indexing
1. The runs differ in the types of features used. All runs use several bag-of-words representations, each fed to a separate linear SVM, and the SVM outputs were fused by logistic regression (a minimal sketch of this fusion follows this list). Visual and audio features were used as well as metadata. We also added contextual features extracted from the whole video from which a shot originated.
  • F_A_brno.run1 (run1) - Visual information only: dense sampling and a Harris-Laplace detector with SIFT and RGB-SIFT descriptors.
  • F_A_brno.run2 (run2) - The same as run1 with audio and metadata features added.
  • F_A_brno.run3 (run3) - The same as run2 plus contextual features extracted from the whole video.

2. Audio and metadata significantly improve the results. An even greater improvement was achieved by using the contextual features.
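
The fusion described in point 1 can be sketched as follows, assuming scikit-learn and a hypothetical feature_sets layout (one matrix per bag-of-words feature type); this illustrates the late-fusion idea, not the exact system used in the runs.

import numpy as np
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression

def train_fused_classifier(feature_sets, labels):
    # feature_sets: list of (n_shots, dim_i) arrays, one per feature type.
    # One linear SVM per bag-of-words representation.
    svms = [LinearSVC().fit(X, labels) for X in feature_sets]
    scores = np.column_stack([svm.decision_function(X)
                              for svm, X in zip(svms, feature_sets)])
    # Fuse the per-SVM decision scores with logistic regression
    # (in practice the fusion stage would be fit on held-out scores).
    fusion = LogisticRegression().fit(scores, labels)
    return svms, fusion

def concept_confidence(svms, fusion, feature_sets):
    # Per-shot confidence that the semantic concept is present.
    scores = np.column_stack([svm.decision_function(X)
                              for svm, X in zip(svms, feature_sets)])
    return fusion.predict_proba(scores)[:, 1]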

 

Content-based Copy Detection
1. One run was submitted in two versions (differing only in the relevance threshold setting); a sketch of its building blocks follows this question list.
  • brnoccd: combination of SIFT and SURF, bag-of-words (visual codebook of 100k words, soft assignment to the 4 nearest words), inverted file index, geometry (homography) based image similarity metric

2. What, if any, significant differences (in terms of what measures) did you find among the runs?

  • Only one configuration was used, so no significant differences were found.

3. Based on the results, can you estimate the relative contribution of each component of your system/approach to its effectiveness?

  • Search in the reference dataset was slow due to poor indexing effectiveness.

4. Overall, what did you learn about runs/approaches and the research question(s) that motivated them?

  • The way of describing video content needs to change; a frame-based (or key-frame-based) approach is not sufficient.
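
The building blocks listed in point 1 can be sketched as follows, assuming scikit-learn for nearest-neighbour word lookup and OpenCV for the geometric check; the 100k-word codebook, the extracted SIFT/SURF descriptors, and the matched keypoint coordinates are taken as given, and all names here are illustrative.

import numpy as np
import cv2
from sklearn.neighbors import NearestNeighbors

def soft_assign_bow(descriptors, codebook, k=4):
    # Soft-assign each SIFT/SURF descriptor to its k nearest visual words.
    nn = NearestNeighbors(n_neighbors=k).fit(codebook)
    dist, idx = nn.kneighbors(descriptors)
    w = np.exp(-dist)                          # closer words weigh more
    w /= w.sum(axis=1, keepdims=True)
    bow = np.zeros(len(codebook))
    np.add.at(bow, idx.ravel(), w.ravel())     # accumulate the histogram
    return bow

def build_inverted_index(reference_bows):
    # Inverted file: visual word id -> ids of reference shots containing it.
    index = {}
    for shot_id, bow in enumerate(reference_bows):
        for word in np.nonzero(bow)[0]:
            index.setdefault(int(word), []).append(shot_id)
    return index

def homography_inliers(query_pts, ref_pts):
    # Geometry-based similarity: count RANSAC inliers of a homography
    # estimated between matched query and reference keypoints.
    H, mask = cv2.findHomography(np.float32(query_pts),
                                 np.float32(ref_pts), cv2.RANSAC, 5.0)
    return 0 if mask is None else int(mask.sum())
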
BibTeX:
@INPROCEEDINGS{Beran2011TRECVid,
   author = {V{\'{i}}t{\v{e}}zslav Beran and Michal Hradi{\v{s}} and
	Lubom{\'{i}}r Otrusina and Ivo {\v{R}}ezn{\'{i}}{\v{c}}ek},
   title = {Brno University of Technology at TRECVid 2011},
   pages = {10},
   booktitle = {TRECVID 2011: Participant Notebook Papers and Slides},
   year = {2011},
   location = {Gaithersburg, MD, US},
   publisher = {National Institute of Standards and Technology},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=9841}
}
