Conference proceedings paper

HRADIŠ Michal, BERAN Vítězslav, ŘEZNÍČEK Ivo, HEROUT Adam, BAŘINA David, VLČEK Adam and ZEMČÍK Pavel. Brno University of Technology at TRECVid 2010. In: TRECVID 2010: Participant Notebook Papers and Slides. Gaithersburg, MD: National Institute of Standards and Technology, 2010, p. 11.
Publication language: English
Publication title: Brno University of Technology at TRECVid 2010
Title (cs): Brno University of Technology at TRECVid 2010
Pages: 11
Proceedings: TRECVID 2010: Participant Notebook Papers and Slides
Conference: 2010 TRECVID Workshop
Place of publication: Gaithersburg, MD, US
Year: 2010
Publisher: National Institute of Standards and Technology
URL: http://www-nlpir.nist.gov/projects/tvpubs/tv10.papers/brno.pdf [PDF]
Keywords
TRECVID, semantic indexing, Content-based Copy Detection, image classification
Abstract
This paper describes our approach to semantic indexing and content-based copy detection, which was used for the TRECVID 2010 evaluation.

Semantic indexing

1.  The runs differ in the types of visual features used. All runs use several bag-of-words representations fed to separate linear SVMs, and the SVM outputs are fused by logistic regression.

  • F_A_Brno_resource_4: Only the single best visual feature type (as measured on the training set) is used: dense image sampling with rgb-SIFT.
  • F_A_Brno_basic_3: This run uses dense sampling and the Harris-Laplace detector in combination with SIFT and rgb-SIFT descriptors.
  • F_A_Brno_color_2: This run extends F_A_Brno_basic_3 by adding dense sampling with rg-SIFT, Opponent-SIFT, Hue-SIFT, HSV-SIFT, C-SIFT and opponent histogram descriptors.
  • F_A_Brno_spacetime_1: This run extends F_A_Brno_color_2 by adding space-time visual features STIP and HESSTIP.
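
The fusion step described above can be illustrated with a minimal sketch. It assumes that each per-feature SVM has already produced one decision value per shot and that a logistic regression is trained on those values to output a single concept probability. The function names (`fit_fusion`, `fuse`), the gradient-descent training, and the toy scores are hypothetical and not taken from the paper.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_fusion(scores, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression fusion weights by batch gradient descent.

    scores: one row per shot; each row holds the decision values of the
            per-feature SVMs for that shot (assumed precomputed).
    labels: 0/1 concept labels, one per shot.
    Returns (weights, bias).
    """
    n, d = len(scores), len(scores[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(scores, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fuse(w, b, x):
    # Fused probability that the concept is present in the shot.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy SVM decision values for two feature types (made-up numbers).
train_scores = [[1.2, 0.8], [0.9, -0.1], [0.3, 1.1],
                [-1.0, -0.7], [-0.4, 0.2], [-1.3, -0.9]]
train_labels = [1, 1, 1, 0, 0, 0]

w, b = fit_fusion(train_scores, train_labels)
p_pos = fuse(w, b, [1.0, 0.9])    # both SVMs vote "present"
p_neg = fuse(w, b, [-1.1, -0.8])  # both SVMs vote "absent"
```

Adding a feature type to a run, as in F_A_Brno_color_2 or F_A_Brno_spacetime_1, would correspond in this sketch to adding one more column to each row of `train_scores`.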

2. Combining multiple types of visual features improves results significantly. F_A_Brno_color_2 achieves results more than twice as good as those of F_A_Brno_resource_4. The space-time visual features did not improve results.

3. Combining multiple types of visual features is important. Linear SVM is inferior to non-linear SVM in the context of semantic indexing.

Content-based Copy Detection

1.    Two runs were submitted with similar settings; they differ only in the amount of test data processed (40% and 60%).

  • brno.m.*.l3sl2: SURF, bag-of-words (visual codebook: 2k size, 4 nearest neighbors used in soft-assignment), inverted file index, geometry (homography) based image similarity metric
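
The bag-of-words and inverted-file steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the submitted system: the inverse-distance weighting in `soft_assign`, the dot-product scoring in `query`, and the tiny 2-D codebook are all hypothetical (the run used SURF descriptors and a 2k codebook), and the homography-based geometric similarity step is omitted.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def soft_assign(descriptor, codebook, k=4):
    """Spread one descriptor over its k nearest codewords.

    Returns {word_id: weight} with weights summing to 1.
    Inverse-distance weighting is an assumed choice.
    """
    nearest = sorted((euclidean(descriptor, c), i)
                     for i, c in enumerate(codebook))[:k]
    raw = [(i, 1.0 / (d + 1e-9)) for d, i in nearest]
    total = sum(wt for _, wt in raw)
    return {i: wt / total for i, wt in raw}


def bow_histogram(descriptors, codebook, k=4):
    # Accumulate the soft-assignment weights of all descriptors in a frame.
    hist = defaultdict(float)
    for desc in descriptors:
        for word, wt in soft_assign(desc, codebook, k).items():
            hist[word] += wt
    return dict(hist)


def build_inverted_index(frames):
    """frames: {frame_id: histogram} -> {word_id: [(frame_id, weight), ...]}."""
    index = defaultdict(list)
    for frame_id, hist in frames.items():
        for word, wt in hist.items():
            index[word].append((frame_id, wt))
    return index


def query(index, hist):
    # Score only frames that share at least one visual word with the query.
    scores = defaultdict(float)
    for word, q_wt in hist.items():
        for frame_id, wt in index.get(word, []):
            scores[frame_id] += q_wt * wt
    return sorted(scores.items(), key=lambda kv: -kv[1])


# Toy 2-D "descriptors" and a 6-word codebook (made-up values).
codebook = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
reference = {
    "ref_a": bow_histogram([(0.1, 0.1), (0.9, 0.1)], codebook),
    "ref_b": bow_histogram([(10.1, 10.1), (10.9, 10.1)], codebook),
}
index = build_inverted_index(reference)
ranked = query(index, bow_histogram([(0.2, 0.0), (0.1, 0.9)], codebook))
```

In a full pipeline the top-ranked candidates from `query` would then be re-scored by a geometric check such as the homography-based similarity named in the run description.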

2.    What if any significant differences (in terms of what measures) did you find among the runs?

  • Only one setting was used, so there are no differences.

3.    Based on the results, can you estimate the relative contribution of each component of your system/approach to its effectiveness?

  • The search in the reference dataset was slow due to an unsuitable configuration of the visual codebook.

4.    Overall, what did you learn about runs/approaches and the research question(s) that motivated them?

  • The way the video content is described needs to change: a frame-based (or key-frame-based) approach is not sufficient.
BibTeX:
@INPROCEEDINGS{
   author = {Michal Hradiš and Vítězslav Beran and Ivo Řezníček and Adam
	Herout and David Bařina and Adam Vlček and Pavel Zemčík},
   title = {Brno University of Technology at TRECVid 2010},
   pages = {11},
   booktitle = {TRECVID 2010: Participant Notebook Papers and Slides},
   year = {2010},
   location = {Gaithersburg, MD, US},
   publisher = {National Institute of Standards and Technology},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php.cs?id=9444}
}
