Conference proceedings paper | |
| Hradiš, M., Beran, V., Řezníček, I., Herout, A., Bařina, D., Vlček, A., Zemčík, P.: Brno University of Technology at TRECVid 2010, In: TRECVID 2010: Participant Notebook Papers and Slides, Gaithersburg, MD, US, NIST, 2010, p. 11 | | Publication language: | English |
|---|
| Publication title: | Brno University of Technology at TRECVid 2010 |
|---|
| Title (cs): | Brno University of Technology at TRECVid 2010 |
|---|
| Pages: | 11 |
|---|
| Proceedings: | TRECVID 2010: Participant Notebook Papers and Slides |
|---|
| Conference: | 2010 TRECVID Workshop |
|---|
| Place of publication: | Gaithersburg, MD, US |
|---|
| Year: | 2010 |
|---|
| Publisher: | National Institute of Standards and Technology |
|---|
| URL: | http://www-nlpir.nist.gov/projects/tvpubs/tv10.papers/brno.pdf [PDF] |
|---|
| Keywords |
|---|
| TRECVID, semantic indexing, Content-based Copy Detection, image classification |
| Abstract |
|---|
This paper describes our approaches to semantic indexing and content-based copy detection, which were used in the TRECVID 2010 evaluation.
Semantic indexing
1. The runs differ in the types of visual features used. All runs use several
bag-of-words representations fed to separate linear SVMs, whose outputs are fused
by logistic regression.
- F_A_Brno_resource_4: Only the single best visual feature type (on the training
set) is used: dense image sampling with rgb-SIFT.
- F_A_Brno_basic_3: This run uses dense sampling and the Harris-Laplace detector in
combination with SIFT and rgb-SIFT descriptors.
- F_A_Brno_color_2: This run extends F_A_Brno_basic_3 by adding dense
sampling with rg-SIFT, Opponent-SIFT, Hue-SIFT, HSV-SIFT, C-SIFT and opponent
histogram descriptors.
- F_A_Brno_spacetime_1: This run extends F_A_Brno_color_2 by adding space-time
visual features STIP and HESSTIP.
2. Combining multiple types of visual
features improves results significantly. F_A_Brno_color_2 achieves results more than
twice as good as those of F_A_Brno_resource_4. The space-time visual features
did not improve results.
3. Combining multiple types of visual
features is important. A linear SVM is inferior to a non-linear SVM in the context
of semantic indexing.
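The fusion step described in item 1 (separate per-feature SVMs combined by logistic regression) can be sketched as follows. This is only a minimal illustration, not the authors' implementation: it assumes the per-feature SVM decision values are already computed, trains the fusion with plain batch gradient descent, and uses made-up toy scores. The names `fit_fusion` and `fuse` are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_fusion(scores, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression fusion weights by batch gradient descent.

    scores: one row per video shot, one SVM decision value per feature type.
    labels: 0/1 ground truth for a single concept.
    """
    n, d = len(scores), len(scores[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(scores, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fuse(w, b, x):
    """Fused probability that the concept is present in one shot."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy data (assumed): column 0 mimics an informative feature type,
# column 1 a feature type that carries little information.
train_scores = [
    [1.2, 0.1], [0.8, -0.3], [1.5, 0.4],
    [-1.0, 0.2], [-0.7, -0.1], [-1.3, 0.3],
]
train_labels = [1, 1, 1, 0, 0, 0]

w, b = fit_fusion(train_scores, train_labels)
print("fusion weights:", w, "bias:", b)
print("fused score for [1.0, 0.0]:", fuse(w, b, [1.0, 0.0]))
```

In this sketch the learned weights indicate how much each feature type contributes to the fused decision, which is one simple way to see why adding a weak feature type may leave the fused result unchanged.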
Content-based Copy Detection
1. Two runs were submitted with similar settings; they differ
only in the amount of processed test data (40% and 60%).
- brno.m.*.l3sl2: SURF,
bag-of-words (visual codebook of 2k words, 4 nearest neighbors used in
soft assignment), inverted file index, geometry-based (homography) image
similarity metric
2. What if any significant differences (in terms of what measures) did
you find among the runs?
- Only one setting was used, so there were no
differences.
3. Based on the results, can you estimate the relative contribution of
each component of your system/approach to its effectiveness?
- Search in the reference dataset was slow
due to an unsuitable configuration of the visual codebook.
4. Overall, what did you learn about runs/approaches and the research
question(s) that motivated them?
- The way of describing the video content needs to change: a frame-based
(or key-frame-based) approach is not sufficient.
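The bag-of-words stage of the copy detection runs (soft assignment to the 4 nearest codewords, followed by an inverted file index) can be illustrated with the sketch below. It rests on several assumptions not stated in the abstract: the Gaussian distance weighting, the dot-product scoring, the 2-D toy descriptors and the 6-word codebook (the actual runs used SURF descriptors and a 2k codebook), and all function and class names. The homography-based similarity check is omitted.

```python
import math
from collections import defaultdict


def soft_assign(descriptor, codebook, k=4, sigma=1.0):
    """Return {word_id: weight} over the k nearest codewords (weights sum to 1).

    The Gaussian weighting is an assumption; the source only states that
    4 nearest neighbors are used.
    """
    nearest = sorted(
        (math.dist(descriptor, word), idx) for idx, word in enumerate(codebook)
    )[:k]
    raw = {idx: math.exp(-(d * d) / (2.0 * sigma * sigma)) for d, idx in nearest}
    total = sum(raw.values())
    return {idx: wgt / total for idx, wgt in raw.items()}


def bow_vector(descriptors, codebook, k=4):
    """Accumulate the soft assignments of all descriptors into a sparse vector."""
    bow = defaultdict(float)
    for desc in descriptors:
        for idx, wgt in soft_assign(desc, codebook, k).items():
            bow[idx] += wgt
    return dict(bow)


class InvertedIndex:
    """Maps each visual word to the reference frames that contain it."""

    def __init__(self):
        self.postings = defaultdict(list)

    def add(self, frame_id, bow):
        for idx, wgt in bow.items():
            self.postings[idx].append((frame_id, wgt))

    def query(self, bow):
        # Only frames sharing at least one visual word with the query are scored.
        scores = defaultdict(float)
        for idx, q_wgt in bow.items():
            for frame_id, r_wgt in self.postings.get(idx, []):
                scores[frame_id] += q_wgt * r_wgt
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy codebook with two well-separated groups of visual words (assumed data).
codebook = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]

index = InvertedIndex()
index.add("ref_a", bow_vector([(0.1, 0.1), (0.9, 0.1)], codebook))
index.add("ref_b", bow_vector([(5.1, 5.0), (5.9, 5.1)], codebook))

query_bow = bow_vector([(0.2, 0.0)], codebook)
ranking = index.query(query_bow)
print(ranking)
```

With soft assignment every descriptor touches k posting lists instead of one, so lookup cost grows with k and with the length of each posting list. In this sketch that is the kind of codebook configuration trade-off that affects search speed in the reference dataset.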
|
| BibTeX: |
|---|
@INPROCEEDINGS{
author = {Michal Hradiš and Vítězslav Beran and Ivo Řezníček and Adam
Herout and David Bařina and Adam Vlček and Pavel Zemčík},
title = {Brno University of Technology at TRECVid 2010},
pages = {11},
booktitle = {TRECVID 2010: Participant Notebook Papers and Slides},
year = {2010},
location = {Gaithersburg, MD, US},
publisher = {National Institute of Standards and Technology},
language = {english},
url = {http://www.fit.vutbr.cz/research/view_pub.php?id=9444}
} |
|