Conference paper

KOLÁŘ Martin, HRADIŠ Michal and ZEMČÍK Pavel. Deep Learning on Small Datasets using Online Image Search. In: Proceedings of 32nd Spring Conference on Computer Graphics. Bratislava: Comenius University in Bratislava, 2016, pp. 1-7. ISBN 978-1-4503-3693-2. ISSN 1335-5694. Available from:
Publication language: English
Original title: Deep Learning on Small Datasets using Online Image Search
Title (cs): Hluboké Učení na Malých Datasetech s použitím Online Obrazového Vyhledávání
Proceedings: Proceedings of 32nd Spring Conference on Computer Graphics
Conference: Spring Conference on Computer Graphics 2016
Place: Bratislava, SK
Journal: Proceeding of Spring Conference on Computer Graphics, Vol. 2016, No. 32, Bratislava, SK
Publisher: Comenius University in Bratislava
File: SCCG_camera_ready_Kolar_ammended.pdf, 893 KB, last modified 2016-04-04 14:11:02
Keywords: convolutional neural network, deep learning, image classification, reinforcement learning
Our contribution learns visual categories from fewer images than previous approaches. It modifies the pseudolabel method, which augments labelled training images with unlabelled images, so that it can handle both labelled training images and queried images that are likely to belong to the desired class. This is achieved by modifying the weighting and selection processes.
The presented method adapts the pseudolabel approach to allow the use of web-scale datasets of millions of images. The results are demonstrated on a toy problem devised from the SUN 397 dataset, and on the full SUN 397 dataset expanded with images gathered from Google’s image search without human intervention.
This paper tackles the important unsolved problem of training deep models with small amounts of annotated data. We propose a semi-supervised self-training bootstrap approach to deep learning which retrieves and utilizes additional images from internet image search. We adapt the pseudolabel method proposed by Dong-Hyun Lee in 2013, previously used on the elementary MNIST handwritten digit classification task. We show that, with suitable modifications to its example weighting and selection mechanisms, it can be adapted to general image classification tasks supported by online image search. The proposed approach does not require any human supervision, is practical and efficient, and actively avoids overtraining. The usefulness of the proposed method is demonstrated on the SUN 397 dataset with only 50 training images per category. When exploiting results of Google's Image Search, we achieve a significant improvement: a classification accuracy of 51%, as opposed to 39% without our method.
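The pseudolabel idea referred to in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not describe their weighting and selection mechanisms, so the confidence threshold `min_confidence` is a hypothetical stand-in for a selection rule, and the function names are invented for illustration. The ramp in `alpha_schedule` follows the general shape used in Lee's 2013 paper (zero, then linear, then constant); its default constants are illustrative.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    """Per-example cross-entropy of integer labels under predicted probabilities."""
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12)

def alpha_schedule(step, t1=100, t2=600, alpha_final=3.0):
    """Weight of the pseudo-labelled loss term: 0, then a linear ramp, then constant."""
    if step < t1:
        return 0.0
    if step < t2:
        return alpha_final * (step - t1) / (t2 - t1)
    return alpha_final

def pseudolabel_loss(labelled_logits, labels, queried_logits, step, min_confidence=0.5):
    """Supervised loss plus a weighted loss on confidently pseudo-labelled images.

    min_confidence is a hypothetical selection rule, not the one from the paper.
    Returns (loss, pseudo_labels, selected_mask).
    """
    supervised = cross_entropy(softmax(labelled_logits), labels).mean()

    queried_probs = softmax(queried_logits)
    pseudo_labels = queried_probs.argmax(axis=1)   # most likely class per queried image
    confidence = queried_probs.max(axis=1)
    selected = confidence >= min_confidence        # keep only confident predictions
    if not selected.any():
        return supervised, pseudo_labels, selected

    unsupervised = cross_entropy(queried_probs[selected], pseudo_labels[selected]).mean()
    return supervised + alpha_schedule(step) * unsupervised, pseudo_labels, selected

# Two labelled images and two queried (unlabelled) images, two classes.
labelled = np.array([[2.0, 0.0], [0.0, 2.0]])
queried = np.array([[5.0, 0.0], [0.1, 0.0]])
loss, pseudo, mask = pseudolabel_loss(labelled, np.array([0, 1]), queried,
                                      step=1000, min_confidence=0.9)
print(loss, pseudo, mask)
```

In this toy call only the first queried image passes the threshold, so only it contributes to the pseudo-labelled term. Early in training the schedule returns zero, which keeps unreliable pseudo-labels from influencing the model before the supervised term has shaped it.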
   author = {Martin Kol{\'{a}}{\v{r}} and Michal Hradi{\v{s}} and Pavel Zem{\v{c}}{\'{i}}k},
   title = {Deep Learning on Small Datasets using Online Image Search},
   pages = {1--7},
   booktitle = {Proceedings of 32nd Spring Conference on Computer Graphics},
   journal = {Proceeding of Spring Conference on Computer Graphics},
   volume = 2016,
   number = 32,
   year = 2016,
   location = {Bratislava, SK},
   publisher = {Comenius University in Bratislava},
   ISBN = {978-1-4503-3693-2},
   ISSN = {1335-5694},
   doi = {10.1145/2948628.2948633},
   language = {english},
   url = {}
