Publication Details

Camera Orientation Estimation in Natural Scenes Using Semantic Cues

BREJCHA Jan and ČADÍK Martin. Camera Orientation Estimation in Natural Scenes Using Semantic Cues. In: 2018 International Conference on 3D Vision. Verona: IEEE Computer Society, 2018, pp. 208-217. ISBN 978-1-5386-2610-8.
Czech title
Odhad orientace kamery v přírodních scénách s využitím sémantické segmentace
Type
conference paper
Language
english
Authors
Jan Brejcha, Martin Čadík
URL
Keywords

camera orientation estimation, camera calibration, semantic segmentation, digital elevation model of a terrain, OpenStreetMap, geo-localization, computer vision, computer graphics

Abstract

Camera orientation estimation in natural scenes has recently been approached by several methods, which rely mainly on matching a single modality (edges or horizon lines) with 3D digital elevation models. In contrast to previous works, our new image-to-model matching scheme is based on a fusion of multiple modalities and is designed to be naturally extensible with different cues. In this paper, we use semantic segments and edges. To our knowledge, we are the first to consider using semantic segments jointly with edges for alignment with a digital elevation model. We show that high-level features, such as semantic segments, complement the low-level edge information and together help to estimate the camera orientation more robustly than methods relying solely on edges or horizon lines. In a series of experiments, we show that segment boundaries tend to be imprecise and that the important information for matching is encoded in the segment area and coarse shape. Intuitively, semantic segments encode low-frequency information, as opposed to edges, which encode high frequencies. Our experiments show that semantic segments and edges are complementary, improving the reliability of camera orientation estimation when used together. We demonstrate that our method combining semantic and edge features reaches state-of-the-art performance on three datasets.

Annotation

We introduce a novel method for camera orientation estimation under the assumption that the camera position and field of view are known. Our method consists of the following steps: (I) edges and semantic segments describing semantic areas, such as forests, bodies of water, sky, or glaciers, are detected in the input image; (II) using the known camera position, a synthetic 360-degree panorama containing semantic segments from the OpenStreetMap database and mountain silhouettes from a digital elevation model is rendered; (III) the detected and rendered semantic segments and edges are matched with each other using our novel Confidence Fusion (CF) framework to estimate the camera orientation. The proposed CF framework is based on a spherical cross-correlation, which is computed for each semantic class and edge layer separately to form per-layer confidences. The confidences are fused into a final result using a weighted geometric average with custom weights.
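To make the fusion step concrete, the Python sketch below illustrates our reading of the CF idea under strong simplifying assumptions: layers are equirectangular maps and the correlation is computed over yaw only via an FFT-based circular cross-correlation, whereas the paper uses a full spherical cross-correlation over all rotation angles. The per-layer confidences C_l are combined as C = prod_l C_l^(w_l). The function and variable names (yaw_correlation, confidence_fusion, weights) are illustrative and do not come from the released implementation.

# Minimal sketch of the Confidence Fusion idea (our reading of the paper, not the
# authors' code). Simplification: matching is done over yaw only, using FFT-based
# circular cross-correlation of equirectangular layers; the paper uses a full
# spherical cross-correlation.
import numpy as np

def yaw_correlation(query, rendered):
    """Circular cross-correlation of two H x W layers along the yaw (width) axis."""
    q = np.fft.fft(query, axis=1)
    r = np.fft.fft(rendered, axis=1)
    # Sum over rows (elevation) to obtain one correlation value per yaw shift.
    return np.fft.ifft(q * np.conj(r), axis=1).real.sum(axis=0)

def confidence_fusion(query_layers, rendered_layers, weights, eps=1e-8):
    """Fuse per-layer correlations with a weighted geometric average.

    query_layers / rendered_layers: dicts mapping a layer name (semantic class
    or 'edges') to an H x W map; weights: dict of per-layer weights.
    Returns the best yaw shift (in panorama columns) and the fused score curve.
    """
    fused = None
    total_w = sum(weights.values())
    for name, q in query_layers.items():
        c = yaw_correlation(q, rendered_layers[name])
        # Normalize to (0, 1] so the geometric average is well defined.
        c = (c - c.min()) / (c.max() - c.min() + eps) + eps
        w = weights[name] / total_w
        fused = c ** w if fused is None else fused * c ** w
    return int(np.argmax(fused)), fused

A caller would pass dictionaries mapping layer names (e.g. 'forest', 'water', 'edges') to query and rendered maps of the same size, plus per-layer weights; the returned column index converts to a yaw angle as 360 * best_yaw / W.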

Our method is suitable for camera orientation estimation in outdoor environments and can be used in various applications, including virtual and augmented reality, navigation, and visualizations such as historical re-photography. We provide extensive experiments and ablation studies to show that our method reaches state-of-the-art performance on three publicly available datasets.

This publication was presented at the International Conference on 3D Vision 2018 in Verona, Italy, and is available online at https://ieeexplore.ieee.org/document/8490971. Supplementary materials and a video summarizing our paper are available on the project webpage: http://cphoto.fit.vutbr.cz/semantic-orientation/.

Published
2018
Pages
208-217
Proceedings
2018 International Conference on 3D Vision
Conference
International Conference on 3D Vision 2018, Verona, IT
ISBN
978-1-5386-2610-8
Publisher
IEEE Computer Society
Place
Verona, IT
DOI
10.1109/3DV.2018.00033
UT WoS
000449774200022
EID Scopus
BibTeX
@INPROCEEDINGS{FITPUB11829,
   author = "Jan Brejcha and Martin \v{C}ad\'{i}k",
   title = "Camera Orientation Estimation in Natural Scenes Using Semantic Cues",
   pages = "208--217",
   booktitle = "2018 International Conference on 3D Vision",
   year = 2018,
   location = "Verona, IT",
   publisher = "IEEE Computer Society",
   ISBN = "978-1-5386-2610-8",
   doi = "10.1109/3DV.2018.00033",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11829"
}