Abstract
Recovering 3D structure and camera motion from images has been a long-standing focus of computer vision research and is known as Structure-from-Motion (SfM). Solutions to this problem are categorized into incremental and global approaches. Until now, the most popular systems follow the incremental paradigm due to its superior accuracy and robustness, while global approaches are drastically more scalable and efficient. With this work, we revisit the problem of global SfM and propose GLOMAP as a new general-purpose system that outperforms the state of the art in global SfM. In terms of accuracy and robustness, we achieve results on par with or superior to COLMAP, the most widely used incremental SfM system, while being orders of magnitude faster. We share our system as an open-source implementation.
Video
Pipeline
The pipeline consists of two major components: correspondence search and global estimation.
The major differences between GLOMAP and other baselines lie in three steps: view graph calibration, global positioning, and structure refinement.
Global Positioning
The core step of the system is Global Positioning. In this step, camera positions and 3D points are jointly estimated, starting from random initialization.
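The joint estimation described above can be illustrated with a small sketch. It assumes each observation provides a unit ray direction v_ik from camera i toward point k, and that the cost is the sum of squared residuals v_ik - d_ik (X_k - c_i) with non-negative scales d_ik. The function names, the synthetic data, and the optimizer (projected gradient descent with backtracking, swapped in for brevity) are all illustrative assumptions, not the GLOMAP implementation.

```python
import numpy as np


def residuals(cams, pts, scales, cam_idx, pt_idx, rays):
    """Per-observation residual v_ik - d_ik * (X_k - c_i), shape (n_obs, 3)."""
    return rays - scales[:, None] * (pts[pt_idx] - cams[cam_idx])


def cost(cams, pts, scales, cam_idx, pt_idx, rays):
    """Sum of squared residuals over all observations."""
    return float(np.sum(residuals(cams, pts, scales, cam_idx, pt_idx, rays) ** 2))


def global_positioning(cam_idx, pt_idx, rays, n_cams, n_pts, iters=200, seed=0):
    """Jointly estimate camera centers, points and scales from a random start.

    Hypothetical helper: plain projected gradient descent, not the solver
    used by GLOMAP.
    """
    rng = np.random.default_rng(seed)
    cams = rng.uniform(-1.0, 1.0, (n_cams, 3))      # random camera centers
    pts = rng.uniform(-1.0, 1.0, (n_pts, 3))        # random 3D points
    scales = rng.uniform(0.1, 1.0, len(rays))       # positive scales d_ik
    history = [cost(cams, pts, scales, cam_idx, pt_idx, rays)]
    step = 1e-2
    for _ in range(iters):
        r = residuals(cams, pts, scales, cam_idx, pt_idx, rays)
        diff = pts[pt_idx] - cams[cam_idx]
        g_scales = -2.0 * np.sum(r * diff, axis=1)
        g_obs = -2.0 * scales[:, None] * r           # gradient w.r.t. X_k per observation
        g_pts = np.zeros_like(pts)
        np.add.at(g_pts, pt_idx, g_obs)              # accumulate per point
        g_cams = np.zeros_like(cams)
        np.add.at(g_cams, cam_idx, -g_obs)           # opposite sign for cameras
        # Backtracking: halve the step until the cost strictly decreases.
        while step > 1e-12:
            new_cams = cams - step * g_cams
            new_pts = pts - step * g_pts
            new_scales = np.maximum(scales - step * g_scales, 0.0)  # keep d_ik >= 0
            new_cost = cost(new_cams, new_pts, new_scales, cam_idx, pt_idx, rays)
            if new_cost < history[-1]:
                break
            step *= 0.5
        else:
            break                                    # no improving step found
        cams, pts, scales = new_cams, new_pts, new_scales
        history.append(new_cost)
        step *= 2.0                                  # try a larger step next time
    return cams, pts, scales, history


def make_synthetic(n_cams=4, n_pts=10, seed=1):
    """Toy scene in which every camera observes every point."""
    rng = np.random.default_rng(seed)
    true_cams = rng.normal(size=(n_cams, 3))
    true_pts = rng.normal(size=(n_pts, 3)) + np.array([0.0, 0.0, 5.0])
    cam_idx = np.repeat(np.arange(n_cams), n_pts)
    pt_idx = np.tile(np.arange(n_pts), n_cams)
    diff = true_pts[pt_idx] - true_cams[cam_idx]
    dist = np.linalg.norm(diff, axis=1)
    return cam_idx, pt_idx, diff / dist[:, None], true_cams, true_pts, dist


if __name__ == "__main__":
    cam_idx, pt_idx, rays, *_ = make_synthetic()
    _, _, _, history = global_positioning(cam_idx, pt_idx, rays, 4, 10)
    print(f"cost: {history[0]:.4f} -> {history[-1]:.4f} in {len(history) - 1} steps")
```

With the true geometry and d_ik set to the inverse camera-to-point distance, every residual vanishes, which is why a solution can be sought without an initial guess for the positions. The result is only defined up to a global translation and scale, and a practical system would likely use a robust loss and a second-order solver rather than this first-order loop.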
Gallery
Reconstruction of LaMAR LIN
LIN from LaMAR is a large-scale dataset with more than 36k images and camera positions spanning more than 250 m. GLOMAP achieves a 90% recall rate at 1 m, and the reconstruction completes within 5.5 hours (compared to ~50% recall and over 7 days for COLMAP).
Convergence of global positioning
Visualization of the Global Positioning optimization. Each change in the video corresponds to one optimization step. The optimization generally converges in about 50-70 steps.
Instant-NGP Reconstructions
Comparison columns, left to right: OpenMVG, Theia, COLMAP, GLOMAP.
BibTeX
@inproceedings{pan2024glomap,
author={Pan, Linfei and Baráth, Dániel and Pollefeys, Marc and Sch\"{o}nberger, Johannes Lutz},
title={Global Structure-from-Motion Revisited},
booktitle={European Conference on Computer Vision (ECCV)},
year={2024},
}