Abstract

Recovering 3D structure and camera motion from images has been a long-standing focus of computer vision research and is known as Structure-from-Motion (SfM). Solutions to this problem are categorized into incremental and global approaches. Until now, the most popular systems follow the incremental paradigm due to its superior accuracy and robustness, while global approaches are drastically more scalable and efficient. With this work, we revisit the problem of global SfM and propose GLOMAP as a new general-purpose system that outperforms the state of the art in global SfM. In terms of accuracy and robustness, we achieve results on par with or superior to COLMAP, the most widely used incremental SfM system, while being orders of magnitude faster. We share our system as an open-source implementation.

Pipeline

The pipeline consists of two major components: correspondence search and global estimation. GLOMAP differs from other global SfM baselines in three steps: view graph calibration, global positioning, and structure refinement.

Global Positioning

The core step of the system is Global Positioning. In this step, camera positions and 3D points are jointly estimated, starting from a random initialization.
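The page does not spell out the objective, so the following is only a minimal sketch under stated assumptions. It assumes camera rotations are already known, so each observation is a unit bearing `v` in the world frame, and it assumes a bearing-alignment cost of the form `min over d >= 0 of ||v - d * (X - c)||^2` summed over all observations, where `c` is a camera position and `X` a 3D point. Plain gradient descent with backtracking stands in for whatever solver GLOMAP actually uses, and every function name here (`ray_cost_and_grad`, `total_cost_and_grad`, `descend`) is hypothetical.

```python
import math
import random

def sub(a, b): return [x - y for x, y in zip(a, b)]
def dot(a, b): return sum(x * y for x, y in zip(a, b))

def unit(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]

def ray_cost_and_grad(v, r):
    """Cost min_{d>=0} ||v - d*r||^2 for a unit bearing v, and its gradient w.r.t. r."""
    rr, vr = dot(r, r), dot(v, r)
    if rr < 1e-12 or vr <= 0.0:
        # The best non-negative scale is d = 0, so the residual is v itself.
        return 1.0, [0.0, 0.0, 0.0]
    cost = 1.0 - vr * vr / rr  # sin^2 of the angle between v and r
    grad = [-2.0 * vr * vi / rr + 2.0 * vr * vr * ri / (rr * rr)
            for vi, ri in zip(v, r)]
    return cost, grad

def total_cost_and_grad(cams, pts, obs):
    """obs: list of (camera index, point index, unit bearing in the world frame)."""
    cost = 0.0
    g_cams = [[0.0] * 3 for _ in cams]
    g_pts = [[0.0] * 3 for _ in pts]
    for i, k, v in obs:
        c, g = ray_cost_and_grad(v, sub(pts[k], cams[i]))
        cost += c
        for a in range(3):
            g_pts[k][a] += g[a]   # r = X_k - c_i, so dr/dX_k = +I
            g_cams[i][a] -= g[a]  # and dr/dc_i = -I
    return cost, g_cams, g_pts

def descend(cams, pts, obs, steps=200):
    """Gradient descent with backtracking, so the cost never increases."""
    cost, g_cams, g_pts = total_cost_and_grad(cams, pts, obs)
    for _ in range(steps):
        step = 1.0
        while step > 1e-8:
            new_cams = [[x - step * g for x, g in zip(c, gc)]
                        for c, gc in zip(cams, g_cams)]
            new_pts = [[x - step * g for x, g in zip(p, gp)]
                       for p, gp in zip(pts, g_pts)]
            new_cost, new_gc, new_gp = total_cost_and_grad(new_cams, new_pts, obs)
            if new_cost < cost:
                cams, pts, cost = new_cams, new_pts, new_cost
                g_cams, g_pts = new_gc, new_gp
                break
            step *= 0.5
        else:
            break  # no improving step was found
    return cams, pts, cost

# Synthetic scene: 4 cameras near the origin observing 12 points in front of them.
random.seed(0)
true_cams = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
true_pts = [[random.uniform(-1, 1), random.uniform(-1, 1), random.uniform(4, 6)]
            for _ in range(12)]
obs = [(i, k, unit(sub(true_pts[k], true_cams[i])))
       for i in range(4) for k in range(12)]

# Random initialization of all unknowns.
init_cams = [[random.uniform(-1, 1) for _ in range(3)] for _ in true_cams]
init_pts = [[random.uniform(-1, 1) for _ in range(3)] for _ in true_pts]

cost_at_truth, _, _ = total_cost_and_grad(true_cams, true_pts, obs)
cost_at_init, _, _ = total_cost_and_grad(init_cams, init_pts, obs)
_, _, cost_after = descend(init_cams, init_pts, obs)
print(cost_at_truth, cost_at_init, cost_after)
```

In this toy setup the cost is zero at the true configuration and unchanged by a global shift or rescaling of all positions, so any solution is defined only up to translation and scale.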

Reconstruction of LaMAR LIN

LIN from LaMAR is a large-scale dataset with more than 36k images whose camera positions span more than 250 m. GLOMAP achieves a 90% recall at 1 m and completes the reconstruction within 5.5 hours (compared to ~50% recall and over 7 days for COLMAP).
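The page does not define the recall metric. A plausible reading, assumed here, is the fraction of ground-truth cameras whose estimated position lies within the threshold once the reconstruction has been aligned to the ground-truth frame. The sketch below illustrates that reading only; the function name `recall_at` and the toy data are hypothetical.

```python
import math

def recall_at(estimated, ground_truth, threshold_m=1.0):
    """Fraction of ground-truth cameras whose estimated position is within threshold_m.

    Both inputs map image name -> (x, y, z) in metres, in the same aligned frame.
    Cameras missing from `estimated` (unregistered images) count as failures.
    """
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    hits = 0
    for name, gt in ground_truth.items():
        est = estimated.get(name)
        if est is not None and math.dist(est, gt) <= threshold_m:
            hits += 1
    return hits / len(ground_truth)

gt = {
    "a": (0.0, 0.0, 0.0),
    "b": (10.0, 0.0, 0.0),
    "c": (20.0, 0.0, 0.0),
    "d": (30.0, 0.0, 0.0),
}
est = {
    "a": (0.2, 0.1, 0.0),   # within 1 m
    "b": (10.0, 0.5, 0.0),  # within 1 m
    "c": (23.0, 0.0, 0.0),  # 3 m off
}                            # "d" was never registered
print(recall_at(est, gt))   # 0.5
```

Counting unregistered images as failures means a reconstruction cannot raise its score by dropping hard images.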

Figure panels: Theia, COLMAP, GLOMAP.

Convergence of global positioning

Visualization of the Global Positioning optimization. Each update shown in the video corresponds to one optimization step. The optimization generally converges in about 50-70 steps.

Instant-NGP Reconstructions

Figure panels: OpenMVG, Theia, COLMAP, GLOMAP.

BibTeX

@inproceedings{pan2024glomap,
    author={Pan, Linfei and Baráth, Dániel and Pollefeys, Marc and Sch\"{o}nberger, Johannes Lutz},
    title={Global Structure-from-Motion Revisited},
    booktitle={European Conference on Computer Vision (ECCV)},
    year={2024},
}