Bilawal Sidhu (@bilawalsidhu)

2025-02-27 | โค๏ธ 7108 | ๐Ÿ” 835


Wow. Recreating the Shawshank Redemption prison in 3D from a single video, in real time (!)

Just read the MASt3R-SLAM paper and it's pretty neat. These folks basically built a real-time dense SLAM system on top of MASt3R, a transformer-based neural network that does 3D reconstruction and localization from uncalibrated image pairs.

The cool part: they don't need a fixed camera model. It just works with arbitrary cameras, meaning different focal lengths, sensor sizes, even zooming mid-video (FMV drone footage, anyone?!). If you've done photogrammetry or played with NeRFs, you know that's a HUGE deal.
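For intuition on why calibration-free operation is plausible here: networks in the DUSt3R/MASt3R family predict a dense pointmap (one 3D point per pixel in the camera frame), and a focal length can then be recovered from that pointmap by least squares. This is my own minimal sketch of that idea, not code from the paper:

```python
import numpy as np

def estimate_focal(pointmap, cx, cy):
    """Least-squares focal estimate from a dense pointmap.

    pointmap: (H, W, 3) array of 3D points in the camera frame,
    assumed aligned with the pixel grid (one point per pixel).
    (cx, cy) is the assumed principal point.
    """
    H, W, _ = pointmap.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    X, Y, Z = pointmap[..., 0], pointmap[..., 1], pointmap[..., 2]
    # Pinhole model: u - cx = f * X / Z and v - cy = f * Y / Z.
    # Stack both constraints and solve for the single scalar f.
    a = np.concatenate([(X / Z).ravel(), (Y / Z).ravel()])
    b = np.concatenate([(u - cx).ravel(), (v - cy).ravel()])
    return float(a @ b / (a @ a))

# Synthetic check: build a pointmap from a known focal, then recover it.
H, W, f_true = 48, 64, 300.0
cx, cy = W / 2, H / 2
u, v = np.meshgrid(np.arange(W), np.arange(H))
Z = 2.0 + 0.01 * u  # arbitrary smooth depth surface
pm = np.stack([(u - cx) * Z / f_true, (v - cy) * Z / f_true, Z], axis=-1)
print(round(estimate_focal(pm, cx, cy), 1))  # prints 300.0
```

The actual system is more sophisticated (and can let intrinsics vary per frame, which is what makes zooming footage workable), but the core leverage is the same: once you have metric-ish 3D per pixel, intrinsics become something you solve for rather than something you must supply.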

They've solved some tricky problems like efficient point matching and tracking, and they've figured out how to fuse point clouds and handle loop closures in real time.
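Their efficient matching scheme is its own contribution, but the underlying primitive in this line of work is reciprocal (mutual) nearest-neighbour matching between dense per-pixel descriptors. A generic toy illustration of that primitive, assumptions mine:

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    """Reciprocal nearest-neighbour matching between two descriptor sets.

    desc_a: (N, D), desc_b: (M, D). Returns pairs (i, j) where j is
    a[i]'s nearest neighbour in b AND i is b[j]'s nearest neighbour in a,
    which filters out most one-sided, ambiguous matches.
    """
    # Cosine similarity via normalised dot products.
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    sim = a @ b.T
    nn_ab = sim.argmax(axis=1)  # best b-index for each a-row
    nn_ba = sim.argmax(axis=0)  # best a-index for each b-row
    mutual = nn_ba[nn_ab] == np.arange(len(desc_a))
    return [(int(i), int(nn_ab[i])) for i in np.flatnonzero(mutual)]

# Toy check: b is a permuted, slightly noisy copy of a, so mutual NN
# matching should recover the permutation.
rng = np.random.default_rng(0)
d = rng.normal(size=(5, 8))
perm = np.array([2, 0, 4, 1, 3])
matches = mutual_nn_matches(d, d[perm] + 0.01 * rng.normal(size=(5, 8)))
# Each recovered pair (i, j) satisfies perm[j] == i.
```

Doing this exhaustively per pixel pair is quadratic, which is exactly why a real-time system needs the faster iterative matching the authors describe; the sketch above only shows what is being approximated.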

Their system runs at about 15 FPS on a 4090 and produces both camera poses and dense geometry. When they know the camera calibration, they get SOTA results across several benchmarks, but even without calibration, they still perform well.

What's interesting is the approach: most recent SLAM work has built on DROID-SLAM's architecture, but these folks went a different direction by leveraging a strong 3D reconstruction prior. That seems to give them more coherent geometry, which makes sense, since coherent geometry is what MASt3R was designed for.

For anyone who cares about monocular SLAM and 3D reconstruction, this feels like a significant step toward plug-and-play dense SLAM without calibration headaches. Perfect for drones, robots, AR/VR, the works!

Media

video



Tags

domain-vision-3d domain-robotics domain-visionos