Jianyuan (@jianyuan_wang)

2025-03-17 | ❤️ 1355 | 🔁 195


Introducing VGGT (CVPR’25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds! No expensive optimization needed, yet delivers SOTA results for:

✅ Camera Pose Estimation ✅ Multi-view Depth Estimation ✅ Dense Point Cloud Reconstruction ✅ Point Tracking

Project Page: https://vgg-t.github.io/

Code & Weights: https://github.com/facebookresearch/vggt/

🔗 Original link

Media

video


Auto-generated - needs manual review

Tags

domain-vision-3d domain-llm domain-dev-tools domain-visionos