Zoubin Ghahramani (@ZoubinGhahrama1)
2026-01-22 | ❤️ 109 | 🔁 5 | 💬 3
Exciting new work on detailed (pixel-level, dense) 3D visual understanding of videos. Based on a scalable feedforward architecture, it’s super fast and super accurate (SOTA). Lots of uses in robotics, AR, world modelling… Check it out!
Quoted tweet
@GoogleDeepMind: We’re helping AI to see the 3D world in motion as humans do.
Enter D4RT: a unified model that turns video into 4D representations faster than previous methods - enabling it to understand space and …
Related
- what-if-sim-and-reality-were-one-this-system-keeps-them-in (topics: AI-ML, Robotics, Vision/3D)
- do-we-really-need-an-external-world-model-standard (topics: AI-ML, Robotics, GenAI)
- 16-ego-centric-world-models-we-introduce-egowm-a-video (topics: AI-ML, Robotics, GenAI)
- video-models-serve-as-a-good-pretrained-backbone-for-robot (topics: AI-ML, Robotics, GenAI)
- what-if-we-could-train-ai-robots-in-a-perfect-physics (topics: AI-ML, Robotics, GenAI)
Quote tweet
Google DeepMind (@GoogleDeepMind)
We’re helping AI to see the 3D world in motion as humans do.
Enter D4RT: a unified model that turns video into 4D representations faster than previous methods - enabling it to understand space and time. This is how it works 🧵
🎬 Video