Justine Moore (@venturetwins)
2025-12-28 | ❤️ 921 | 🔁 100
My favorite paper this year: “Video models are zero-shot learners and reasoners”
It shows that video models exhibit emergent visual reasoning at scale - they can solve vision tasks they were never trained on.
This may be the “GPT moment” for vision. Let’s break it down 👇 https://x.com/venturetwins/status/2005330176977293743/video/1
🔗 Related
- objsplat-geometry-aware-gaussian-surfels-for-active-object
- gen3r-3d-scene-generation-meets-feed-forward-reconstruction
- gaussianfluent-explicit-gaussian-simulation-for-dynamic
- video-generation-but-4d-dynamic-scene-consistent-and-very
- gamo-geometry-aware-multi-view-diffusion-outpainting-for