机器之心 JIQIZHIXIN (@jiqizhixin)
2026-01-11 | ❤️ 51 | 🔁 10
What if an AI could generate a 5-minute video from a single image, with perfect control and consistency?
Enter LongVie 2.
It’s a “world model” that builds long videos step-by-step. It uses multiple guidance signals (like sketches or text) to control the scene, and special training to keep quality and logic consistent over very long sequences.
It outperforms prior methods in controllability, visual quality, and temporal coherence for ultra-long video generation (3-5 mins), setting a new state-of-the-art.
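As a rough illustration of the step-by-step idea described above, here is a minimal Python sketch of clip-by-clip generation, where each new clip is conditioned on the last frame produced so far plus the matching slice of every guidance signal. This is not LongVie 2's actual code: `generate_long_video`, `toy_clip_generator`, the `"sketch"` key, and the frame representation are all hypothetical, and the toy generator only stands in for a real video model.

```python
from typing import Callable, Dict, List, Sequence

Frame = List[float]                      # hypothetical stand-in for an image tensor
Controls = Dict[str, Sequence[float]]    # one value per frame for each guidance signal


def generate_long_video(
    first_frame: Frame,
    controls: Controls,
    total_frames: int,
    clip_len: int,
    generate_clip: Callable[[Frame, Controls, int], List[Frame]],
) -> List[Frame]:
    """Build a long video clip by clip (assumed scheme, not the paper's code)."""
    video = [first_frame]
    while len(video) < total_frames:
        start = len(video)
        end = min(start + clip_len, total_frames)
        # Slice every guidance signal to the frames of the clip being generated.
        clip_controls = {name: signal[start:end] for name, signal in controls.items()}
        # Condition on the most recent frame so consecutive clips stay connected.
        video.extend(generate_clip(video[-1], clip_controls, end - start))
    return video


def toy_clip_generator(anchor: Frame, clip_controls: Controls, n: int) -> List[Frame]:
    """Placeholder model: each frame is the previous frame shifted by the control value."""
    frames, prev = [], anchor
    for i in range(n):
        prev = [p + clip_controls["sketch"][i] for p in prev]
        frames.append(prev)
    return frames


video = generate_long_video(
    first_frame=[0.0],
    controls={"sketch": [1.0] * 10},
    total_frames=10,
    clip_len=4,
    generate_clip=toy_clip_generator,
)
print(len(video), video[-1])
```

Because each clip starts from the previous clip's final frame, errors can build up over many steps, which is presumably why the post highlights special training for long-range consistency.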
LongVie 2: Multimodal Controllable Ultra-Long Video World Model
Paper: https://arxiv.org/pdf/2512.13604
Project: https://vchitect.github.io/LongVie2-project/
GitHub: https://github.com/Vchitect/LongVie
Our report: https://mp.weixin.qq.com/s/oMWv6P6mm21XMk9bpZtKXg
PapersAccepted by Jiqizhixin
🔗 Original links
- https://arxiv.org/pdf/2512.13604
- https://vchitect.github.io/LongVie2-project/
- https://github.com/Vchitect/LongVie
- https://mp.weixin.qq.com/s/oMWv6P6mm21XMk9bpZtKXg