机器之心 JIQIZHIXIN (@jiqizhixin)

2026-01-11 | ❤️ 51 | 🔁 10


What if an AI could generate a 5-minute video from a single image, with perfect control and consistency?

Enter LongVie 2.

It's a "world model" that builds long videos step-by-step. It uses multiple guidance signals (like sketches or text) to control the scene, and special training to keep quality and logic consistent over very long sequences.

It outperforms prior methods in controllability, visual quality, and temporal coherence for ultra-long video generation (3-5 mins), setting a new state-of-the-art.

LongVie 2: Multimodal Controllable Ultra-Long Video World Model

Paper: https://arxiv.org/pdf/2512.13604

Project: https://vchitect.github.io/LongVie2-project/

GitHub: https://github.com/Vchitect/LongVie

Our report: https://mp.weixin.qq.com/s/oMWv6P6mm21XMk9bpZtKXg

📬 PapersAccepted by Jiqizhixin

🔗 Original link

Media

image


Auto-generated - needs manual review

Tags

AI-ML GenAI VLM Dev-Tools