Original Tweet
Video models serve as a good pretrained backbone for robot policies.
Paper: https://arxiv.org/abs/2601.16163 Code: https://github.com/nvlabs/cosmos-policy
Original link
Related
- a-compact-04b-vision-language-action-model-that-finally-lets-robots-manipulate-m → topic: Vla
- dynamicvla → topic: Vla
- the-next-evolution-vla-models → topic: Vla
- what-if-your-robot-or-car-could-see-depth-more-clearly-than- → topic: Vla
- introducing-vla-scratch-a-modular-performant-and-efficient-stack-for-vlas-httpst → topic: Vla