Jiafei Duan (@DJiafei)
2026-02-05 | ❤️ 129 | 🔁 21 | 💬 2
Why do generalist robotic models fail when a cup is moved just two inches to the left? It's not a lack of motor skill; it's an alignment problem. Today, we introduce VLS: Vision-Language Steering of Pretrained Robot Policies, a training-free framework that guides robot behavior in real time.
Check out the project: https://vision-language-steering.github.io/webpage/ 👇🧵 (Watch till the end: VLS runs uncut, steering pretrained policies across long-horizon tasks.)
📌 Original Content
VLS: Steering Pretrained Robot Policies via Vision-Language Models
A training-free framework for inference-time adaptation of frozen generative robot policies.
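To give a rough intuition for what "inference-time adaptation of a frozen policy" can look like, here is a minimal sketch of one common pattern: sample several candidate actions from the frozen generative policy and let a VLM re-rank them against the language instruction, with no gradient updates. All names here (`FrozenPolicy`, `VLMScorer`, `steer`) are hypothetical stand-ins, not the VLS API, and the paper's actual steering mechanism may differ.

```python
# Hedged sketch: best-of-n steering of a frozen generative policy with a
# VLM scorer. Every class/function name is illustrative, not from the paper.
import numpy as np

class FrozenPolicy:
    """Stand-in for a pretrained generative robot policy (weights frozen)."""
    def sample_actions(self, obs: np.ndarray, n: int) -> np.ndarray:
        # A real policy (e.g. a diffusion or VLA model) would condition on obs;
        # here we just draw n random 7-DoF action candidates.
        return np.random.uniform(-1.0, 1.0, size=(n, 7))

class VLMScorer:
    """Stand-in for a vision-language model that scores candidates."""
    def score(self, image: np.ndarray, instruction: str,
              actions: np.ndarray) -> np.ndarray:
        # A real scorer would ask the VLM how well each candidate's predicted
        # outcome matches the instruction; here we return random scores.
        return np.random.rand(len(actions))

def steer(policy: FrozenPolicy, scorer: VLMScorer, image: np.ndarray,
          obs: np.ndarray, instruction: str, n_candidates: int = 16) -> np.ndarray:
    """Sample candidates from the frozen policy; return the VLM-preferred one."""
    candidates = policy.sample_actions(obs, n_candidates)
    scores = scorer.score(image, instruction, candidates)
    return candidates[int(np.argmax(scores))]  # training-free: no weight updates

if __name__ == "__main__":
    action = steer(FrozenPolicy(), VLMScorer(),
                   image=np.zeros((224, 224, 3)), obs=np.zeros(10),
                   instruction="pick up the cup on the left")
    print(action)
```

The key property this sketch shares with the announcement is that the policy itself never changes: the VLM only influences which of the policy's own samples gets executed, which is why the approach can adapt behavior at run time without retraining.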
Media
🎬 Video
🔗 Related
- dynamicvla (Domain: VLM, Robotics/Manipulation)
- first-fully-open-action-reasoning-model-arm-can-think-in-3d- (Domain: VLM, Robotics/Manipulation)
- bringing-foundation-models-to-depth-sensing-defm-is-trained- (Domain: Robotics/Manipulation)
- can-we-bridge-the-sim-to-real-gap-in-complex-manipulation-wi-683188 (Domain: Robotics/Manipulation)
- lingbot-depth-masked-depth-modeling-for-spatial-perception-941785 (Domain: Robotics/Manipulation)