Ilir Aliu (@IlirAliu_)
2025-09-08 | ❤️ 330 | 🔁 52
Robots usually need tons of labeled data to learn precise actions.
What if they could learn control skills directly from human videos… no labels needed?
Robotics pretraining just took a BIG jump forward.
A new Autoregressive Robotic Model learns low-level 4D representations from human video data.
Bridging the gap between vision and real-world robotic control.
Why this matters:
→ Pretraining with 4D geometry enables better transfer from human video to robot actions
→ It closes the gap between high-level VLA pretraining and low-level robotic control
→ It unlocks more accurate, data-efficient learning for real-world tasks
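To make the "4D representation" idea concrete: a video can be treated as 3D point positions tracked over time (3 spatial dims + time = 4D), and the model is pretrained to predict future point positions autoregressively. The sketch below illustrates that objective only; the shapes, synthetic tracks, and one-step linear predictor are stand-in assumptions, not the paper's actual architecture.

```python
# Illustrative sketch of autoregressive pretraining on 4D point tracks.
# A video becomes a (time, points, xyz) tensor; the objective is to
# predict each frame's point positions from the previous frame.
# The linear predictor is a hypothetical stand-in for the real backbone.
import numpy as np

rng = np.random.default_rng(0)

T, N = 20, 64  # frames, tracked points per frame (assumed toy sizes)
# Smooth synthetic 4D point tracks: random-walk motion of N points in 3D.
tracks = np.cumsum(rng.normal(0.0, 0.01, size=(T, N, 3)), axis=0)

# Autoregressive objective: frame t-1 -> frame t.
X = tracks[:-1].reshape(T - 1, -1)  # inputs:  frames 0..T-2
Y = tracks[1:].reshape(T - 1, -1)   # targets: frames 1..T-1

# Fit a one-step linear predictor by least squares.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
pred = X @ W
mse = float(np.mean((pred - Y) ** 2))
print(f"next-frame MSE: {mse:.6f}")
```

The same next-step objective, applied at scale to point tracks lifted from real human videos, is what lets the pretraining transfer to low-level robot control without action labels.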
For more details, check out the paper: https://arxiv.org/pdf/2502.13142
The team at @Berkeley AI Research will release the project page and code soon.