Original Tweet
The next evolution: VLA+ models
Just yesterday @MSFTResearch released Rho-alpha (ρα), their first robotics model, built on the Phi family.
While most Vision-Language-Action (VLA) models stop at vision and language, Rho-alpha adds:
▪️ Tactile sensing to feel objects during manipulation
▪️ Online learning that lets it improve from human corrections (via teleoperation, a 3D mouse, or other tools) in real time, even after deployment
Together, these additions make adaptability central rather than incidental. Microsoft calls it a VLA+ model, positioning it as an extension beyond what current VLA systems support.
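For intuition, here is a minimal sketch of what a VLA+ style control loop might look like: a policy consuming vision, language, and tactile inputs, with a hook where online human corrections enter. Everything here (the `Observation` fields, `VLAPlusPolicy`, and the `robot` interface) is a hypothetical illustration under our own assumptions, not Rho-alpha's actual API.

```python
# Hypothetical VLA+ control loop sketch. Names, shapes, and the robot
# interface are illustrative assumptions, not Microsoft's implementation.
from dataclasses import dataclass
import numpy as np

@dataclass
class Observation:
    rgb: np.ndarray        # camera frames, e.g. (2, H, W, 3) for a dual-arm rig
    tactile: np.ndarray    # fingertip pressure readings, e.g. (2, 16)
    instruction: str       # natural-language task description

class VLAPlusPolicy:
    def predict(self, obs: Observation) -> np.ndarray:
        """Return a joint-velocity command for both arms (placeholder)."""
        return np.zeros(14)  # assuming 7 DoF per arm

    def apply_correction(self, obs: Observation, corrected_action: np.ndarray) -> None:
        """Online update from a human correction (e.g. teleop or 3D mouse).
        A real system might take a small gradient step here; this stub
        only marks where the feedback enters the loop."""
        pass

def control_loop(policy: VLAPlusPolicy, robot, instruction: str) -> None:
    # `robot` is an assumed interface exposing get_rgb/get_tactile/
    # poll_human_correction/step/task_done.
    while not robot.task_done():
        obs = Observation(robot.get_rgb(), robot.get_tactile(), instruction)
        action = policy.predict(obs)
        correction = robot.poll_human_correction()  # None if operator is idle
        if correction is not None:
            policy.apply_correction(obs, correction)  # learn from the human
            action = correction                       # defer to the human this step
        robot.step(action)
```

The point of the sketch is the "plus": tactile readings sit alongside vision and language in the observation, and corrections update the policy during deployment rather than only at training time.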
➡️ Today, Rho-alpha can control dual-arm robot setups to perform tasks such as:
• Manipulating the BusyBox following natural-language instructions
• Plug insertion
• Toolbox packing and object arrangement with bimanual coordination
But to understand why this “plus” matters, we need to understand what came before. Here, we take you through the entire landscape of VLA models, from Gemini Robotics and π0 to SmolVLA, Helix, ACoT-VLA, and others: https://www.turingpost.com/p/vlaplus
Related
- dynamicvla – Topic: VLA, Robotics Manipulation
- first-fully-open-action-reasoning-model-arm-can-think-in-3d-turn-your-instructio – Topic: Robotics Manipulation
- mamba-policy-towards-efficient-3d-diffusion-policy-with-hybrid-selective-state-m – Topic: Robotics Manipulation
- what-if-your-robot-or-car-could-see-depth-more-clearly-than- – Topic: VLA