Jie Wang (@JieWang_ZJUI)

2026-01-13 | โค๏ธ 355 | ๐Ÿ” 51


VLAs nowadays enable robots to perform impressive manipulation tasks like folding clothes, making coffee, and cleaning dishes. Surprisingly, though, most VLAs lack memory. Unlike their close relatives, LLMs, VLAs have no context window and no access to history. As a result, they repeatedly fail in the same way without learning from online experience.

But why? Why not simply extend the context window, as LLMs do? It's not that we don't want to — it's that it's extremely difficult. Here, I share a talk by @chelseabfinn at NeurIPS that scopes out the challenges in developing long-horizon autonomy for embodied agents. At the end, there's a reading list on memory for robotics. ⭐
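To make the "no context window" point concrete, here is a minimal, hypothetical sketch (not from the talk) of what history access would even mean for a policy: a rolling buffer of recent observations that a history-conditioned VLA could attend over, instead of acting on the current frame alone. The class and parameter names are illustrative, not from any real VLA codebase.

```python
from collections import deque

class HistoryBuffer:
    """Fixed-length rolling buffer of past observations.

    Hypothetical sketch: most VLAs today condition only on the current
    observation; a "context window" would mean conditioning the action
    on a window of recent observations like this one.
    """

    def __init__(self, max_steps: int = 8):
        # Oldest entries are dropped automatically once full.
        self._buf = deque(maxlen=max_steps)

    def append(self, observation: dict) -> None:
        self._buf.append(observation)

    def window(self) -> list:
        # Oldest-to-newest list of the retained observations.
        return list(self._buf)

# Usage: with max_steps=3, only the last 3 of 5 steps survive.
buf = HistoryBuffer(max_steps=3)
for t in range(5):
    buf.append({"step": t})
print([o["step"] for o in buf.window()])  # → [2, 3, 4]
```

Even this trivial buffer hints at the hard parts the talk covers: real robot observations are high-dimensional images, so naively storing and attending over long histories blows up compute and memory, unlike token contexts in LLMs.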




Tags

Robotics AI-ML LLM