Original Tweet


Do we REALLY need an external world model? 🤔

Standard approaches often rely on heavy external simulators.

We agree with the view: The Agent itself is the World Model.

🌍 How to align agentic world models via experience learning?

We are excited to introduce our new work: “Aligning Agentic World Models via Knowledgeable Experience Learning” (WorldMind) 🚀

🚧 The Problem: LLMs possess vast semantic knowledge but lack physical grounding.

→ Ask for a plan: It sounds logical.

→ Execute it: It fails physically (e.g., trying to slice without a knife). 😵‍💫

The agent knows what to do, but not how physical laws constrain it.

💡 The Solution: WorldMind

We bridge the gap between high-level reasoning and physical reality through:

🌍 Agentic World Model: Instead of external engines, we activate the agent’s internal ability to simulate environmental dynamics to guide planning.

🔹 Online Experience Learning: Eliminates the need for costly fine-tuning or retraining.

🔹 Alignment via World Knowledge: Autonomously builds a World Knowledge Repository (WKR) to ground the agent.

This unifies:

• Process Experience: Learning from step-level prediction failures ❌

• Goal Experience: Distilling shortcuts from successful trajectories ✅
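
A minimal sketch of how the two experience types could feed a single repository, assuming a plain in-memory store. The class name, method names, and string format here are hypothetical illustrations and are not taken from the WorldMind code: a step is logged as Process Experience only when the agent's predicted outcome differs from what the environment returned, and a successful action sequence is logged as Goal Experience.

```python
from dataclasses import dataclass, field


@dataclass
class WorldKnowledgeRepository:
    """Hypothetical stand-in for the WKR; not the authors' implementation."""

    process_experience: list[str] = field(default_factory=list)  # lessons from failed predictions
    goal_experience: list[str] = field(default_factory=list)     # shortcuts from successful runs

    def record_step(self, action: str, predicted: str, observed: str) -> None:
        # Process Experience: store a lesson only when the agent's own
        # prediction disagrees with what the environment actually returned.
        if predicted != observed:
            self.process_experience.append(
                f"After '{action}', expected '{predicted}' but observed '{observed}'."
            )

    def record_success(self, task: str, actions: list[str]) -> None:
        # Goal Experience: keep the action sequence of a successful trajectory.
        self.goal_experience.append(f"{task}: " + " -> ".join(actions))

    def as_prompt_context(self) -> str:
        # Text that could be prepended to the next planning prompt;
        # no model weights are updated anywhere in this sketch.
        return "\n".join(self.process_experience + self.goal_experience)


wkr = WorldKnowledgeRepository()
wkr.record_step("slice apple", predicted="apple is sliced", observed="failed: no knife held")
wkr.record_step("pick up knife", predicted="knife in hand", observed="knife in hand")
wkr.record_success("slice apple", ["pick up knife", "slice apple"])
print(wkr.as_prompt_context())
```

Because the stored knowledge is only text added to the prompt, this kind of loop stays training-free, which is the property the post emphasizes.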

🚀 Key Features:

✅ Training-Free: Aligns agents via online experience learning.

✨ Superior Performance: Improvements on EB-ALFRED & EB-Habitat.

🔗 Project Page: https://zjunlp.github.io/WorldMind/

📄 Paper: https://huggingface.co/papers/2601.13247

Our current method is limited by today’s foundation models and cannot yet support reliable long-horizon planning.

Looking ahead, as model capacity and memory modules continue to improve, we believe agents will gradually internalize world models and achieve robust long-term embodied decision-making.

#EmbodiedAI #MultimodalAgent #ExperienceLearning #Alignment #WorldModels #LLM #Robotics #AgenticAI #NLP #WorldMind

🔗 Original link



Tags

3D-Vision Robotics AI-ML