Ilir Aliu (@IlirAliu_)

2026-01-30 | ❤️ 206 | 🔁 30 | 💬 2


First fully open Action Reasoning Model (ARM); can ‘think’ in 3D & turn your instructions into real-world actions:

[📍 Bookmark for later]

A model that reasons in space, time, and motion.

It breaks down your command into three steps:

  1. Grounds the scene with depth-aware perception tokens
  2. Plans the motion through visual reasoning traces
  3. Executes low-level commands for real hardware

Think of it as chain-of-thought for physical action.

Give it an instruction like “Pick up the trash” and MolmoAct will:

  1. Understand the environment through depth perception
  2. Visually plan the sequence of moves
  3. Carry them out… while letting you see the plan overlaid on camera frames before anything moves
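
To make the three stages concrete, here is a toy sketch of a ground, plan, execute loop with a preview gate. It is not MolmoAct's code or API: every name here (`Plan`, `ground_scene`, `plan_trace`, `execute`) is hypothetical, the depth "tokens" are plain quantized bins, and the "reasoning trace" is just a straight line between two image points.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Plan:
    """Hypothetical container for one instruction's intermediate outputs."""
    instruction: str
    depth_tokens: List[int]   # stage 1: coarse depth codes for the scene
    waypoints: List[Point]    # stage 2: planned path in image coordinates
    approved: bool = False    # set to True once a human has seen the overlay


def ground_scene(depth_map: List[List[float]], bins: int = 4,
                 max_depth: float = 2.0) -> List[int]:
    """Stage 1 (toy): quantize a depth map in meters into discrete tokens."""
    tokens = []
    for row in depth_map:
        for d in row:
            clipped = min(max(d, 0.0), max_depth)
            tokens.append(min(int(clipped / max_depth * bins), bins - 1))
    return tokens


def plan_trace(start: Point, goal: Point, steps: int = 4) -> List[Point]:
    """Stage 2 (toy): straight-line waypoints from start to goal."""
    dx, dy = goal[0] - start[0], goal[1] - start[1]
    return [(start[0] + dx * i / steps, start[1] + dy * i / steps)
            for i in range(steps + 1)]


def execute(plan: Plan) -> List[Point]:
    """Stage 3 (toy): turn waypoints into per-step deltas, only if approved."""
    if not plan.approved:
        raise PermissionError("plan must be previewed and approved first")
    return [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(plan.waypoints, plan.waypoints[1:])]


plan = Plan(
    instruction="Pick up the trash",
    depth_tokens=ground_scene([[0.0, 1.0], [2.0, 3.0]]),
    waypoints=plan_trace((0.0, 0.0), (4.0, 8.0)),
)
plan.approved = True   # stand-in for a person checking the overlaid plan
deltas = execute(plan)
```

In this sketch, steering would amount to overwriting `plan.waypoints` with a user-drawn path before approval; how the real model folds a drawn path into its trajectory is not described in the post.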

It’s steerable in real time: draw a path, change the prompt, and the trajectory updates instantly.

AAAANNNDDD: It’s completely open: checkpoints, code, and evaluation scripts are ALL PUBLIC!

Resources:
Models: https://huggingface.co/collections/allenai/molmoact
Data: https://huggingface.co/collections/allenai/molmoact-data-mixture
📍 Blog: https://allenai.org/blog/molmoact

MolmoAct runs across different robot types (from gripper arms to humanoids) and adapts quickly to new tasks.

It outperforms models from major labs like NVIDIA, Google, and Microsoft on benchmark tests for generalization and real-world success rates.

For anyone building robotics systems or studying AI-driven action models, this is worth exploring… and worth sharing! ♻️


📄 Original Content

MolmoAct - a allenai Collection

All models for the MolmoAct (Multimodal Open Language Model for Action) release.

MolmoAct Data Mixture - a allenai Collection

All datasets for the MolmoAct (Multimodal Open Language Model for Action) release.

MolmoAct: An Action Reasoning Model that reasons in 3D space | Ai2

MolmoAct is the first model able to “think” in three dimensions, trained efficiently and delivering benchmark-topping performance.


Media

🎬 Video


Tags

AI-ML 3D-Vision Robotics