Tiancheng Zhao (Tony) (@tianchezhao)
2025-04-15 | ❤️ 146 | 🔁 38
Finally, our report on incentivizing reasoning in VLMs is out!
- Open-source VLM-R1 framework 🔥
- Reward engineering tricks that unlock emergent “aha!” moments💡
- Analysis of OOD generalization: RL vs. SFT tradeoffs, and much more 📊
https://huggingface.co/papers/2504.07615
🔗 Original link
🔗 Related