multimodal llms are just superb to play with hacking around with qwen2 vl and it


September 4, 2024 · 1 min read

GenAI
text-to-X

Rohan Paul (@rohanpaul_ai)

2024-09-04 | ❤️ 453 | 🔁 76

Multimodal LLMs are just superb to play with.

Hacking around with Qwen2-VL, and it's great for:

📷 OCR
📝 Image-to-markdown conversion
🏷️ Classification
🔍 Object tagging
🗝️ Keyword generation

Naive Dynamic Resolution and Multimodal Rotary Position Embedding (M-RoPE) are just WORKING
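To make those two mechanisms a bit more concrete, here is a rough, dependency-free sketch. It assumes the commonly described Qwen2-VL setup (14-pixel ViT patches, with each 2x2 group of patch tokens merged into one visual token) and the M-RoPE scheme of splitting each position into (temporal, height, width) components. The function names, the `min_tokens` / `max_tokens` defaults, and the rounding strategy are illustrative assumptions, not the model's actual preprocessing code.

```python
import math

PATCH = 14               # assumed ViT patch size (pixels)
MERGE = 2                # assumed 2x2 merge of adjacent patch tokens
FACTOR = PATCH * MERGE   # so one visual token covers roughly 28x28 pixels


def visual_token_grid(height, width, min_tokens=4, max_tokens=16384):
    """Estimate the (rows, cols, total) visual-token grid for an image.

    Hypothetical helper: the bounds and rounding here are assumptions
    chosen for illustration, not the library's real defaults.
    """
    # Snap each side to a whole number of 28-pixel cells (at least one).
    rows = max(1, round(height / FACTOR))
    cols = max(1, round(width / FACTOR))
    total = rows * cols
    if total > max_tokens:
        # Shrink both sides by the same ratio to fit the token budget.
        scale = math.sqrt(total / max_tokens)
        rows = max(1, math.floor(rows / scale))
        cols = max(1, math.floor(cols / scale))
    elif total < min_tokens:
        # Grow both sides by the same ratio to reach the minimum.
        scale = math.sqrt(min_tokens / total)
        rows = math.ceil(rows * scale)
        cols = math.ceil(cols * scale)
    return rows, cols, rows * cols


def mrope_positions(n_text, rows, cols):
    """Build (temporal, height, width) position triples for a text prefix
    of n_text tokens followed by one image of rows x cols visual tokens.

    Text tokens use the same index in all three components; image tokens
    keep a constant temporal index and vary by row and column.
    """
    positions = [(i, i, i) for i in range(n_text)]
    start = n_text  # image numbering continues after the text prefix
    for r in range(rows):
        for c in range(cols):
            positions.append((start, start + r, start + c))
    # Any text after the image would resume at the largest id so far + 1.
    next_start = start + max(rows, cols)
    return positions, next_start


if __name__ == "__main__":
    print(visual_token_grid(224, 224))
    print(visual_token_grid(1080, 1920, max_tokens=1024))
    print(mrope_positions(2, 2, 3))
```

The point of the sketch is that the token count follows the image size instead of being fixed, and that image tokens carry 2D positions while plain text degenerates to ordinary 1D positions.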


Tags

domain-ai-ml domain-genai domain-llm domain-vlm
