UI-TARS 🔥 A series of native GUI agent models (2B/7B/72B) released by ByteDance, combining perception, reasoning, grounding, and memory into one system.
Model: https://huggingface.co/bytedance-research
Paper: UI-TARS: Pioneering Automated GUI Interaction with Native Agents (2501.12326)
mlx-community/Llama-3.2-11B-Vision-Instruct-abliterated-4-bit Image-Text-to-Text • Updated Dec 16, 2024 • 330 • 1
mlx-community/Qwen2.5-Coder-32B-Instruct-abliterated-4bit Text Generation • Updated 19 days ago • 69 • 1
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces Paper • 2412.14171 • Published Dec 18, 2024 • 24