Alvaro Bartolome PRO

alvarobartt

AI & ML interests

machine learning @huggingface

Organizations

Microsoft, Hugging Face, Spaces-explorers, Hackathon Somos NLP 2023: Los LLMs hablan Español, SomosNLP, Hugging Test Lab, Open-Source AI Meetup, Hugging Face H4, Argilla, Blog-explorers, ZeroGPU Explorers, gg-hf, Argilla Explorers, MLX Community, distilabel-internal-testing, ORPO Explorers, Data Is Better Together, Social Post Explorers, Hugging Face Discord Community, LLHF, SLLHF, Hugging Quants, blhf, Argilla Warehouse, nltpt, IOPO Experiments, Google Cloud 🤝🏻 Hugging Face, Huggingface HUGS, Data Is Better Together Contributor, AI Starter Pack, Open R1, gg-hf-g, Multimodal AI agents

alvarobartt's activity

posted an update about 2 hours ago
🔥 Agents can do anything! @microsoft Research just announced the release of Magma 8B!

Magma is a new Visual Language Model (VLM) with 8B parameters for multi-modal agents, designed to handle complex interactions across virtual and real environments, and it's MIT licensed!

What Magma brings:
- Introduces the Set-of-Mark and Trace-of-Mark techniques for fine-tuning
- Leverages a large amount of unlabeled video data to learn spatial-temporal grounding and planning
- Generalizes strongly and can be fine-tuned for other agentic tasks
- Achieves SOTA on multi-modal benchmarks spanning UI navigation, robotics manipulation, image / video understanding, and spatial understanding and reasoning
- Generates goal-driven visual plans and actions for agentic use cases

Model: microsoft/Magma-8B
Technical Report: Magma: A Foundation Model for Multimodal AI Agents (2502.13130)
New activity in microsoft/Magma-8B about 4 hours ago

Fix typos and inference script

#3 opened about 4 hours ago by
alvarobartt
New activity in microsoft/Magma-8B about 9 hours ago
upvoted 2 articles 13 days ago

From Files to Chunks: Improving Hugging Face Storage Efficiency


From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

upvoted an article 18 days ago
upvoted an article 28 days ago

Welcome to Inference Providers on the Hub 🔥
