If you are using AWS, give this a read. It is a living document showing how to deploy and fine-tune DeepSeek R1 models with Hugging Face on AWS.
We're working hard to enable all the scenarios, whether you want to deploy to Inference Endpoints, SageMaker, or EC2, with GPUs or with Trainium & Inferentia.
We have full support for the distilled models; DeepSeek-R1 support is coming soon! I'll keep you posted.
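For instance, deploying one of the distilled models to a SageMaker endpoint already works. Here is a minimal sketch using the sagemaker Python SDK and a TGI container; the model ID and instance type are illustrative choices, so adjust them to your setup:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # your SageMaker execution role

# Pick the latest supported TGI (Text Generation Inference) container.
image_uri = get_huggingface_llm_image_uri("huggingface")

# Hub configuration: a distilled R1 checkpoint served on a single GPU.
hub = {
    "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    "SM_NUM_GPUS": "1",
}

model = HuggingFaceModel(env=hub, role=role, image_uri=image_uri)
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # adjust to the model size
)

print(predictor.predict({"inputs": "What is 12 * 17? Think step by step."}))
```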
Hosting our own inference was not enough: the Hub now has 4 new inference providers: fal, Replicate, SambaNova Systems, & Together AI.
Check the model cards on the Hub: you can now use inference from various providers in one click (see the video demo).
Their inference can also be used through our Inference API client. There, you can use either your own provider key or your HF token; with an HF token, billing is handled directly on your HF account, centralizing all your expenses.
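A minimal sketch of what that looks like with huggingface_hub (the provider argument landed in 0.28); the model ID and prompt are just examples:

```python
from huggingface_hub import InferenceClient

# Route the call through one of the new providers. With an HF token, billing
# is centralized on your HF account; a provider-specific key works too.
client = InferenceClient(provider="together", api_key="hf_xxx")

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
    max_tokens=64,
)
print(completion.choices[0].message.content)
```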
💸 Also, PRO users get $2 of inference credits per month!
Multimodal 💬
- We have released SmolVLM, the tiniest VLMs, coming in 256M and 500M, with their ColSmol retrieval models for multimodal RAG (quick-start sketch after this list)
- UI-TARS is a new family of models by ByteDance to unlock agentic GUI control 🤯, in 2B, 7B, and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released MiniMax-VL-01, whose decoder is based on the MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE), a new challenging multimodal benchmark
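If you want to kick the tires on SmolVLM, here is a rough sketch with transformers; the 256M instruct checkpoint name and the image URL are assumptions, so swap in your own:

```python
import torch
import requests
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "HuggingFaceTB/SmolVLM-256M-Instruct"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Any RGB image works; this URL is a placeholder.
image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)

messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image in one sentence."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")

generated = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```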
LLMs
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, plus six distilled dense models, on par with o1, with an MIT license! 🤯 (see the sketch after this list)
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, a new family of models, along with their datasets (SFT and reward ones too!)
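Trying one of the distilled models locally is straightforward. A sketch with the transformers pipeline; picking the 1.5B Qwen distill is an assumption, any of the six works:

```python
from transformers import pipeline

# The 1.5B distilled checkpoint fits on small GPUs; swap in a larger one if you can.
pipe = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "What is 7 * 8? Reason briefly."}]
out = pipe(messages, max_new_tokens=256)
print(out[0]["generated_text"][-1]["content"])  # the assistant's reply
```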
Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B, 3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO
Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris, similar to Flux
- Tencent released Hunyuan3D-2, new 3D asset generation from images
We're launching a FREE and CERTIFIED course on Agents!
We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.
Here's what you'll learn:
- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions.
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex, and smolagents (see the sketch after this list). These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
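As a taste of what the frameworks section covers, here is a minimal smolagents sketch, assuming the library's quickstart-style search tool and default hosted model:

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# A code-writing agent: the LLM reasons by writing Python, calling tools as
# ordinary functions. DuckDuckGoSearchTool needs the duckduckgo-search package.
agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],
    model=HfApiModel(),  # defaults to a hosted model via the HF Inference API
)

agent.run("How many seconds are there in a leap year?")
```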
Audience

This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.
Enroll today and start building the next generation of AI agent applications!
I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.
The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.
I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.
Multimodal
- MiniCPM-o 2.6 is a new SOTA any-to-any model by OpenBMB (vision, speech, and text!)
- VideoChat-Flash-Qwen2.5 is a new set of video multimodal models by OpenGVLab that come in 2B & 7B sizes and 224 & 448 resolutions
- ByteDance released a larger SA2VA that comes in 26B parameters
- Dataset: VRC-Bench is a new diverse benchmark for multimodal LLM reasoning performance
💬 LLMs
- MiniMax-Text-01 is a huge new language model (456B total, 45.9B active params) by MiniMaxAI with a context length of 4M tokens 🤯
- Dataset: Sky-T1-data-17k is a diverse dataset used to train Sky-T1-32B
- kyutai released Helium-1-Preview-2B, a new small multilingual LM
- Wayfarer-12B is a new LLM able to write D&D 🧝🏻‍♂️
- ReaderLM-v2 is a new HTML parsing model by Jina AI
- Dria released Dria-Agent-a-3B, a new agentic coding model (Pythonic function calling) based on Qwen2.5 Coder
- Unsloth released Phi-4, plus faster and more memory-efficient Llama 3.3
🖼️ Vision
- MatchAnything is a new foundation model for matching
- FitDiT is a high-fidelity virtual try-on (VTON) model based on the DiT architecture
🗣️ Audio
- OuteTTS-0.3-1B is a new multilingual text-to-speech model with voice cloning and emotion control capabilities
Retrieval
- lightblue released LB-reranker-0.5B-v1.0, a new reranker based on Qwen2.5 that can handle 95+ languages
- cde-small-v2 is a new SOTA small retrieval model by @jxm