
Florent Daudens

fdaudens

AI & ML interests

AI & Journalism

Recent Activity

liked a dataset 1 day ago
open-thoughts/OpenThoughts-114k
liked a Space 1 day ago
tencent/Hunyuan3D-2
updated a Space 2 days ago
zero-gpu-explorers/README

Organizations

Hugging Face, Hugging Face OSS Metrics, Hugging Face TB Research, ZeroGPU Explorers, LeRobot, Journalists on Hugging Face, Major TOM, MLX Community, Social Post Explorers, Projet Spinoza, Dev Mode Explorers, Hugging Face for Legal, Hugging Face Discord Community, Big Science Social Impact Evaluation for Bias and Stereotypes, Dataset Tools, Hugging Face Science, Coordination Nationale pour l'IA, Data Is Better Together Contributor, Sandbox, Open R1

fdaudens's activity

posted an update 3 days ago
reacted to merve's post with πŸ‘ 6 days ago
This week in open AI was πŸ”₯ Let's recap! πŸ€— merve/january-31-releases-679a10669bd4030090c5de4d
LLMs πŸ’¬
> Huge: AllenAI released new TΓΌlu models based on Llama 3.1 405B that outperform DeepSeek R1, trained with Reinforcement Learning with Verifiable Rewards (RLVR) πŸ”₯
> Mistral AI is back to open-source with their "small" 24B models (base & SFT), with Apache 2.0 license 😱
> Alibaba Qwen released Qwen2.5-Instruct-1M, their 1M-context-length models, great for agentic use, with Apache 2.0 license πŸ”₯
> Arcee AI released Virtuoso-medium, a 32.8B LLM distilled from DeepSeek V3 with a dataset of 5B+ tokens
> Velvet-14B is a new family of 14B Italian LLMs trained on 10T tokens in six languages
> OpenThinker-7B is a fine-tuned version of Qwen2.5-7B-Instruct trained on the OpenThoughts dataset

VLMs & vision πŸ‘€
> Alibaba Qwen is back with Qwen2.5-VL, with amazing new capabilities ranging from agentic computer use to zero-shot localization πŸ”₯
> NVIDIA released a new series of Eagle2 models in 1B and 9B sizes
> DeepSeek released Janus-Pro, a new any-to-any model (image-text generation from image-text input) with MIT license
> BEN2 is a new background removal model with MIT license!

Audio πŸ—£οΈ
> YuE is a new open-source music generation foundation model for lyrics-to-song generation

Codebase πŸ‘©πŸ»β€πŸ’»
> We are open-sourcing our SmolVLM training and eval codebase! https://github.com/huggingface/smollm/tree/main/vision
> Open-R1 is an open-source reproduction of R1 by the @huggingface science team https://huggingface.co/blog/open-r1
posted an update 7 days ago
πŸ“Š R1 just built its own download dashboard!

Some fresh stats: +6M downloads for 800+ derivative models vs 2M for originals. Watch the numbers grow here: fdaudens/deepseek-download-stats
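
For the curious, here's a rough sketch of pulling similar numbers yourself with the huggingface_hub client (the search term is a loose heuristic, not the dashboard's exact query):

```python
# A rough sketch, assuming huggingface_hub's public listing API.
# pip install huggingface_hub
from huggingface_hub import HfApi

api = HfApi()

total_downloads = 0
n_models = 0
# Loose heuristic: any model whose name mentions "deepseek-r1".
for model in api.list_models(search="deepseek-r1", sort="downloads", direction=-1):
    if model.downloads:
        total_downloads += model.downloads
        n_models += 1

print(f"{n_models} models, {total_downloads:,} downloads")
```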
posted an update 10 days ago
🎯 Kokoro TTS just hit v1.0! πŸš€

Small but mighty: 82M parameters, runs locally, speaks multiple languages. The best part? It's Apache 2.0 licensed!
This could unlock so many possibilities ✨

Check it out: hexgrad/Kokoro-82M
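
If you want to try it locally, here's a minimal sketch assuming the kokoro pip package and its KPipeline API (the voice name is an assumption; check the model card for the current list):

```python
# A minimal local-inference sketch, assuming the kokoro pip package.
# pip install kokoro soundfile
from kokoro import KPipeline
import soundfile as sf

pipeline = KPipeline(lang_code="a")  # "a" = American English

# Short input yields a single audio chunk; the voice name is an assumption.
for _, _, audio in pipeline("Small but mighty!", voice="af_heart"):
    sf.write("kokoro_out.wav", audio, 24000)  # Kokoro outputs 24 kHz audio
```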
reacted to hexgrad's post with πŸ”₯ 10 days ago
posted an update 11 days ago
πŸ’ͺ The open-source community is really unstoppable:

+5M total downloads for DeepSeek models on hf.co
+4M of those come from the 700 models created by the community
That's 30% more than yesterday!
posted an update 12 days ago
πŸš€ The open-source community is unstoppable: 4M total downloads for DeepSeek models on Hugging Face, with 3.2M coming from the 600+ models created by the community.

That's 30% more than yesterday!
reacted to Kseniase's post with πŸš€ 13 days ago
7 Open-source Methods to Improve Video Generation and Understanding

The AI community is making great strides toward achieving the full potential of multimodality in video generation and understanding. Last week's studies showed that working with video is now one of the main focuses for improving AI models. Another highlight of the week is that open source, once again, proved its value. For those who were impressed by DeepSeek-R1, we’re with you!

Today, we’re combining these two key focuses and bringing you a list of open-source methods for better video generation and understanding:

1. VideoLLaMA 3 model: Excels in various video and image tasks thanks to a vision-centric training approach. VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding (2501.13106)

2. FilmAgent framework assigns roles to multiple AI agents, like a director, screenwriter, actor, and cinematographer, to automate the filmmaking process in 3D virtual environments. FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces (2501.12909)

3. Improving Video Generation with Human Feedback (2501.13918) proposes a new VideoReward Model and approach that uses human feedback to refine video generation models.

4. DiffuEraser video inpainting model, based on Stable Diffusion, is designed to fill in missing areas with detailed, realistic content and to ensure consistent structures across frames. DiffuEraser: A Diffusion Model for Video Inpainting (2501.10018)

5. MAGI is a hybrid video generation model that combines masked and causal modeling. Its key innovation, Complete Teacher Forcing (CTF), conditions masked frames on fully visible frames. Taming Teacher Forcing for Masked Autoregressive Video Generation (2501.12389)

6. Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise (2501.08331) proposes motion control, allowing users to guide how objects or the camera move in generated videos. Its noise warping algorithm replaces random noise in videos with structured noise based on motion info.

7. Video Depth Anything model estimates depth consistently in super-long videos (several minutes or more) without sacrificing quality or speed. Video Depth Anything: Consistent Depth Estimation for Super-Long Videos (2501.12375)
reacted to AdinaY's post with πŸš€ 13 days ago
πŸ”₯So many exciting releases coming from the Chinese community this month!
zh-ai-community/2025-january-6786b054f492fb223591269e

LLMs:
✨ Qwen2.5-1M by Alibaba
Qwen/qwen25-1m-679325716327ec07860530ba
✨ InternLM3-8B-Instruct by Shanghai AI Lab
internlm/internlm3-8b-instruct
✨ MiniMax-Text-01 by MiniMax AI
MiniMaxAI/MiniMax-Text-01
✨ RWKV-7 by BlinkDL -- RNN + Transformer πŸ‘€
BlinkDL/rwkv-7-world
✨ DeepSeek-R1 by DeepSeek -- THE ONE πŸ™Œ
https://huggingface.co/deepseek-ai
✨ Baichuan-M1-14B by Baichuan - Medical 🩺
baichuan-inc/Baichuan-M1-14B-Base
✨ Qwen2.5-Math-PRM by Alibaba - Math πŸ”’
Qwen/Qwen2.5-Math-PRM-7B

Code:
✨ Trae by ByteDance
https://trae.ai

TTS:
✨ T2A-01-HD by MiniMax AI
https://hailuo.ai/audio
✨ LLaSA by HKUST Audio
HKUSTAudio/Llasa-3B

MLLM:
✨ Kimi k1.5 by Moonshot AI
https://kimi.ai
✨ MiniCPM-o-2_6 by OpenBMB
openbmb/MiniCPM-o-2_6
✨ Sa2VA-4B by ByteDance
ByteDance/Sa2VA-4B
✨ VideoLLaMA 3 by Alibaba DAMO
DAMO-NLP-SG/videollama3-678cdda9281a0e32fe79af15
✨ LLaVA-Mini by Chinese Academy of Sciences
ICTNLP/llava-mini-llama-3.1-8b
✨ Hunyuan-7B by Tencent
tencent/Hunyuan-7B-Instruct
✨ Hunyuan 3D 2.0 by Tencent
tencent/Hunyuan3D-2
✨ MiniMax-VL-01 by MiniMax AI - a non-Transformer-based VLM πŸ‘€
MiniMaxAI/MiniMax-VL-01

Agent:
✨ UI-TARS by ByteDance
bytedance-research/UI-TARS-7B-SFT
✨ GLM-PC by Zhipu AI
https://cogagent.aminer.cn

Dataset:
✨ Fineweb-Edu-Chinese by Opencsg
opencsg/Fineweb-Edu-Chinese-V2.1
✨ Multimodal_textbook by Alibaba
DAMO-NLP-SG/multimodal_textbook
✨ MME-Finance by Hithink AI
Β·
posted an update 13 days ago
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:

- Original release: 8 models, 540K downloads. Just the beginning...

- The community turned those open-weight models into 550+ NEW models on Hugging Face. Total downloads? 2.5M, nearly 5X the originals.

The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.

When you empower builders, innovation explodes. For everyone. πŸš€

The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version, with 1M downloads alone.
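
To see why those quants matter in practice, here's a hedged sketch of running one locally with llama-cpp-python (the quant filename pattern is an assumption; check the repo's file list):

```python
# A hedged sketch using llama-cpp-python's Hub integration.
# pip install llama-cpp-python huggingface_hub
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF",
    filename="*Q4_K_M.gguf",  # assumed quant; smaller quants trade quality for memory
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain quantization in one paragraph."}],
)
print(out["choices"][0]["message"]["content"])
```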
Β·
posted an update 19 days ago
reacted to AdinaY's post with πŸ”₯ 20 days ago
BIG release by DeepSeek AIπŸ”₯πŸ”₯πŸ”₯

DeepSeek-R1 & DeepSeek-R1-Zero: two 660B reasoning models are here, alongside 6 distilled dense models (based on Llama & Qwen) for the community!
https://huggingface.co/deepseek-ai
deepseek-ai/DeepSeek-R1

✨ MIT License: enabling distillation for custom models
✨ 32B & 70B models match OpenAI o1-mini in multiple capabilities
✨ API live now! Access Chain of Thought reasoning with model='deepseek-reasoner'
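
A minimal sketch of calling it, assuming DeepSeek's OpenAI-compatible endpoint as documented at launch (the reasoning_content field carries the chain of thought):

```python
# A minimal sketch, assuming DeepSeek's OpenAI-compatible API.
# pip install openai
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",  # placeholder
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
)

msg = resp.choices[0].message
print(msg.reasoning_content)  # chain-of-thought trace (per DeepSeek's docs)
print(msg.content)            # final answer
```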
posted an update 20 days ago
Reminder: Don’t. Use. ChatGPT. As. A. Calculator. Seriously. πŸ€–

Loved listening to @sasha on Hard Forkβ€”it really made me think.

A few takeaways that hit home:
- Individual culpability only gets you so far. The real priority: demanding accountability and transparency from companies.
- Evaluate if generative AI is the right tool for certain tasks (like search) before using it.

Curious about the full conversation? https://www.nytimes.com/2025/01/17/podcasts/hardfork-tiktok-rednote-environment.html. Give it a listenβ€”it’s worth it! 🌍
reacted to merve's post with ❀️ 23 days ago
Everything that happened this week in open AI, a recap 🀠 merve/jan-17-releases-678a673a9de4a4675f215bf5

πŸ‘€ Multimodal
- MiniCPM-o 2.6 is a new SOTA any-to-any model by OpenBMB (vision, speech and text!)
- VideoChat-Flash is a new family of video multimodal models by OpenGVLab, coming in 2B & 7B sizes at 224 & 448 resolutions
- ByteDance released a larger Sa2VA that comes in at 26B parameters
- Dataset: VRC-Bench is a new diverse benchmark for multimodal LLM reasoning performance

πŸ’¬ LLMs
- MiniMax-Text-01 is a new huge language model (456B total params, 45.9B active) by MiniMaxAI with a context length of 4M tokens 🀯
- Dataset: Sky-T1-data-17k is a diverse dataset used to train Sky-T1-32B
- kyutai released Helium-1-Preview-2B, a new small multilingual LM
- Wayfarer-12B is a new LLM able to write D&D adventures πŸ§™πŸ»β€β™‚οΈ
- ReaderLM-v2 is a new HTML parsing model by Jina AI

- Dria released Dria-Agent-a-3B, a new agentic coding model (Pythonic function calling) based on Qwen2.5 Coder
- Unsloth released faster, more memory-efficient versions of Phi-4 and Llama 3.3

πŸ–ΌοΈ Vision
- MatchAnything is a new foundation model for image matching
- FitDiT is a high-fidelity virtual try-on (VTON) model based on the DiT architecture

πŸ—£οΈ Audio
- OuteTTS-0.3-1B is a new multilingual text-to-speech model with voice cloning and emotion control capabilities

πŸ“– Retrieval
- lightblue released LB-reranker-0.5B-v1.0, a new reranker based on Qwen2.5 that can handle 95+ languages
- cde-small-v2 is a new SOTA small retrieval model by @jxm
posted an update 25 days ago
AI agents are coming. But who's in control?

@meg, one of the best researchers in AI ethics, makes a critical point about autonomy: fully autonomous systems carry unknowable risks because they operate on computer logic rather than human logic.

The solution? Build systems that support & assist rather than override human decisions.

I highly recommend reading the blog post written by Meg, @evijit, @sasha and @giadap. They define different levels of agent autonomy & provide a values-based analysis of risks, benefits, and uses of AI agents to help you make better decisions.

πŸ‘‰ https://huggingface.co/blog/ethics-soc-7

reacted to AdinaY's post with πŸ”₯ 26 days ago
posted an update 26 days ago
πŸ”₯ The AI Agent hype is real! This blog post deep dives into everything you need to know before deploying them: from key definitions to practical recommendations. A must-read for anyone building the future of autonomous systems.

πŸ“Š Key insight: A clear table breaking down the 5 levels of AI agents, from simple processors to fully autonomous systems. An essential framework for understanding where your agent stands on the autonomy spectrum.

βš–οΈ Deep analysis of 15 core values reveals critical trade-offs: accuracy, privacy, safety, equity & more. The same features that make agents powerful can make them risky. Understanding these trade-offs is crucial for responsible deployment.

🎯 6 key recommendations for the road ahead:
- Create rigorous evaluation protocols
- Study societal effects
- Understand ripple effects
- Improve transparency
- Open source can make a positive difference
- Monitor base model evolution

Read the blog post: https://huggingface.co/blog/ethics-soc-7. Brilliant work by @meg, @evijit, @sasha and @giadap.
reacted to MoritzLaurer's post with ❀️ 28 days ago
FACTS is a great paper from @GoogleDeepMind on measuring the factuality of LLM outputs. You can now download their prompt templates from @huggingface to improve LLM-based fact-checking yourself!

πŸ“ The paper introduces the FACTS Grounding benchmark for evaluating the factuality of LLM outputs.

πŸ€– Fact-checking is automated by an ensemble of LLM judges that verify if a response is fully grounded in a factual reference document.

πŸ§ͺ The authors tested different prompt templates on held-out data to ensure their generalization.

πŸ“š It's highly educational to read these templates to learn how frontier labs design prompts and understand their limitations.

πŸ’Ύ You can now download and reuse these prompt templates via the prompt-templates library!

πŸ”„ The library simplifies sharing prompt templates on the HF hub or locally via standardized YAML files. Let’s make LLM work more transparent and reproducible by sharing more templates like this!
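
A hedged sketch of grabbing one template as a raw YAML file via huggingface_hub (the prompt-templates library wraps a similar flow; the filename here is hypothetical, so browse the repo for actual names):

```python
# A hedged sketch: fetch a template YAML directly from the Hub.
# pip install huggingface_hub pyyaml
import yaml
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="MoritzLaurer/facts-grounding-prompts",
    filename="judge_template.yaml",  # hypothetical filename; list the repo's files
)
with open(path) as f:
    template = yaml.safe_load(f)

print(list(template.keys()))  # prompt text plus metadata fields
```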

Links πŸ‘‡
- prompt-templates docs: https://moritzlaurer.github.io/prompt_templates/
- all templates on the HF Hub: MoritzLaurer/facts-grounding-prompts
- FACTS paper: https://storage.googleapis.com/deepmind-media/FACTS/FACTS_grounding_paper.pdf
posted an update about 2 months ago
πŸ” From instruction-following to creative storytelling, dive into 2024's most impactful AI datasets! These gems are shaping everything from scientific research to video understanding.

Check it out: huggingface/open-source-ai-year-in-review-2024
posted an update about 2 months ago
🀝 Want to share your AI models while protecting your work? Licenses are key!

Fascinating to see that nearly 60% of models on the Hub use Apache & MIT licenses.
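
On the practical side, here's a small sketch of declaring a license in a model card's YAML metadata with huggingface_hub (the repo id is a placeholder):

```python
# A small sketch using huggingface_hub's model card utilities.
# pip install huggingface_hub
from huggingface_hub import ModelCard, ModelCardData

card_data = ModelCardData(license="apache-2.0", language="en")
card = ModelCard.from_template(
    card_data,
    model_description="What the model does and how it was trained.",
)

print(card.data.to_yaml())  # the front matter the Hub reads the license from
# card.push_to_hub("your-username/your-model")  # placeholder repo id
```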

Explore the viz here: huggingface/open-source-ai-year-in-review-2024