AdinaY (Adina Yakefu)

upvoted 2 papers about 2 hours ago

Beyond Release: Access Considerations for Generative AI Systems

Paper • 2502.16701 • Published 2 days ago • 8

Forecasting Open-Weight AI Model Growth on Hugging Face

Paper • 2502.15987 • Published 4 days ago • 7

posted an update about 4 hours ago

Post

301

Wan2.1 🔥📹 new OPEN video model by Alibaba Wan team!

Model: Wan-AI/Wan2.1-T2V-14B
Demo: Wan-AI/Wan2.1

✨Apache 2.0
✨8.19GB VRAM, runs on most GPUs
✨Multi-Tasking: T2V, I2V, Video Editing, T2I, V2A
✨Text Generation: Supports Chinese & English
✨Powerful Video VAE: Encode/decode 1080P w/ temporal precision

updated a collection about 4 hours ago

2025 February

Collection

19 items • Updated about 4 hours ago • 1

liked a Space about 4 hours ago

203

Wan2.1

💻

Wan: Open and Advanced Large-Scale Video Generative Models

posted an update about 21 hours ago

Post

1526

Try QwQ-Max-Preview, Qwen's reasoning model here👉 https://chat.qwen.ai
Can't wait for the model weights to drop on the Hugging Face Hub 🔥

1 reply

·

reacted to fdaudens's post with ❤️ about 21 hours ago

Post

1827

🚀 Just launched: A toolkit of 20 powerful AI tools that journalists can use right now - transcribe, analyze, create. 100% free & open-source.

Been testing all these tools myself and created a searchable collection of the most practical ones - from audio transcription to image generation to document analysis. No coding needed, no expensive subscriptions.

Some highlights I've tested personally:
- Private, on-device transcription with speaker ID in 100+ languages using Whisper
- Website scraping that just works - paste a URL, get structured data
- Local image editing with tools like Finegrain (impressive results)
- Document chat using Qwen 2.5 72B (handles technical papers well)

Sharing this early because the best tools come from the community. Drop your favorite tools in the comments or join the discussion on what to add next!

👉 JournalistsonHF/ai-toolkit

upvoted 2 papers 1 day ago

MoBA: Mixture of Block Attention for Long-Context LLMs

Paper • 2502.13189 • Published 7 days ago • 12

Presumed Cultural Identity: How Names Shape LLM Responses

Paper • 2502.11995 • Published 8 days ago • 10

updated a collection 1 day ago

2025 February

Collection

19 items • Updated about 4 hours ago • 1

liked a Space 1 day ago

3

AttentiveEraser - Object Remover

🚀

Unleashing Diffusion Model’s Object Removal Potential

upvoted a paper 1 day ago

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction

Paper • 2502.11663 • Published 8 days ago • 36

reacted to their post with 🔥 1 day ago

Post

1798

Two AI startups, DeepSeek & Moonshot AI , keep moving in perfect sync 👇

✨ Last December: DeepSeek & Moonshot AI released their reasoning models on the SAME DAY.
DeepSeek: deepseek-ai/DeepSeek-R1
MoonShot: https://github.com/MoonshotAI/Kimi-k1.5

✨ Last week: Both teams published papers on modifying attention mechanisms on the SAME DAY AGAIN.
DeepSeek: Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention (2502.11089)
Moonshot: MoBA: Mixture of Block Attention for Long-Context LLMs (2502.13189)

✨ TODAY:
DeepSeek unveiled Flash MLA: a efficient MLA decoding kernel for NVIDIA Hopper GPUs, optimized for variable-length sequences.
https://github.com/deepseek-ai/FlashMLA

Moonshot AI introduces Moonlight: a 3B/16B MoE trained on 5.7T tokens using Muon, pushing the Pareto frontier with fewer FLOPs.
moonshotai/Moonlight-16B-A3B

What's next? 👀

posted an update 1 day ago

Post

1798

Two AI startups, DeepSeek & Moonshot AI , keep moving in perfect sync 👇

✨ Last December: DeepSeek & Moonshot AI released their reasoning models on the SAME DAY.
DeepSeek: deepseek-ai/DeepSeek-R1
MoonShot: https://github.com/MoonshotAI/Kimi-k1.5

✨ Last week: Both teams published papers on modifying attention mechanisms on the SAME DAY AGAIN.
DeepSeek: Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention (2502.11089)
Moonshot: MoBA: Mixture of Block Attention for Long-Context LLMs (2502.13189)

✨ TODAY:
DeepSeek unveiled Flash MLA: a efficient MLA decoding kernel for NVIDIA Hopper GPUs, optimized for variable-length sequences.
https://github.com/deepseek-ai/FlashMLA

Moonshot AI introduces Moonlight: a 3B/16B MoE trained on 5.7T tokens using Muon, pushing the Pareto frontier with fewer FLOPs.
moonshotai/Moonlight-16B-A3B

What's next? 👀