Djuunaa

djuna

AI & ML interests

None yet

Recent Activity

liked a model about 17 hours ago
cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese
liked a model about 17 hours ago
SmallDoge/Doge-60M-Instruct
liked a model about 22 hours ago
cognitivecomputations/DeepSeek-R1-AWQ

Organizations

Djuna Test Lab

djuna's activity

reacted to Jaward's post with 🔥 3 days ago
Post
1379
The beauty of GRPO is that it doesn't care whether the rewards are rule-based or learned. The hack: let the data self-normalize. Trajectories in a batch compete against their mean: no value model, no extra params, just clean, efficient RL that cuts memory usage by 50% while maintaining SOTA performance. By the way, it was introduced 9 months prior to R1: arxiv.org/pdf/2402.03300
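The self-normalization described in the post can be sketched roughly as follows. This is a minimal illustration, not the paper's or any library's implementation: the function name `group_relative_advantages`, the `eps` term, and the use of the population standard deviation are assumptions made for the example. Each trajectory's reward is compared against the mean of its group, so no separate value model is needed to produce a baseline.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each trajectory relative to the other samples in its group.

    Hypothetical helper: subtracts the group mean and divides by the group
    standard deviation (eps is an assumed guard against division by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Rewards for four sampled completions of the same prompt. They could come
# from a rule-based checker or a learned reward model; the math is the same.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Trajectories that beat the group mean get a positive advantage and the rest get a negative one. If every sample in a group earns the same reward, all advantages are zero and that group contributes no learning signal.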
New activity in tugstugi/Qwen2.5-Coder-0.5B-QwQ-draft 3 days ago

Tokenizer Details

7
#2 opened 12 days ago by qingy2024
reacted to Bils's post with 🔥 4 days ago
Post
1983
🚀 We're excited to share major improvements to our Janus-Pro-7B Text-to-Image Generation Space!
🎨 What's New:
1. Critical Bug Fixes
2. Enhanced Features
3. UI Improvements
4. Performance Boost
Try It Now:
Bils/DeepseekJanusPro-Image
reacted to fdaudens's post with 🔥 4 days ago
Post
2976
🎯 Kokoro TTS just hit v1.0! 🚀

Small but mighty: 82M parameters, runs locally, speaks multiple languages. The best part? It's Apache 2.0 licensed!
This could unlock so many possibilities ✨

Check it out: hexgrad/Kokoro-82M
replied to mkurman's post 5 days ago
New activity in djuna/TEST-Q2.5-Lenned-14B 6 days ago

Update config.json

#1 opened 6 days ago by djuna