Nishith Jain

KingNish

AI & ML interests

AI is fun actually. Busy till June 2025.

KingNish's activity

liked 2 models about 8 hours ago

Efficient-Large-Model/Sana_1600M_1024px

Text-to-Image • Updated Jan 10 • 3.77k • 193

fluently-lm/FluentlyLM-Prinum

Text Generation • Updated 2 days ago • 173 • 8

reacted to ehristoforu's post with 🔥 about 8 hours ago

Post

1910

Introducing our first standalone model – FluentlyLM Prinum

Introducing the first standalone model from Project Fluently LM! We worked on it for several months, used different approaches and eventually found the optimal one.

General characteristics:
- Model type: Causal language models (QwenForCausalLM, LM Transformer)
- Number of parameters: 32.5B
- Number of parameters (non-embedding): 31.0B
- Number of layers: 64
- Context: 131,072 tokens
- Language(s) (NLP): English, French, Spanish, Russian, Chinese, Japanese, Persian (officially supported)
- License: MIT

Creation strategy:
The basis of the strategy is shown in Pic. 2.
We used Axolotl and Unsloth for supervised fine-tuning (SFT) with PEFT LoRA (rank=64, alpha=64), and Mergekit for SLERP and TIES merges.
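
For readers unfamiliar with the terms in the post, here is a minimal sketch of the two quantities it names: the LoRA scaling factor implied by rank=64, alpha=64, and a spherical linear interpolation (SLERP) between two flattened weight vectors. This is an illustration under stated assumptions, not Mergekit's or PEFT's actual code; the function names `lora_scaling` and `slerp` are hypothetical, and the interpolation factor `t` is not given in the post.

```python
import math
from typing import List, Sequence


def lora_scaling(alpha: float, rank: int) -> float:
    """Standard LoRA scaling: the low-rank update is multiplied by alpha / rank."""
    return alpha / rank


def slerp(t: float, v0: Sequence[float], v1: Sequence[float],
          eps: float = 1e-8) -> List[float]:
    """Spherical linear interpolation between two flattened weight vectors.

    t = 0 returns v0 and t = 1 returns v1. Falls back to plain linear
    interpolation when a vector is near zero or the two are near-collinear.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two vectors, clamped against rounding error.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# With rank=64 and alpha=64 (the values quoted in the post) the scale is 1.0.
scale = lora_scaling(64, 64)

# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Unlike a plain average, SLERP follows the arc between the two vectors, so interpolating between unit vectors does not shrink the result's norm.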

Evaluation:
🏆 12th place on the Open LLM Leaderboard (https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) (21.02.2025)

Detailed results and comparisons are presented in Pic. 3.

Links:
- Model: https://huggingface.co/fluently-lm/FluentlyLM-Prinum
- GGUF version: https://huggingface.co/mradermacher/FluentlyLM-Prinum-GGUF
- Demo on ZeroGPU: https://huggingface.co/spaces/ehristoforu/FluentlyLM-Prinum-demo

2 replies

liked 2 models about 8 hours ago

Wan-AI/Wan2.1-T2V-1.3B

Text-to-Video • Updated about 5 hours ago • 69

Wan-AI/Wan2.1-T2V-14B

Text-to-Video • Updated about 5 hours ago • 179

upvoted an article 1 day ago

Article

Remote VAEs for decoding with HF endpoints 🤗

2 days ago • 25

updated a collection 1 day ago

MTP

Collection

3 items • Updated 1 day ago

upvoted 2 papers 1 day ago

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19, 2024 • 55

Hydra: Sequentially-Dependent Draft Heads for Medusa Decoding

Paper • 2402.05109 • Published Feb 7, 2024 • 1

upvoted an article 3 days ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

6 days ago • 164

reacted to lysandre's post with ❤️ 4 days ago

Post

5097

SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases!

They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.

This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).

Starting with SmolVLM-2 & SigLIP2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.

Going forward, continue to expect software releases that follow semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements, and bug fixes. Alongside these software releases, we'll release tags offering brand-new models as fast as possible, to make them accessible to all immediately.
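
The tag naming described above (a base release such as v4.49.0 plus a model suffix such as SmolVLM-2) can be sketched as a small parser. This is only an illustration: `ReleaseTag`, `parse_tag`, and `pip_spec` are hypothetical helpers, and the assumption that a tag is installed through pip's `git+<repo>@<tag>` form against the GitHub repository is mine, not stated in the post.

```python
import re
from typing import NamedTuple, Optional


class ReleaseTag(NamedTuple):
    major: int
    minor: int
    patch: int
    model: Optional[str]  # None for a regular release such as v4.49.0


# "v<major>.<minor>.<patch>" optionally followed by "-<model name>".
_TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-(.+))?$")


def parse_tag(tag: str) -> ReleaseTag:
    """Split a tag into its base version and optional model suffix."""
    match = _TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag: {tag!r}")
    major, minor, patch, model = match.groups()
    return ReleaseTag(int(major), int(minor), int(patch), model)


def pip_spec(tag: str,
             repo: str = "https://github.com/huggingface/transformers") -> str:
    """Build a pip requirement string that points at a git tag (assumed form)."""
    parse_tag(tag)  # validate the tag before building the string
    return f"git+{repo}@{tag}"


smolvlm = parse_tag("v4.49.0-SmolVLM-2")
siglip = parse_tag("v4.49.0-SigLIP-2")
spec = pip_spec("v4.49.0-SmolVLM-2")
```

Both model tags parse to the same base version (4, 49, 0), which matches the post's description of tags layered on top of the v4.49.0 release.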

1 reply

upvoted an article 5 days ago

Article

The Large Language Model Course

Jan 16 • 113

reacted to mlabonne's post with 🤗 5 days ago

Post

5122

🆕 LLM Course 2025 edition!

I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.

The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.

I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.

Thanks everyone, hope you'll enjoy it!

💻 LLM Course: https://huggingface.co/blog/mlabonne/llm-course

upvoted a paper 7 days ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 98

liked a Space 8 days ago

Mixture Of Diffusers SDXL Tiling

🚀

Mixture of Diffusers implementation for XL Stable Diffusion

reacted to m-ric's post with ❤️ 11 days ago

Post

2751

Great feature alert: you can now share agents to the Hub! 🥳🥳

And any agent pushed to the Hub gets a cool Space interface to chat with it directly.

This was a real technical challenge: for instance, serializing tools for export meant that you needed to get all the source code for a tool, verify that it was standalone (not relying on external variables), and gather all the packages required to make it run.
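
To make the "standalone" check concrete, here is a rough sketch of one way to detect a function that relies on external variables, using Python's `ast` module. It is not the smolagents implementation; `find_external_names` is a hypothetical helper, and the check is deliberately simplified (it ignores scoping subtleties such as names used before assignment).

```python
import ast
import builtins
import textwrap
from typing import Set


def find_external_names(source: str) -> Set[str]:
    """Return names a function reads but neither defines, imports, nor receives."""
    tree = ast.parse(textwrap.dedent(source))
    func = tree.body[0]
    if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
        raise ValueError("expected the source to start with a function definition")

    # Names the function owns: itself and its parameters.
    defined = {func.name}
    params = func.args
    for arg in params.posonlyargs + params.args + params.kwonlyargs:
        defined.add(arg.arg)
    if params.vararg:
        defined.add(params.vararg.arg)
    if params.kwarg:
        defined.add(params.kwarg.arg)

    loaded: Set[str] = set()
    for node in ast.walk(func):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)   # local assignment, loop or comprehension variable
            else:
                loaded.add(node.id)    # name that is read (or deleted)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:   # imports made inside the function body
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ExceptHandler) and node.name:
            defined.add(node.name)     # "except Error as exc"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)     # nested helpers

    return loaded - defined - set(dir(builtins))


STANDALONE_TOOL = '''
def add(a, b):
    import math
    total = a + b
    return math.floor(total)
'''

LEAKY_TOOL = '''
def scale(x):
    return x * FACTOR
'''

standalone_report = find_external_names(STANDALONE_TOOL)  # empty set
leaky_report = find_external_names(LEAKY_TOOL)            # {"FACTOR"}
```

An empty result suggests the function can be exported as-is, while any reported name (here `FACTOR`) is something the exported source would be missing.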

Go try it out! 👉 https://github.com/huggingface/smolagents

2 replies

liked a model 11 days ago

zed-industries/zeta

Updated 11 days ago • 1.12k • 204

updated a model 11 days ago

KingNish/modernbert

Fill-Mask • Updated 11 days ago • 11

published a model 11 days ago

KingNish/modernbert

Fill-Mask • Updated 11 days ago • 11