AI & ML interests

Small LMs for small computers

Recent Activity

M4-ai's activity

KnutJaegersberg 
posted an update about 20 hours ago
AtAndDev 
posted an update 2 days ago
Deepseek gang on fire fr fr
KnutJaegersberg 
posted an update 4 days ago
prithivMLmods 
posted an update 4 days ago
AtAndDev 
posted an update 5 days ago
R1 is out! And with a lot of other R1-related models...
KnutJaegersberg 
posted an update 5 days ago
Understanding and Benchmarking Artificial Intelligence: OpenAI's o3 Is Not AGI

It's an interesting paper that argues "new approaches are required that can reliably solve a wide variety of problems without existing skills."
"It is therefore hoped that the benchmark outlined in this article contributes to further exploration of this direction of research and incentivises the development of new AGI approaches that focus on intelligence rather than skills."

https://arxiv.org/abs/2501.07458
not-lain 
posted an update 8 days ago
we now have more than 2000 public AI models using ModelHubMixin🤗
prithivMLmods 
posted an update 8 days ago
ChemQwen-vL [ Qwen for Chem Vision ] 🧑🏻‍🔬

🧪Model : prithivMLmods/ChemQwen-vL

📝ChemQwen-vL is a vision-language model fine-tuned based on the Qwen2VL-2B Instruct model. It has been trained using the International Chemical Identifier (InChI) format for chemical compounds and is optimized for chemical compound identification. The model excels at generating the InChI and providing descriptions of chemical compounds based on their images. Its architecture operates within a multi-modal framework, combining image-text-text capabilities. It has been fine-tuned using datasets from: https://iupac.org/projects/

📒Colab Demo: https://tinyurl.com/2pn8x6u7, Collection : https://tinyurl.com/2mt5bjju

Inference with the documentation is possible with the help of the ReportLab library. https://pypi.org/project/reportlab/

🤗: @prithivMLmods
Tonic 
posted an update 9 days ago
🙋🏻‍♂️ Hey there folks,

Facebook AI just released JASCO models that make music stems.

You can try it out here: Tonic/audiocraft

Hope you like it
Tonic 
posted an update 11 days ago
🙋🏻‍♂️ Hey there folks, Open LLM Europe just released the Lucie 7B-Instruct model, a bilingual instruct model trained on open data! You can check out my unofficial demo here while we wait for the official inference API from the group: Tonic/Lucie-7B. Hope you like it 🚀
KnutJaegersberg 
posted an update 12 days ago
not-lain 
posted an update 13 days ago
Published a new blogpost 📖
In this blogpost I go through the transformer architecture, emphasizing how shapes propagate through each layer.
🔗 https://huggingface.co/blog/not-lain/tensor-dims
some interesting takeaways :
Sri-Vigneshwar-DJ 
posted an update 15 days ago
Check out phi-4 from Microsoft, dropped a day ago... If you ❤️ the Phi series, then here is the GGUF - Sri-Vigneshwar-DJ/phi-4-GGUF. phi-4 is a highly efficient 14B open LLM that beats much larger models at math and reasoning - check out evaluations on the Open LLM.

Technical paper - https://arxiv.org/pdf/2412.08905 ; The Data Synthesis approach is interesting
prithivMLmods 
posted an update 16 days ago
200+ f{🤗} on Stranger Zone! [ https://huggingface.co/strangerzonehf ]

❤️‍🔥Stranger Zone's MidJourney Mix Model Adapter is trending on the Very Model Page, with 45,000+ downloads. Additionally, the Super Realism Model Adapter has 52,000+ downloads. Together they remain the top two adapters on Stranger Zone!
strangerzonehf/Flux-Midjourney-Mix2-LoRA, strangerzonehf/Flux-Super-Realism-LoRA

👽Try Demo: prithivMLmods/FLUX-LoRA-DLC

📦Most Recent Adapters to Check Out :
+ Ctoon : strangerzonehf/Ctoon-Plus-Plus
+ Cardboard : strangerzonehf/Flux-Cardboard-Art-LoRA
+ Claude Art : strangerzonehf/Flux-Claude-Art
+ Flat Lay : strangerzonehf/Flux-FlatLay-LoRA
+ Smiley Portrait : strangerzonehf/Flux-Smiley-Portrait-LoRA

🤗Thanks for Community & OPEN SOURCEEE !!
Tonic 
posted an update 17 days ago
Microsoft just released Phi-4, check it out here: Tonic/Phi-4

hope you like it :-)
Sri-Vigneshwar-DJ 
posted an update 18 days ago
Just sharing a thought: I started using DeepSeek V3 a lot, and an idea struck me about agents "orchestrating during inference" on a test-time compute model like DeepSeek V3 or the O1 series.

Agents (Instruction + Function Calls + Memory) execute during inference, and based on their output, a decision is made to either scale the time spent reasoning or perform other tasks.
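
To make the idea concrete, here is a minimal sketch of such a loop in plain Python. Everything in it is an assumption for illustration: `toy_reason` stands in for a real model call, the `confidence` score and the budget-doubling rule are invented, and nothing here reflects how DeepSeek V3 or the O1 series actually work.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    answer: str
    confidence: float  # hypothetical self-reported score in [0, 1]


def toy_reason(task: str, budget: int) -> StepResult:
    """Stand-in for a model call: more budget yields higher confidence."""
    return StepResult(answer=f"{task} (budget={budget})",
                      confidence=min(1.0, 0.3 * budget))


def orchestrate(
    task: str,
    reason: Callable[[str, int], StepResult],
    threshold: float = 0.8,
    start_budget: int = 1,
    max_budget: int = 8,
) -> Tuple[StepResult, List[Tuple[int, StepResult]]]:
    """Re-run reasoning with a doubled budget until the output is confident enough."""
    budget = start_budget
    history: List[Tuple[int, StepResult]] = []  # the agent's "memory"
    while True:
        result = reason(task, budget)
        history.append((budget, result))
        # Decide from the output: stop, or scale the time spent reasoning.
        if result.confidence >= threshold or budget >= max_budget:
            return result, history
        budget *= 2


final, history = orchestrate("plan a trip", toy_reason)
print([b for b, _ in history], final.confidence)
```

The decision step is the single `if` inside the loop; in a real agent it could just as well trigger a function call or a different task instead of another reasoning pass.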
prithivMLmods 
posted an update 19 days ago
Reasoning SmolLM2 🚀

🎯Fine-tuning SmolLM2 on a lightweight synthetic reasoning dataset for reasoning-specific tasks. Future updates will focus on lightweight, blazing-fast reasoning models. Until then, check out the blog for fine-tuning details.

🔥Blog : https://huggingface.co/blog/prithivMLmods/smollm2-ft

🔼 Models :
+ SmolLM2-CoT-360M : prithivMLmods/SmolLM2-CoT-360M
+ Reasoning-SmolLM2-135M : prithivMLmods/Reasoning-SmolLM2-135M
+ SmolLM2-CoT-360M-GGUF : prithivMLmods/SmolLM2-CoT-360M-GGUF

🤠 Other Details :
+ Demo : prithivMLmods/SmolLM2-CoT-360M
+ Fine-tune notebook : prithivMLmods/SmolLM2-CoT-360M
Sri-Vigneshwar-DJ 
posted an update 20 days ago
Combining smolagents with Anthropic’s best practices simplifies building powerful AI agents:

1. Code-Based Agents: Write actions as Python code, reducing steps by 30%.
2. Prompt Chaining: Break tasks into sequential subtasks with validation gates.
3. Routing: Classify inputs and direct them to specialized handlers.
4. Fallback: Handle tasks even if classification fails.

https://huggingface.co/blog/Sri-Vigneshwar-DJ/building-effective-agents-with-anthropics-best-pra
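
Patterns 2 to 4 can be sketched in plain Python without any agent library. This is only an illustration under stated assumptions: the keyword-based `classify`, the handler names, and the `non_empty` gate are hypothetical stand-ins for LLM calls, and none of it is the smolagents API.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], str]


def classify(text: str) -> str:
    """Toy keyword classifier standing in for an LLM-based router."""
    lowered = text.lower()
    if "refund" in lowered or "invoice" in lowered:
        return "billing"
    if "error" in lowered or "crash" in lowered:
        return "technical"
    return "unknown"


def billing_handler(text: str) -> str:
    return f"[billing] {text}"


def technical_handler(text: str) -> str:
    return f"[technical] {text}"


def fallback_handler(text: str) -> str:
    return f"[general] {text}"


HANDLERS: Dict[str, Handler] = {
    "billing": billing_handler,
    "technical": technical_handler,
}


def route(text: str) -> str:
    """Routing with fallback: unknown labels go to the general handler."""
    return HANDLERS.get(classify(text), fallback_handler)(text)


def normalize(text: str) -> str:
    return text.strip().lower()


def non_empty(text: str) -> bool:
    return len(text) > 0


def run_chain(text: str, steps: List[Handler], gate: Callable[[str], bool]) -> str:
    """Prompt chaining: run subtasks in order, validating after each one."""
    result = text
    for step in steps:
        result = step(result)
        if not gate(result):
            raise ValueError(f"validation gate rejected output of {step.__name__}")
    return result


print(run_chain("  My app shows an ERROR  ", [normalize, route], non_empty))
print(route("hello there"))
```

The gate runs after every step, so a bad intermediate result stops the chain early instead of being passed on to the next subtask.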
prithivMLmods 
posted an update 24 days ago
Triangulum Catalogued 🔥💫

🎯Triangulum is a collection of pretrained and instruction-tuned generative models, designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.

+ Triangulum-10B : prithivMLmods/Triangulum-10B
+ Quants : prithivMLmods/Triangulum-10B-GGUF

+ Triangulum-5B : prithivMLmods/Triangulum-5B
+ Quants : prithivMLmods/Triangulum-5B-GGUF

+ Triangulum-1B : prithivMLmods/Triangulum-1B
+ Quants : prithivMLmods/Triangulum-1B-GGUF