M4-ai

AI & ML interests

Small LMs for small computers

Recent Activity

mmhamdy authored a paper 25 days ago

Bridging the Data Provenance Gap Across Text, Speech and Video

mmhamdy authored a paper about 2 months ago

Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

Locutusque updated a model 3 months ago

M4-ai/TinyMistral-248M-v3

M4-ai's activity

KnutJaegersberg

posted an update about 20 hours ago

Post

804

DeepSeek R1 on how to build conscious AGI

https://huggingface.co/blog/KnutJaegersberg/deepseek-r1-on-conscious-agi

3 replies

AtAndDev

posted an update 2 days ago

Post

389

Deepseek gang on fire fr fr

KnutJaegersberg

posted an update 4 days ago

Post

443

Yet another blog post about general intelligence

https://huggingface.co/blog/KnutJaegersberg/general-intelligence

prithivMLmods

posted an update 4 days ago

Post

2788

Q'n' Sketches ❤️‍🔥

🖼️ Adapters:
- Qs : strangerzonehf/Qs-Sketch
- Qd : strangerzonehf/Qd-Sketch
- Qx : strangerzonehf/Qx-Art
- Qc : strangerzonehf/Qc-Sketch
- Bb : strangerzonehf/Bg-Bag

🐍 Collection : strangerzonehf/q-series-sketch-678e3503bf3a661758429717

🔗Page : https://huggingface.co/strangerzonehf

@prithivMLmods 🤗

AtAndDev

posted an update 5 days ago

Post

1531

R1 is out! And with a lot of other R1 related models...

KnutJaegersberg

posted an update 5 days ago

Post

1730

Understanding and Benchmarking Artificial Intelligence: OpenAI's o3 Is Not AGI

It's an interesting paper that argues "new approaches are required that can reliably solve a wide variety of problems without existing skills."
"It is therefore hoped that the benchmark outlined in this article contributes to further exploration of this direction of research and incentivises the development of new AGI approaches that focus on intelligence rather than skills."

https://arxiv.org/abs/2501.07458

not-lain

posted an update 8 days ago

Post

988

we now have more than 2000 public AI models using ModelHubMixin🤗

prithivMLmods

posted an update 8 days ago

Post

2484

ChemQwen-vL [ Qwen for Chem Vision ] 🧑🏻‍🔬

🧪Model : prithivMLmods/ChemQwen-vL

📝ChemQwen-vL is a vision-language model fine-tuned from the Qwen2VL-2B Instruct model. It was trained using the International Chemical Identifier (InChI) format for chemical compounds and is optimized for chemical compound identification. Given an image of a chemical compound, the model generates its InChI and a description of the compound. Its architecture operates within a multi-modal framework with image-text-to-text capabilities. It has been fine-tuned using datasets from: https://iupac.org/projects/

📒Colab Demo: https://tinyurl.com/2pn8x6u7, Collection : https://tinyurl.com/2mt5bjju

Inference with the documentation is possible with the help of the ReportLab library. https://pypi.org/project/reportlab/

🤗: @prithivMLmods

1 reply

Tonic

posted an update 9 days ago

Post

1419

🙋🏻‍♂️ Hey there folks,

Facebook AI just released JASCO models that make music stems.

You can try it out here: Tonic/audiocraft

hope you like it

Tonic

posted an update 11 days ago

Post

2347

🙋🏻‍♂️Hey there folks, Open LLM Europe just released the Lucie 7B-Instruct model, a bilingual instruct model trained on open data! You can check out my unofficial demo here while we wait for the official inference API from the group: Tonic/Lucie-7B. Hope you like it 🚀

KnutJaegersberg

posted an update 12 days ago

Post

614

prithivMLmods/Phi-4-o1


not-lain

posted an update 13 days ago

Post

3817

Published a new blog post 📖
In this blog post I go through the transformer architecture, emphasizing how shapes propagate through each layer.
🔗 https://huggingface.co/blog/not-lain/tensor-dims
some interesting takeaways :
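As a rough illustration of the kind of shape propagation the post describes, the sketch below tracks tensor shapes through one standard multi-head self-attention block using plain tuples. It is not taken from the linked blog post; the function name `attention_shapes` and the example sizes are assumptions made for illustration.

```python
# Illustrative only: tracks tensor shapes through one self-attention block
# using plain tuples, so no deep learning library is needed.

def attention_shapes(batch, seq_len, d_model, n_heads):
    """Return the shape of each intermediate tensor in multi-head self-attention."""
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    d_head = d_model // n_heads

    shapes = {}
    shapes["input"] = (batch, seq_len, d_model)
    # Linear projections keep the shape: (B, T, D) @ (D, D) -> (B, T, D)
    shapes["q_k_v"] = (batch, seq_len, d_model)
    # Split D into heads and move the head axis forward.
    shapes["q_k_v_heads"] = (batch, n_heads, seq_len, d_head)
    # Q @ K^T contracts d_head, giving one score per pair of positions.
    shapes["scores"] = (batch, n_heads, seq_len, seq_len)
    # softmax(scores) @ V contracts the second seq_len axis.
    shapes["context"] = (batch, n_heads, seq_len, d_head)
    # Merge heads back; the output projection keeps (B, T, D).
    shapes["output"] = (batch, seq_len, d_model)
    return shapes


if __name__ == "__main__":
    example = attention_shapes(batch=2, seq_len=16, d_model=64, n_heads=8)
    for name, shape in example.items():
        print(f"{name:12s} {shape}")
```

The block's input and output shapes match, which is what allows such layers to be stacked with residual connections.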

Sri-Vigneshwar-DJ

posted an update 15 days ago

Post

634

Check out phi-4 from Microsoft, released a day ago... If you ❤️ the Phi series, then here is the GGUF - Sri-Vigneshwar-DJ/phi-4-GGUF. phi-4 is a highly efficient 14B open LLM that beats much larger models at math and reasoning - check out evaluations on the Open LLM.

Technical paper - https://arxiv.org/pdf/2412.08905 ; The Data Synthesis approach is interesting

prithivMLmods

posted an update 16 days ago

Post

3333

200+ f{🤗} on Stranger Zone! [ https://huggingface.co/strangerzonehf ]

❤️‍🔥Stranger Zone's MidJourney Mix Model Adapter is trending on the Very Model Page, with over 45,000 downloads. Additionally, the Super Realism Model Adapter has over 52,000 downloads. These two remain the top two adapters on Stranger Zone!
strangerzonehf/Flux-Midjourney-Mix2-LoRA, strangerzonehf/Flux-Super-Realism-LoRA

👽Try Demo: prithivMLmods/FLUX-LoRA-DLC

📦Most Recent Adapters to Check Out :
+ Ctoon : strangerzonehf/Ctoon-Plus-Plus
+ Cardboard : strangerzonehf/Flux-Cardboard-Art-LoRA
+ Claude Art : strangerzonehf/Flux-Claude-Art
+ Flat Lay : strangerzonehf/Flux-FlatLay-LoRA
+ Smiley Portrait : strangerzonehf/Flux-Smiley-Portrait-LoRA

🤗Thanks for Community & OPEN SOURCEEE !!

6 replies

Tonic

posted an update 17 days ago

Post

1661

Microsoft just released Phi-4, check it out here: Tonic/Phi-4

hope you like it :-)

Sri-Vigneshwar-DJ

posted an update 18 days ago

Post

2050

Just sharing a thought: I started using DeepSeek V3 a lot, and an idea struck me about agents "orchestrating during inference" on a test-time compute model like DeepSeek V3 or the O1 series.

Agents (Instruction + Function Calls + Memory) execute during inference, and based on their output, a decision is made whether to scale the time spent reasoning or to perform other tasks.

prithivMLmods

posted an update 19 days ago

Post

5883

Reasoning SmolLM2 🚀

🎯Fine-tuning SmolLM2 on a lightweight synthetic reasoning dataset for reasoning-specific tasks. Future updates will focus on lightweight, blazing-fast reasoning models. Until then, check out the blog for fine-tuning details.

🔥Blog : https://huggingface.co/blog/prithivMLmods/smollm2-ft

🔼 Models :
+ SmolLM2-CoT-360M : prithivMLmods/SmolLM2-CoT-360M
+ Reasoning-SmolLM2-135M : prithivMLmods/Reasoning-SmolLM2-135M
+ SmolLM2-CoT-360M-GGUF : prithivMLmods/SmolLM2-CoT-360M-GGUF

🤠 Other Details :
+ Demo : prithivMLmods/SmolLM2-CoT-360M
+ Fine-tune notebook : prithivMLmods/SmolLM2-CoT-360M

Sri-Vigneshwar-DJ

posted an update 20 days ago

Post

2338

Combining smolagents with Anthropic’s best practices simplifies building powerful AI agents:

1. Code-Based Agents: Write actions as Python code, reducing steps by 30%.
2. Prompt Chaining: Break tasks into sequential subtasks with validation gates.
3. Routing: Classify inputs and direct them to specialized handlers.
4. Fallback: Handle tasks even if classification fails.
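The routing, fallback, and prompt-chaining patterns in the list above can be sketched in plain Python. This is a minimal illustration and does not use the smolagents API or any code from the linked post: the keyword classifier, the handler functions, and the `route` and `chain` helpers are all hypothetical stand-ins for model or agent calls.

```python
# Hypothetical sketch: plain Python callables stand in for LLM or agent calls.

def classify(text):
    """Toy keyword classifier; a real system would likely ask a model."""
    lowered = text.lower()
    if "refund" in lowered:
        return "billing"
    if "error" in lowered or "crash" in lowered:
        return "technical"
    return None  # classification failed


def handle_billing(text):
    return f"[billing] {text}"


def handle_technical(text):
    return f"[technical] {text}"


def handle_fallback(text):
    return f"[general] {text}"


HANDLERS = {"billing": handle_billing, "technical": handle_technical}


def route(text):
    """Routing with fallback: unknown or missing labels go to a general handler."""
    handler = HANDLERS.get(classify(text), handle_fallback)
    return handler(text)


def chain(text, steps):
    """Prompt chaining: run (step, gate) pairs in order, stopping if a gate rejects."""
    result = text
    for step, gate in steps:
        result = step(result)
        if not gate(result):
            raise ValueError(f"validation gate failed after {step.__name__}")
    return result


if __name__ == "__main__":
    print(route("I want a refund"))
    print(route("Hello there"))
    print(chain("  some draft  ", [(str.strip, bool), (str.upper, str.isupper)]))
```

Keeping the fallback as the default of the handler lookup means an input is always answered, even when the classifier returns no label.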

https://huggingface.co/blog/Sri-Vigneshwar-DJ/building-effective-agents-with-anthropics-best-pra

prithivMLmods

posted an update 24 days ago

Post

3860

Triangulum Catalogued 🔥💫

🎯Triangulum is a collection of pretrained and instruction-tuned generative models, designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.

+ Triangulum-10B : prithivMLmods/Triangulum-10B
+ Quants : prithivMLmods/Triangulum-10B-GGUF

+ Triangulum-5B : prithivMLmods/Triangulum-5B
+ Quants : prithivMLmods/Triangulum-5B-GGUF

+ Triangulum-1B : prithivMLmods/Triangulum-1B
+ Quants : prithivMLmods/Triangulum-1B-GGUF

4 replies

mmhamdy

authored a paper 25 days ago

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 8

Team members 37
