Marvin73 (Fabio Ferrua )

liked a model 11 days ago

mistralai/Mistral-Small-24B-Instruct-2501

Text Generation • Updated 9 days ago • 263k • • 691

liked a model about 1 month ago

microsoft/phi-4

Text Generation • Updated 7 days ago • 572k • 1.71k

upvoted an article 4 months ago

Article

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

By

•

Jul 29, 2024

• 273

reacted to anakin87's post with 👍 5 months ago

Post

1747

🕵🏻 𝐀𝐠𝐞𝐧𝐭𝐢𝐜 𝐑𝐀𝐆 𝐰𝐢𝐭𝐡 🦙 𝐋𝐥𝐚𝐦𝐚 3.2

I was excited to explore Llama 3.2, but as a simple 🇪🇺 EU guy, I don't have access to Meta's multimodal models 😿

🤔 So I thought: why not challenge the small 3B text model with Agentic RAG?

🎯 The plan:
- Build a system that tries to answer questions using a knowledge base.
- If the documents don't contain the answer, use Web search for additional context.

Check out my experimental notebook here: 📓 https://colab.research.google.com/github/deepset-ai/haystack-cookbook/blob/main/notebooks/llama32_agentic_rag.ipynb

My stack:
🏗️ haystack (https://haystack.deepset.ai/): open-source LLM orchestration framework
🦙 meta-llama/Llama-3.2-3B-Instruct
🦆🌐 free DuckDuckGo API, integrated with Haystack

✨ 𝘛𝘩𝘦 𝘳𝘦𝘴𝘶𝘭𝘵𝘴? 𝘌𝘯𝘤𝘰𝘶𝘳𝘢𝘨𝘪𝘯𝘨 - 𝘢 𝘧𝘦𝘸 𝘮𝘰𝘯𝘵𝘩𝘴 𝘢𝘨𝘰, 𝘵𝘩𝘪𝘴 𝘭𝘦𝘷𝘦𝘭 𝘰𝘧 𝘱𝘦𝘳𝘧𝘰𝘳𝘮𝘢𝘯𝘤𝘦 𝘧𝘳𝘰𝘮 𝘢 𝘴𝘮𝘢𝘭𝘭 𝘮𝘰𝘥𝘦𝘭 𝘸𝘰𝘶𝘭𝘥'𝘷𝘦 𝘣𝘦𝘦𝘯 𝘶𝘯𝘵𝘩𝘪𝘯𝘬𝘢𝘣𝘭𝘦!
This probably reflects the impressive IFEval score of the model (comparable to Llama 3.1 8B).

liked 3 models 5 months ago

reacted to Xenova's post with 🔥 6 months ago

Post

15038

I'm excited to announce that Transformers.js V3 is finally available on NPM! 🔥 State-of-the-art Machine Learning for the web, now with WebGPU support! 🤯⚡️

Install it from NPM with:
𝚗𝚙𝚖 𝚒 @𝚑𝚞𝚐𝚐𝚒𝚗𝚐𝚏𝚊𝚌𝚎/𝚝𝚛𝚊𝚗𝚜𝚏𝚘𝚛𝚖𝚎𝚛𝚜

or via CDN, for example: https://v2.scrimba.com/s0lmm0qh1q

Segment Anything demo: webml-community/segment-anything-webgpu

5 replies

·

liked a model 6 months ago

DeepMount00/Llama-3.1-8b-ITA

Text Generation • Updated Oct 29, 2024 • 10.9k • 6

liked 2 models 7 months ago

nickprock/sentence-bert-base-italian-xxl-uncased

mistralai/Mathstral-7B-v0.1

Text Generation • Updated Jul 31, 2024 • 16.4k • 217

updated a model 7 months ago

Marvin73/multilingual-e5-large-GGUF

Updated Jul 10, 2024

reacted to mrm8488's post with ❤️ 7 months ago

Post

5153

🚨Exciting news for the Multilingual Synthetic Data Community!🚨

I’ve taken inspiration from the MAGPIE paper on Llama-3-8B-instruct and extended its capabilities. Here’s what’s new!

🗞 The MAGPIE paper showcased that if you use the instruction-tuned version (Llama-3-8B-instruct) to generate synthetic instructions and then fine-tune the base version (Llama-3-8B) on this dataset, you can improve even the it-tuned version

🤔 While reading a script by Sebastian Raschka, PhD, I wondered: Could these advancements be replicated in other languages? Specifically, could they benefit non-English datasets?

🎉 And the answer is YES! At least for Spanish. I've successfully adapted the techniques for Spanish, proving the model's flexibility and multilingual capabilities.

👩‍💻 To make this accessible, I created a basic script (heavily inspired by the Sebastian Raschka one) that allows you to generate similar datasets using ollama models (initially phi and llama3) automatically and upload it to the Hugging Face Hub!
[Script](https://gist.github.com/mrm8488/4650a5e3cc45523798a527a3446eb312)

🔍 Explore the datasets 📚 generated using our new script!

- [Llama-3-8B](https://huggingface.co/datasets/mrm8488/dataset_llama3_5000_samples_es_4231_filtered)
- [Phi-3-medium](https://huggingface.co/datasets/mrm8488/dataset_phi3-medium_5000_samples_es_3906_filtered)
- [Phi-3-mini](https://huggingface.co/datasets/mrm8488/dataset_phi3_5000_samples_es_3282_filtered)

Note: These datasets have basic filtering. Apply additional quality filters before using them to fine-tune large language models.

Inspiration and base script:
https://github.com/rasbt/LLMs-from-scratch/blob/main/ch07/05_dataset-generation/llama3-ollama.ipynb
https://www.linkedin.com/feed/update/urn:li:activity:7210982019751661568/

7 replies

·

reacted to anakin87's post with 🔥 8 months ago

Post

943

🧪 RAG Evaluation with 🔥 Prometheus 2 + Haystack

📝 Blog post: https://haystack.deepset.ai/blog/rag-evaluation-with-prometheus-2
📓 Notebook: https://github.com/deepset-ai/haystack-cookbook/blob/main/notebooks/prometheus2_evaluation.ipynb

─── ⋆⋅☆⋅⋆ ───

When evaluating LLMs' responses, 𝐩𝐫𝐨𝐩𝐫𝐢𝐞𝐭𝐚𝐫𝐲 𝐦𝐨𝐝𝐞𝐥𝐬 like GPT-4 are commonly used due to their strong performance.
However, relying on closed models presents challenges related to data privacy 🔒, transparency, controllability, and cost 💸.

On the other hand, 𝐨𝐩𝐞𝐧 𝐦𝐨𝐝𝐞𝐥𝐬 typically do not correlate well with human judgments and lack flexibility.

🔥 Prometheus 2 is a new family of open-source models designed to address these gaps:
🔹 two variants: prometheus-eval/prometheus-7b-v2.0; prometheus-eval/prometheus-8x7b-v2.0
🔹 trained on open-source data
🔹 high correlation with human evaluations and proprietary models
🔹 highly flexible: capable of performing direct assessments and pairwise rankings, and allowing the definition of custom evaluation criteria.

See my experiments with RAG evaluation in the links above.