The Synthetic Data Generator now directly integrates with Argilla, so you can generate and curate your own high-quality datasets from pure natural language!
Up next -> include dataset generation for text classification. Other suggestions? Let us know.
You can now build a custom text classifier without days of human labeling!
LLMs work reasonably well as text classifiers, but they are expensive to run at scale and their performance drops in specialized domains.
Purpose-built classifiers have low latency and can potentially run on CPU, but they require labeled training data.
Combine the best of both worlds: the automatic labeling capabilities of LLMs and the high-quality annotations from human experts to train and deploy a specialized model.
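As a rough illustration of that combination, here is a minimal sketch in plain Python. It assumes the LLM returns a label plus a confidence score, and that low-confidence records go to a human review queue. The `Record` class, the `0.8` threshold and the sample texts are hypothetical and are not part of Argilla or any other library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    text: str
    llm_label: str                     # label suggested by the LLM
    llm_score: float                   # hypothetical confidence in [0, 1]
    human_label: Optional[str] = None  # filled in when an annotator reviews it


def needs_review(record: Record, threshold: float = 0.8) -> bool:
    """Route low-confidence suggestions to a human annotator."""
    return record.llm_score < threshold


def final_label(record: Record) -> str:
    """A human annotation, when present, overrides the LLM suggestion."""
    return record.human_label if record.human_label is not None else record.llm_label


records = [
    Record("Refund still not received", "billing", 0.95),
    Record("App crashes on login", "billing", 0.55),
    Record("How do I export my data?", "how-to", 0.62),
]

# Simulated human pass: the annotator fixes one suggestion and confirms the other.
corrections = {"App crashes on login": "bug"}
for r in records:
    if needs_review(r):
        r.human_label = corrections.get(r.text, r.llm_label)

# (text, label) pairs that could be used to train a small specialized classifier.
training_set = [(r.text, final_label(r)) for r in records]
print(training_set)
```

The resulting pairs are what you would hand to a lightweight model for training. Only the records the LLM was unsure about cost any human time.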
Import any dataset from the Hub and configure your labeling tasks without needing any code!
Really excited about extending the Hugging Face Hub integration with many more streamlined features and workflows. We would love to hear your feedback and ideas, so don't be shy and reach out!
By far the coolest release of the day! The Open LLM Leaderboard, the most comprehensive suite for comparing open LLMs on many benchmarks, just released a comparator tool that lets you dig into the details of the differences between any models.
Here's me checking how the new Llama-3.1-Nemotron-70B that we've heard so much about compares to the original Llama-3.1-70B.
On Thursday 10 October at 17:00 CEST, I will show a good way to get started with a text classification project on the Hugging Face Hub with Argilla and SetFit.
Why is argilla/FinePersonas-v0.1 great for synthetic data generation? It can be used to synthesise realistic and diverse data of the customer personas your company is interested in!
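To make the persona idea concrete, here is a minimal sketch of persona-conditioned prompt building in plain Python. The personas, tasks and template below are hypothetical placeholders and are not taken from the dataset. In practice the persona strings would come from sampled dataset rows, and each prompt would be sent to an LLM of your choice.

```python
from itertools import product

# Hypothetical personas; real ones would be sampled from a persona dataset.
personas = [
    "A small-business owner who handles their own bookkeeping",
    "A university student on a tight budget",
]

# Hypothetical tasks that describe the kind of text we want to synthesise.
tasks = ["ask about a late delivery", "request a refund"]

TEMPLATE = (
    "You are the following person: {persona}.\n"
    "Write a short customer-support message in which you {task}."
)


def build_prompts(personas, tasks):
    """Cross every persona with every task to get diverse generation prompts."""
    return [TEMPLATE.format(persona=p, task=t) for p, t in product(personas, tasks)]


prompts = build_prompts(personas, tasks)
for prompt in prompts:
    print(prompt, end="\n\n")
```

Crossing personas with tasks is what drives the diversity: the same task is phrased from many different points of view.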
We've got a number of great community meetups coming up again where we'll be discussing the basics of getting started and using Argilla for TextCat, TokenCat/NER and RAG. We will walk you through common scenarios and everything you might need to know to get your projects started.
The first meetup coming up: Setting up a text classification project using Argilla and SetFit!
- Deploy Argilla on Spaces
- Vibe check your dataset
- Configure and create an Argilla dataset
- Add records
- Add zero-shot suggestions
- Evaluate model suggestions in Argilla
- Train a SetFit model
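For the "evaluate model suggestions" step, one simple check is how often the zero-shot suggestion matches the label a human finally picked. The sketch below is plain Python and does not use the Argilla SDK. The `evaluate_suggestions` helper and the sample pairs are hypothetical, and it assumes you have already exported (suggested label, human label) pairs.

```python
from collections import Counter


def evaluate_suggestions(pairs):
    """Return (agreement rate, counts of disagreeing (suggested, human) pairs)."""
    pairs = list(pairs)
    total = len(pairs)
    correct = sum(1 for suggested, human in pairs if suggested == human)
    confusions = Counter((s, h) for s, h in pairs if s != h)
    accuracy = correct / total if total else 0.0
    return accuracy, confusions


# Hypothetical export: (zero-shot suggestion, human label) per record.
pairs = [
    ("positive", "positive"),
    ("negative", "negative"),
    ("positive", "negative"),
    ("neutral", "neutral"),
]

accuracy, confusions = evaluate_suggestions(pairs)
print(f"agreement: {accuracy:.2f}")
print(confusions.most_common(3))
```

If agreement is already high, the suggestions can speed up annotation a lot. If a specific pair of labels keeps getting confused, that is usually a sign the label definitions need tightening before training.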
Hope to see all of you guys there and looking forward to your questions and AI use cases. Don't be shy about bringing your own issues and questions to the table. We would love to answer them.
This is supercool!! LLaVA-3D adds 3D-awareness to LVMs without compromising 2D understanding capabilities.
Method: they developed a unified architecture that maps 2D CLIP patch features to their corresponding positions in 3D space, enabling joint 2D and 3D vision-language instruction tuning.
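To give a feel for what "mapping 2D patches to 3D positions" can mean geometrically, here is a minimal sketch in plain Python. It assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`) and one depth value per patch. This is a generic back-projection and not the LLaVA-3D implementation; the function names and toy numbers are hypothetical.

```python
def patch_centers(image_w, image_h, patch):
    """Pixel coordinates (u, v) of each patch centre, in row-major order."""
    return [
        (col * patch + patch / 2, row * patch + patch / 2)
        for row in range(image_h // patch)
        for col in range(image_w // patch)
    ]


def back_project(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at a given depth into camera space."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Toy setup: a 4x4 image split into 2x2 patches, one depth value per patch.
centers = patch_centers(image_w=4, image_h=4, patch=2)
depths = [2.0, 2.0, 4.0, 4.0]

positions = [
    back_project(u, v, d, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
    for (u, v), d in zip(centers, depths)
]

# Each patch feature could then be paired with its 3D position,
# for example through a learned position embedding.
for center, position in zip(centers, positions):
    print(center, "->", position)
```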