Training with Prompts Collection See the Training with Prompts documentation for more details: https://sbert.net/examples/training/prompts/README.html • 5 items • Updated 5 days ago • 1
Article Releasing Common Corpus: the largest public domain dataset for training LLMs By Pclanglais • Mar 20 • 17
Model2Vec base models Collection These are the MinishLab Model2Vec base models. Load them and use them with model2vec (https://github.com/MinishLab/model2vec) or sentence-transformers • 7 items • Updated 14 days ago • 8
POTION Collection These are the flagship POTION models. Load them and use them with model2vec (https://github.com/MinishLab/model2vec) or sentence-transformers • 3 items • Updated 14 days ago • 6
Article Releasing Outlines-core 0.1.0: structured generation in Rust and Python 22 days ago • 39
Article Transformers.js v3: WebGPU support, new models & tasks, and more… 22 days ago • 58
AutoTrain: No-code training for state-of-the-art models Paper • 2410.15735 • Published 23 days ago • 56
Granite 3.0 Language Models Collection A series of language models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 8 items • Updated 8 days ago • 86
MedEmbed: Embedding Models for Medical Domain Collection GitHub -> https://github.com/abhinand5/MedEmbed • 4 items • Updated 22 days ago • 7
Article MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical IR By abhinand • 23 days ago • 30
Starbucks: Improved Training for 2D Matryoshka Embeddings Paper • 2410.13230 • Published 27 days ago • 9
Article How to build a custom text classifier without days of human labeling By sdiazlor • 26 days ago • 54
Article How to optimize your data labelling project with custom interfaces By burtenshaw • 28 days ago • 18