I'm helping out on some community research to learn about the AI community. If you want to join in the conversation, head over here where I started a community discussion on the most influential model since BERT.
📣 Teachers and Students! Here's a handy quiz app if you're preparing your own study material.
TL;DR: it's a quiz app that uses a dataset to generate questions and save answers.
Here's how it works:
- make a dataset of multiple choice questions
- duplicate the space and set the dataset repo
- log in and do the quiz
- submit the questions to create a new dataset
I made this to get ready for the agents course, but I hope it's useful for your projects too!
I just released Sentence Transformers v3.4.0, featuring a memory leak fix, compatibility between the powerful Cached... losses and the Matryoshka loss modifier, and a bunch of fixes & small features.
🪆 Matryoshka & Cached loss compatibility
It is now possible to combine the powerful Cached... losses (which use in-batch negatives and a caching mechanism to allow for arbitrarily large batch sizes and numbers of negatives) with the Matryoshka loss modifier, which modifies a base loss so that the model is trained not only on the maximum dimensionality (e.g. 1024 dimensions), but also on many lower dimensions (e.g. 768, 512, 256, 128, 64, 32). After training, these models' embeddings can be truncated for faster retrieval, etc.
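To make the "truncate after training" idea concrete, here is a minimal pure-Python sketch. It does not use the Sentence Transformers API; the helper names and the toy 4-dimensional vectors are made up for illustration, and real Matryoshka embeddings would be much longer.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    truncated = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in truncated]


def cosine_similarity(a, b):
    """Dot product of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


# Toy "embeddings" (hypothetical values, not produced by a real model).
query = [3.0, 4.0, 0.5, -0.5]
document = [3.0, 4.0, -0.5, 0.5]

# Compare at full size and at a truncated size.
full_sim = cosine_similarity(
    truncate_and_normalize(query, 4), truncate_and_normalize(document, 4)
)
small_sim = cosine_similarity(
    truncate_and_normalize(query, 2), truncate_and_normalize(document, 2)
)
print(f"4 dims: {full_sim:.3f}, 2 dims: {small_sim:.3f}")
```

The truncated vectors are cheaper to store and compare. Whether the similarity ranking stays useful at the smaller size depends on the model having been trained on that dimension, which is what the Matryoshka loss modifier is for.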
🎞️ Resolve memory leak when Model and Trainer are reinitialized
Due to a circular dependency (Trainer -> Model -> ModelCardData -> Trainer), deleting both the trainer and the model still didn't free up the memory. This led to a memory leak in scripts that repeatedly reinitialize them.
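For anyone curious why a reference cycle keeps objects alive after `del`, here is a small standalone sketch in CPython. The classes are hypothetical stand-ins, not the library's real ones, and the weak reference is just one common way to break such a cycle; the post does not say how the library actually fixed it.

```python
import gc
import weakref


class CardData:
    def __init__(self):
        self.trainer = None  # back-reference, filled in by Trainer


class Model:
    def __init__(self):
        self.card_data = CardData()


class Trainer:
    def __init__(self, model, use_weakref=False):
        self.model = model
        # This back-reference closes the cycle Trainer -> Model -> CardData -> Trainer.
        # A weak reference does not keep the trainer alive, so no cycle is formed.
        model.card_data.trainer = weakref.ref(self) if use_weakref else self


def freed_without_gc(use_weakref):
    """Return True if `del` alone frees the model while the cycle collector is off."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        model = Model()
        trainer = Trainer(model, use_weakref=use_weakref)
        probe = weakref.ref(model)  # lets us check whether the model still exists
        del trainer, model
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up any cycle left behind by this demo


print("strong back-reference freed:", freed_without_gc(use_weakref=False))
print("weak back-reference freed:", freed_without_gc(use_weakref=True))
```

With the strong back-reference, reference counting alone never drops the objects to zero, so they stay around until the cycle collector runs. With large models that delay can be enough to run out of memory.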
➕ New Features
Many new small features, e.g. multi-GPU support for 'mine_hard_negatives', a 'margin' parameter for TripletEvaluator, and the Matthews Correlation Coefficient in the BinaryClassificationEvaluator.
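If the Matthews Correlation Coefficient is new to you, it is computed from the four confusion-matrix counts. Below is a minimal sketch of the standard formula; the function name `mcc` is made up here, and this is not the evaluator's own implementation. Returning 0.0 when the denominator is zero is a common convention that this sketch assumes.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an undefined score is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


print(mcc(tp=5, tn=5, fp=0, fn=0))  # perfect predictions
print(mcc(tp=1, tn=1, fp=1, fn=1))  # no better than chance
print(mcc(tp=0, tn=0, fp=5, fn=5))  # every prediction wrong
```

The score ranges from -1 to 1 and, unlike plain accuracy, stays informative when the two classes are imbalanced.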
🐛 Bug Fixes
Also a bunch of fixes, for example for a bug where subsequent batches were not sorted when using the "no_duplicates" batch sampler. See the release notes for more details.
We’re thrilled to share 𝗦𝗺𝗼𝗹𝗩𝗟𝗠 (256M & 500M)—the smallest Visual Language Models ever built. Think: running on <1GB of GPU memory—you can fine-tune it on your laptop and run it on your toaster!
Why It’s Game-Changing:
- 𝗢𝘂𝘁𝗽𝗲𝗿𝗳𝗼𝗿𝗺𝘀 𝗟𝗮𝗿𝗴𝗲𝗿 𝗠𝗼𝗱𝗲𝗹𝘀: Even the 256M model surpasses our SOTA 80B-parameter model from just 17 months ago. That is an over 300x reduction in parameter count!
- 𝗠𝗶𝗴𝗵𝘁𝘆 𝗘𝗳𝗳𝗶𝗰𝗶𝗲𝗻𝗰𝘆: The 256M version delivers 80% of our 2.2B model’s performance, and the 500M version hits 90%.
- 𝗟𝗶𝗴𝗵𝘁𝗻𝗶𝗻𝗴-𝗙𝗮𝘀𝘁 𝗦𝗲𝗮𝗿𝗰𝗵: SmolVLM integrates with ColPali for state-of-the-art retrieval speeds, on par with models 10x bigger. That means cheaper, faster indexing and real-world impact.
What’s New Under the Hood:
- 𝗡𝗲𝘄 𝗩𝗶𝘀𝗶𝗼𝗻 𝗘𝗻𝗰𝗼𝗱𝗲𝗿: Smaller overall size (400M -> 93M), but with higher resolution.
- 𝗛𝗶𝗴𝗵𝗲𝗿 𝗣𝗶𝘅𝗲𝗹𝘀/𝗧𝗼𝗸𝗲𝗻: 4096 vs. 1820, for more efficient image processing.
- 𝗦𝗺𝗮𝗿𝘁 𝗧𝗼𝗸𝗲𝗻𝗶𝘇𝗮𝘁𝗶𝗼𝗻: Faster training and a performance boost.
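A quick back-of-the-envelope check of the pixels-per-token figures. The patch sizes and token counts below are assumptions that are not stated in the post: a 384x384 patch encoded as 81 tokens for the older setup, and a 512x512 patch encoded as 64 tokens for the new one.

```python
def pixels_per_token(patch_side_px: int, tokens_per_patch: int) -> float:
    """Pixels covered by one token for a square image patch."""
    return patch_side_px ** 2 / tokens_per_patch


# Assumed figures (not from the post): patch side length and tokens per patch.
older = pixels_per_token(patch_side_px=384, tokens_per_patch=81)
newer = pixels_per_token(patch_side_px=512, tokens_per_patch=64)

print(round(older), round(newer))  # prints: 1820 4096
```

Under those assumptions, each token covers a bit more than twice as many pixels, so the same image needs fewer tokens.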
🆕 Hugging Face notification page is getting better! You can now sort your notifications by repositories, posts, papers, and articles. Let us know what you think 💬
You can now use the Synthetic Data Generator with your own domain-specific seed data to generate a dataset for fine-tuning retrieval or reranking models.