Commit 08b43db · update title
Parent(s): d75924a
examples/fine-tune-modernbert-classifier.ipynb
CHANGED
@@ -4,13 +4,12 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-    "# Fine
+    "# Fine-tune ModernBERT for text classification using synthetic data\n",
     "\n",
     "LLMs are great general purpose models, but they are not always the best choice for a specific task. Therefore, smaller and more specialized models are important for sustainable, efficient, and cheaper AI.\n",
+    "A lack of domain specific datasets is a common problem for smaller and more specialized models. This is because it is difficult to find a dataset that is both representative and diverse enough for a specific task. We solve this problem by generating a synthetic dataset from an LLM using the `synthetic-data-generator`, which is available as a [Hugging Face Space](https://huggingface.co/spaces/argilla/synthetic-data-generator) or on [GitHub](https://github.com/argilla-io/synthetic-data-generator).\n",
     "\n",
-    "
-    "\n",
-    "In this example, we will finetune a ModernBERT model on a synthetic dataset generated from the synthetic-data-generator. Showing the effectiveness of synthetic data and the novel ModernBERT model, which is new and improved version of BERT models, with 8192 token context length, significantly better downstream performance, and much faster processing speeds.\n",
+    "In this example, we will fine-tune a ModernBERT model on a synthetic dataset generated from the synthetic-data-generator. This demonstrates the effectiveness of synthetic data and the novel ModernBERT model, which is a new and improved version of BERT models, with an 8192 token context length, significantly better downstream performance, and much faster processing speeds.\n",
     "\n",
     "## Install the dependencies"
    ]
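For context, the workflow this updated intro cell describes roughly corresponds to the sketch below. It is a minimal illustration, not code from the notebook: the dataset repo id and label count are hypothetical placeholders, `answerdotai/ModernBERT-base` is assumed as the base checkpoint, and a recent `transformers` release with ModernBERT support is assumed to be installed.

```python
# Hedged sketch of fine-tuning ModernBERT on a synthetic text-classification
# dataset; names marked as placeholders are assumptions, not from the notebook.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_ID = "answerdotai/ModernBERT-base"  # assumed base checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID, num_labels=2  # num_labels is a placeholder for the real label set
)

# Hypothetical synthetic dataset with "text" and "label" columns, e.g. one
# produced by the synthetic-data-generator and pushed to the Hub.
dataset = load_dataset("username/my-synthetic-dataset")

def tokenize(batch):
    # ModernBERT supports up to 8192 tokens; truncate shorter to keep memory modest.
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="modernbert-classifier", num_train_epochs=1),
    train_dataset=dataset["train"],
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()
```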