SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-MiniLM-L6-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
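
Step 1 trains on sentence pairs built from the labelled examples. The sketch below shows one way such pairs could be formed, assuming same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0, which is the usual target for a cosine-similarity loss. The helper `make_contrastive_pairs` and the toy sentences are hypothetical, and SetFit's own sampler may differ in detail.

```python
from itertools import combinations


def make_contrastive_pairs(samples):
    """Build (text_a, text_b, target) triples from (text, label) samples.

    The target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(samples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data for illustration only, not taken from the training set.
samples = [
    ("sentence one", "Synonyms"),
    ("sentence two", "Synonyms"),
    ("sentence three", "Transliteration"),
]
pairs = make_contrastive_pairs(samples)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} / {text_b!r}")
```

The embedding model is then tuned so that the cosine similarity of each pair moves toward its target; the classification head in step 2 is fitted on the resulting embeddings.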

Model Details

Model Description

Model Sources

Model Labels

Label Examples
Word form transmission
  • "Mother should take care of her own child at first, by this quote we simply can see that problems of government's own country should be placed on the first position."
  • "A building's style may say a lot about its history."
  • 'A lot of artists and entertainment organisations have financional costs because of free using of their contents in the Internet.'
Tense semantics
  • 'Samsung, "Blackberry" and "HTC" in 2015 have almost the same percentage share.'
  • '(5,9%) Overall, almost all unemployment rates have remained on the same level between 2014 and 2015, except EU, Latin America and Middle East.'
  • '15% consist of things which are transported by rail in Eastern Europe in 2008.'
Synonyms
  • '(the destination between Moscow and Saint Petersburg, for instance, can be easily overcame by "Lastochka" train for 5 hours).'
  • 'There is an extremely clear difference: there are too many men on a tech subjects.'
Copying expression
  • '15-59 years people in Yemen are increasing, while in Italy this number decreases.'
  • '2013 year is a key one.'
  • '3,6% are people have age 60+ years.'
Transliteration
  • 'A closer look at graphic revails that goods transported by rail had good products, which massive 11%.'
  • "According to first diagramm, half of Yemen's population in 2000 was children 0-14 years old."
  • 'According to my opinion different fabrics make much more harm for our nature.'

Evaluation

Metrics

Label   Accuracy
all     0.6197

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Zlovoblachko/L1-classifier")
# Run inference
preds = model("After 1980 part old people in USA rose slight and in Sweden this point stay unchanged.")
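
To label several sentences in one call, a list can be passed instead of a single string. A minimal sketch, assuming `model` is the `SetFitModel` loaded above and that its `predict` method returns one label per input; `classify` is a hypothetical helper name, not part of SetFit.

```python
def classify(model, texts):
    """Return (text, predicted_label) pairs for a list of sentences."""
    predictions = model.predict(texts)
    return list(zip(texts, predictions))


# Usage, assuming `model` was loaded with SetFitModel.from_pretrained(...):
# for text, label in classify(model, ["first sentence", "second sentence"]):
#     print(label, "|", text)
```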

Training Details

Training Set Metrics

Training set   Min   Median   Max
Word count     2     21.005   47

Label                    Training Sample Count
Synonyms                 99
Copying expression       26
Tense semantics          27
Word form transmission   40
Transliteration          8
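
The label counts are imbalanced, which matters when reading the accuracy figure. The short calculation below derives each label's share of the training set from the counts above. The share of the largest class is what a classifier that always predicts that class would score on data with the same label distribution; whether the evaluation set follows that distribution is an assumption, since it is not stated here.

```python
label_counts = {
    "Synonyms": 99,
    "Copying expression": 26,
    "Tense semantics": 27,
    "Word form transmission": 40,
    "Transliteration": 8,
}

total = sum(label_counts.values())
shares = {label: count / total for label, count in label_counts.items()}

# Label with the most training samples and its share of the set.
majority_label = max(label_counts, key=label_counts.get)
majority_share = shares[majority_label]

for label, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{label:<24}{share:.1%}")
print(f"Majority-class share ({majority_label}): {majority_share:.3f}")
```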

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.0012 1 0.3375 -
0.0590 50 0.3628 -
0.1179 100 0.3312 -
0.1769 150 0.2342 -
0.2358 200 0.2665 -
0.2948 250 0.1857 -
0.3538 300 0.2134 -
0.4127 350 0.1786 -
0.4717 400 0.092 -
0.5307 450 0.2031 -
0.5896 500 0.1449 -
0.6486 550 0.1234 -
0.7075 600 0.0552 -
0.7665 650 0.0693 -
0.8255 700 0.097 -
0.8844 750 0.0448 -
0.9434 800 0.041 -
1.0024 850 0.0431 -
1.0613 900 0.0227 -
1.1203 950 0.061 -
1.1792 1000 0.0209 -
1.2382 1050 0.0071 -
1.2972 1100 0.0285 -
1.3561 1150 0.0039 -
1.4151 1200 0.0029 -
1.4741 1250 0.0097 -
1.5330 1300 0.0076 -
1.5920 1350 0.0021 -
1.6509 1400 0.015 -
1.7099 1450 0.0027 -
1.7689 1500 0.0204 -
1.8278 1550 0.013 -
1.8868 1600 0.0222 -
1.9458 1650 0.0427 -
2.0047 1700 0.0181 -
2.0637 1750 0.0232 -
2.1226 1800 0.0053 -
2.1816 1850 0.0169 -
2.2406 1900 0.006 -
2.2995 1950 0.0108 -
2.3585 2000 0.0034 -
2.4175 2050 0.0198 -
2.4764 2100 0.0006 -
2.5354 2150 0.0142 -
2.5943 2200 0.0038 -
2.6533 2250 0.0006 -
2.7123 2300 0.0007 -
2.7712 2350 0.0012 -
2.8302 2400 0.0003 -
2.8892 2450 0.0127 -
2.9481 2500 0.0181 -
3.0071 2550 0.006 -
3.0660 2600 0.0006 -
3.125 2650 0.0156 -
3.1840 2700 0.0427 -
3.2429 2750 0.0004 -
3.3019 2800 0.0013 -
3.3608 2850 0.0241 -
3.4198 2900 0.0004 -
3.4788 2950 0.0048 -
3.5377 3000 0.0004 -
3.5967 3050 0.0006 -
3.6557 3100 0.0044 -
3.7146 3150 0.0142 -
3.7736 3200 0.005 -
3.8325 3250 0.0022 -
3.8915 3300 0.0033 -
3.9505 3350 0.0033 -
4.0094 3400 0.0005 -
4.0684 3450 0.0299 -
4.1274 3500 0.0172 -
4.1863 3550 0.0079 -
4.2453 3600 0.0012 -
4.3042 3650 0.0093 -
4.3632 3700 0.0175 -
4.4222 3750 0.0278 -
4.4811 3800 0.0004 -
4.5401 3850 0.0054 -
4.5991 3900 0.002 -
4.6580 3950 0.0248 -
4.7170 4000 0.0173 -
4.7759 4050 0.0004 -
4.8349 4100 0.0154 -
4.8939 4150 0.0162 -
4.9528 4200 0.0052 -
5.0118 4250 0.0142 -
5.0708 4300 0.0109 -
5.1297 4350 0.0003 -
5.1887 4400 0.0002 -
5.2476 4450 0.0003 -
5.3066 4500 0.0081 -
5.3656 4550 0.0005 -
5.4245 4600 0.0229 -
5.4835 4650 0.0002 -
5.5425 4700 0.0004 -
5.6014 4750 0.0233 -
5.6604 4800 0.0086 -
5.7193 4850 0.0084 -
5.7783 4900 0.0177 -
5.8373 4950 0.0102 -
5.8962 5000 0.017 -
5.9552 5050 0.0037 -
6.0142 5100 0.005 -
6.0731 5150 0.0002 -
6.1321 5200 0.0188 -
6.1910 5250 0.0037 -
6.25 5300 0.0003 -
6.3090 5350 0.0137 -
6.3679 5400 0.0107 -
6.4269 5450 0.0045 -
6.4858 5500 0.0002 -
6.5448 5550 0.0238 -
6.6038 5600 0.0209 -
6.6627 5650 0.0003 -
6.7217 5700 0.0002 -
6.7807 5750 0.0029 -
6.8396 5800 0.0177 -
6.8986 5850 0.0165 -
6.9575 5900 0.0045 -
7.0165 5950 0.0203 -
7.0755 6000 0.0048 -
7.1344 6050 0.0251 -
7.1934 6100 0.0147 -
7.2524 6150 0.0033 -
7.3113 6200 0.0166 -
7.3703 6250 0.0129 -
7.4292 6300 0.0169 -
7.4882 6350 0.0001 -
7.5472 6400 0.0002 -
7.6061 6450 0.0029 -
7.6651 6500 0.0264 -
7.7241 6550 0.0079 -
7.7830 6600 0.0002 -
7.8420 6650 0.0157 -
7.9009 6700 0.0116 -
7.9599 6750 0.0031 -
8.0189 6800 0.0055 -
8.0778 6850 0.0113 -
8.1368 6900 0.0004 -
8.1958 6950 0.0301 -
8.2547 7000 0.0002 -
8.3137 7050 0.0169 -
8.3726 7100 0.0001 -
8.4316 7150 0.0165 -
8.4906 7200 0.0201 -
8.5495 7250 0.0168 -
8.6085 7300 0.0197 -
8.6675 7350 0.0165 -
8.7264 7400 0.0165 -
8.7854 7450 0.0002 -
8.8443 7500 0.0134 -
8.9033 7550 0.0037 -
8.9623 7600 0.0043 -
9.0212 7650 0.0001 -
9.0802 7700 0.0034 -
9.1392 7750 0.0036 -
9.1981 7800 0.0001 -
9.2571 7850 0.0069 -
9.3160 7900 0.0304 -
9.375 7950 0.0203 -
9.4340 8000 0.0002 -
9.4929 8050 0.0002 -
9.5519 8100 0.0058 -
9.6108 8150 0.0141 -
9.6698 8200 0.0031 -
9.7288 8250 0.0169 -
9.7877 8300 0.0002 -
9.8467 8350 0.0075 -
9.9057 8400 0.0192 -
9.9646 8450 0.0588 -
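
The Epoch and Step columns imply roughly 848 optimisation steps per epoch (step 850 falls at epoch 1.0024). The sketch below checks whether that figure is consistent with the label counts, the embedding batch size of 32, and the oversampling strategy. It assumes that pairs are unordered pairs of distinct training samples and that oversampling repeats the rarer pair type until both types are equally frequent; this is an interpretation of the sampler, not something the card states.

```python
from math import ceil, comb

label_counts = [99, 26, 27, 40, 8]  # training samples per label
batch_size = 32                     # embedding-phase batch size
num_epochs = 10

n_samples = sum(label_counts)
# Same-label pairs: choose 2 samples within each label.
positive_pairs = sum(comb(n, 2) for n in label_counts)
# Different-label pairs: all pairs minus the same-label ones.
negative_pairs = comb(n_samples, 2) - positive_pairs

# Assumed oversampling rule: the rarer pair type is repeated to match the other.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs

print(positive_pairs, negative_pairs, pairs_per_epoch)
print(steps_per_epoch, total_steps)
```

Under these assumptions the calculation gives 848 steps per epoch, which agrees with the table.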

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0.dev0
  • Sentence Transformers: 2.6.1
  • Transformers: 4.38.2
  • PyTorch: 2.2.1+cu121
  • Datasets: 2.18.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 22.7M params (Safetensors, tensor type F32)