smol-135-tq-closure-synthetic

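This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M on an unspecified dataset (the card records the dataset as None). It achieves the following results on the evaluation set, which match the epoch 11 row of the training results table: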

  • Loss: 0.2292
  • < Precision: 0.8939
  • < Recall: 0.8974
  • < F1-score: 0.8956
  • < Support: 4036.0
  • > Precision: 0.9419
  • > Recall: 0.9383
  • > F1-score: 0.9401
  • > Support: 3681.0
  • = Precision: 0.7952
  • = Recall: 0.8187
  • = F1-score: 0.8068
  • = Support: 1622.0
  • - Precision: 0.7872
  • - Recall: 0.7277
  • - F1-score: 0.7563
  • - Support: 661.0
  • Accuracy: 0.8885
  • Macro Avg Precision: 0.8546
  • Macro Avg Recall: 0.8455
  • Macro Avg F1-score: 0.8497
  • Macro Avg Support: 10000.0
  • Weighted Avg Precision: 0.8885
  • Weighted Avg Recall: 0.8885
  • Weighted Avg F1-score: 0.8884
  • Weighted Avg Support: 10000.0
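
The aggregate rows above can be cross-checked against the per-class rows. The following is a minimal sketch, assuming the usual definitions: F1 as the harmonic mean of precision and recall, the macro average as an unweighted mean over the four classes, and the weighted average as a support-weighted mean. The dictionary and helper names are illustrative and not part of the original card. Because the inputs are already rounded to four decimals, the recomputed values may differ from the reported ones in the last digit.

```python
# Per-class metrics copied from the evaluation results above.
# Each entry: label -> (precision, recall, f1, support)
per_class = {
    "<": (0.8939, 0.8974, 0.8956, 4036),
    ">": (0.9419, 0.9383, 0.9401, 3681),
    "=": (0.7952, 0.8187, 0.8068, 1622),
    "-": (0.7872, 0.7277, 0.7563, 661),
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_avg(index):
    """Unweighted mean of one metric column across classes."""
    return sum(m[index] for m in per_class.values()) / len(per_class)


def weighted_avg(index):
    """Support-weighted mean of one metric column across classes."""
    total = sum(m[3] for m in per_class.values())
    return sum(m[index] * m[3] for m in per_class.values()) / total


total_support = sum(m[3] for m in per_class.values())
macro = [macro_avg(i) for i in range(3)]        # precision, recall, F1
weighted = [weighted_avg(i) for i in range(3)]  # precision, recall, F1

print("total support:", total_support)
print("macro avg (P, R, F1):", [round(v, 4) for v in macro])
print("weighted avg (P, R, F1):", [round(v, 4) for v in weighted])
for label, (p, r, f1, _) in per_class.items():
    print(label, "F1 recomputed:", round(f1_from(p, r), 4), "reported:", f1)
```

Under these definitions the support-weighted recall is the same quantity as overall accuracy, which is consistent with both being reported as 0.8885.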

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 512
  • total_eval_batch_size: 256
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: reduce_lr_on_plateau
  • num_epochs: 30
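
The two total batch sizes follow from the per-device values, the device count, and the gradient accumulation setting. The sketch below reproduces that arithmetic and derives a rough upper bound on the number of training examples seen per epoch. It assumes that the Step column of the training results counts optimizer updates (as the Hugging Face Trainer's global step does) and that gradient accumulation applies to training only; the variable names are illustrative.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 64            # per device
eval_batch_size = 64             # per device
num_devices = 4
gradient_accumulation_steps = 2

# Examples consumed per optimizer update across all devices.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)
# Evaluation does not accumulate gradients, so only the device count matters.
total_eval_batch_size = eval_batch_size * num_devices

# From the training results: 1354 steps are logged at the end of epoch 1.
steps_per_epoch = 1354
# Upper bound only: the final batch of an epoch may be partial.
max_train_examples = steps_per_epoch * total_train_batch_size

print("total_train_batch_size:", total_train_batch_size)
print("total_eval_batch_size:", total_eval_batch_size)
print("training examples per epoch (at most):", max_train_examples)
```

The first two values agree with the total_train_batch_size and total_eval_batch_size entries in the list. The per-epoch example count is an estimate, since the card does not state the training set size.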

Training results

| Training Loss | Epoch | Step | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.4785 | 1.0 | 1354 | 0.2004 | 0.8789 | 0.8665 | 0.8726 | 4036.0 | 0.8989 | 0.9226 | 0.9106 | 3681.0 | 0.7986 | 0.7700 | 0.7841 | 1622.0 | 0.7467 | 0.7670 | 0.7567 | 661.0 | 0.8649 | 0.8308 | 0.8315 | 0.8310 | 10000.0 | 0.8645 | 0.8649 | 0.8646 | 10000.0 |
| 0.3428 | 2.0 | 2708 | 0.1891 | 0.8766 | 0.8959 | 0.8862 | 4036.0 | 0.9215 | 0.9313 | 0.9264 | 3681.0 | 0.8175 | 0.7762 | 0.7963 | 1622.0 | 0.7967 | 0.7413 | 0.7680 | 661.0 | 0.8793 | 0.8531 | 0.8362 | 0.8442 | 10000.0 | 0.8783 | 0.8793 | 0.8786 | 10000.0 |
| 0.3423 | 3.0 | 4062 | 0.1858 | 0.9031 | 0.8749 | 0.8887 | 4036.0 | 0.9234 | 0.9337 | 0.9285 | 3681.0 | 0.7783 | 0.8224 | 0.7998 | 1622.0 | 0.7645 | 0.7564 | 0.7605 | 661.0 | 0.8802 | 0.8423 | 0.8469 | 0.8444 | 10000.0 | 0.8812 | 0.8802 | 0.8805 | 10000.0 |
| 0.3044 | 4.0 | 5416 | 0.1870 | 0.8875 | 0.8853 | 0.8864 | 4036.0 | 0.9168 | 0.9397 | 0.9281 | 3681.0 | 0.8172 | 0.7965 | 0.8067 | 1622.0 | 0.7710 | 0.7231 | 0.7463 | 661.0 | 0.8802 | 0.8481 | 0.8362 | 0.8419 | 10000.0 | 0.8792 | 0.8802 | 0.8796 | 10000.0 |
| 0.2814 | 5.0 | 6770 | 0.1857 | 0.8880 | 0.8902 | 0.8891 | 4036.0 | 0.9335 | 0.9348 | 0.9342 | 3681.0 | 0.8006 | 0.8095 | 0.8050 | 1622.0 | 0.7818 | 0.7428 | 0.7618 | 661.0 | 0.8838 | 0.8510 | 0.8443 | 0.8475 | 10000.0 | 0.8836 | 0.8838 | 0.8837 | 10000.0 |
| 0.2861 | 6.0 | 8124 | 0.1893 | 0.8928 | 0.8977 | 0.8952 | 4036.0 | 0.9361 | 0.9315 | 0.9338 | 3681.0 | 0.7930 | 0.8150 | 0.8039 | 1622.0 | 0.7810 | 0.7231 | 0.7510 | 661.0 | 0.8852 | 0.8508 | 0.8419 | 0.8460 | 10000.0 | 0.8852 | 0.8852 | 0.8851 | 10000.0 |
| 0.3019 | 7.0 | 9478 | 0.1982 | 0.8693 | 0.9093 | 0.8888 | 4036.0 | 0.9314 | 0.9326 | 0.9320 | 3681.0 | 0.8340 | 0.7620 | 0.7964 | 1622.0 | 0.7787 | 0.7186 | 0.7474 | 661.0 | 0.8814 | 0.8533 | 0.8306 | 0.8412 | 10000.0 | 0.8804 | 0.8814 | 0.8804 | 10000.0 |
| 0.2531 | 8.0 | 10832 | 0.2028 | 0.8955 | 0.8858 | 0.8906 | 4036.0 | 0.9245 | 0.9416 | 0.9330 | 3681.0 | 0.8103 | 0.7873 | 0.7986 | 1622.0 | 0.7438 | 0.7685 | 0.7560 | 661.0 | 0.8826 | 0.8435 | 0.8458 | 0.8445 | 10000.0 | 0.8823 | 0.8826 | 0.8824 | 10000.0 |
| 0.1957 | 9.0 | 12186 | 0.2206 | 0.8882 | 0.9036 | 0.8958 | 4036.0 | 0.9264 | 0.9478 | 0.9370 | 3681.0 | 0.8272 | 0.7824 | 0.8042 | 1622.0 | 0.7896 | 0.7095 | 0.7474 | 661.0 | 0.8874 | 0.8579 | 0.8358 | 0.8461 | 10000.0 | 0.8859 | 0.8874 | 0.8863 | 10000.0 |
| 0.1648 | 10.0 | 13540 | 0.2256 | 0.8905 | 0.8902 | 0.8903 | 4036.0 | 0.9290 | 0.9421 | 0.9355 | 3681.0 | 0.8080 | 0.7885 | 0.7981 | 1622.0 | 0.7596 | 0.7458 | 0.7527 | 661.0 | 0.8833 | 0.8468 | 0.8417 | 0.8442 | 10000.0 | 0.8826 | 0.8833 | 0.8829 | 10000.0 |
| 0.1691 | 11.0 | 14894 | 0.2292 | 0.8939 | 0.8974 | 0.8956 | 4036.0 | 0.9419 | 0.9383 | 0.9401 | 3681.0 | 0.7952 | 0.8187 | 0.8068 | 1622.0 | 0.7872 | 0.7277 | 0.7563 | 661.0 | 0.8885 | 0.8546 | 0.8455 | 0.8497 | 10000.0 | 0.8885 | 0.8885 | 0.8884 | 10000.0 |
| 0.1592 | 12.0 | 16248 | 0.2357 | 0.8904 | 0.8895 | 0.8899 | 4036.0 | 0.9351 | 0.9348 | 0.9349 | 3681.0 | 0.7989 | 0.8033 | 0.8011 | 1622.0 | 0.7382 | 0.7337 | 0.7360 | 661.0 | 0.8819 | 0.8406 | 0.8403 | 0.8405 | 10000.0 | 0.8819 | 0.8819 | 0.8819 | 10000.0 |
| 0.1783 | 13.0 | 17602 | 0.2415 | 0.8879 | 0.8972 | 0.8925 | 4036.0 | 0.9355 | 0.9378 | 0.9366 | 3681.0 | 0.8083 | 0.7928 | 0.8005 | 1622.0 | 0.7598 | 0.7368 | 0.7481 | 661.0 | 0.8846 | 0.8479 | 0.8411 | 0.8444 | 10000.0 | 0.8841 | 0.8846 | 0.8843 | 10000.0 |
| 0.1395 | 14.0 | 18956 | 0.2466 | 0.8951 | 0.8905 | 0.8928 | 4036.0 | 0.9340 | 0.9343 | 0.9341 | 3681.0 | 0.8029 | 0.8064 | 0.8047 | 1622.0 | 0.7374 | 0.7519 | 0.7446 | 661.0 | 0.8838 | 0.8424 | 0.8458 | 0.8440 | 10000.0 | 0.8841 | 0.8838 | 0.8839 | 10000.0 |
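
The logged epochs can be ranked by different criteria, and the criteria do not agree. The sketch below copies four columns from the training results (epoch, validation loss, accuracy, macro-average F1) and picks the best epoch under each. The tuple layout and variable names are illustrative. The card does not state how the reported checkpoint was selected, so the comparison is descriptive only.

```python
# (epoch, validation_loss, accuracy, macro_avg_f1) copied from the results above.
rows = [
    (1, 0.2004, 0.8649, 0.8310),
    (2, 0.1891, 0.8793, 0.8442),
    (3, 0.1858, 0.8802, 0.8444),
    (4, 0.1870, 0.8802, 0.8419),
    (5, 0.1857, 0.8838, 0.8475),
    (6, 0.1893, 0.8852, 0.8460),
    (7, 0.1982, 0.8814, 0.8412),
    (8, 0.2028, 0.8826, 0.8445),
    (9, 0.2206, 0.8874, 0.8461),
    (10, 0.2256, 0.8833, 0.8442),
    (11, 0.2292, 0.8885, 0.8497),
    (12, 0.2357, 0.8819, 0.8405),
    (13, 0.2415, 0.8846, 0.8444),
    (14, 0.2466, 0.8838, 0.8440),
]

# Lowest validation loss versus highest classification metrics.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_accuracy = max(rows, key=lambda r: r[2])
best_by_macro_f1 = max(rows, key=lambda r: r[3])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest accuracy: epoch", best_by_accuracy[0])
print("highest macro F1: epoch", best_by_macro_f1[0])
```

Validation loss is lowest at epoch 5 and rises afterwards while training loss keeps falling, whereas accuracy and macro F1 peak at epoch 11, the row whose values match the evaluation results reported at the top of the card.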

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.21.0

Model size: 135M params (Safetensors, tensor type BF16)