smol-135-tq-augment

This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M on an unknown dataset. It achieves the following results on the evaluation set (see the note and sketch after the list):

  • Loss: 0.1821
  • < Precision: 0.9430
  • < Recall: 0.9490
  • < F1-score: 0.9460
  • < Support: 4551.0
  • > Precision: 0.9461
  • > Recall: 0.9455
  • > F1-score: 0.9458
  • > Support: 4551.0
  • = Precision: 0.8177
  • = Recall: 0.7940
  • = F1-score: 0.8056
  • = Support: 898.0
  • - Precision: 0.0
  • - Recall: 0.0
  • - F1-score: 0.0
  • - Support: 0.0
  • Accuracy: 0.9335
  • Macro Avg Precision: 0.6767
  • Macro Avg Recall: 0.6721
  • Macro Avg F1-score: 0.6744
  • Macro Avg Support: 10000.0
  • Weighted Avg Precision: 0.9332
  • Weighted Avg Recall: 0.9335
  • Weighted Avg F1-score: 0.9333
  • Weighted Avg Support: 10000.0
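
The "-" class has zero support in this evaluation set, which is why its scores are 0.0 and why the macro averages sit well below the weighted ones ((0.9430 + 0.9461 + 0.8177 + 0.0) / 4 ≈ 0.6767). Below is a minimal sketch of how a report with this shape can be produced with scikit-learn's classification_report; the labels and predictions are placeholders, not the author's evaluation code.

```python
# Minimal sketch (not the author's evaluation code): a report with the same
# shape as the metrics above, via scikit-learn. y_true/y_pred are placeholders.
from sklearn.metrics import classification_report

labels = ["<", ">", "=", "-"]        # the four class labels reported above
y_true = ["<", "<", ">", "=", ">"]   # placeholder gold labels
y_pred = ["<", ">", ">", "=", ">"]   # placeholder model predictions

# zero_division=0 reports 0.0 precision/recall/F1 for the zero-support "-"
# class instead of emitting warnings, matching the numbers above.
print(classification_report(y_true, y_pred, labels=labels, digits=4, zero_division=0))
```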

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 512
  • total_eval_batch_size: 256
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: reduce_lr_on_plateau
  • num_epochs: 30
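
The listed values imply an effective train batch size of 64 per device × 4 GPUs × 2 accumulation steps = 512, and an effective eval batch size of 64 × 4 = 256. As a rough illustration, the configuration below sketches equivalent transformers TrainingArguments; the output_dir is an assumption and this is not the author's training script.

```python
# Hypothetical sketch of TrainingArguments mirroring the hyperparameters above.
# output_dir is an assumption; the betas/epsilon shown are the AdamW defaults.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="smol-135-tq-augment",   # assumed checkpoint directory
    learning_rate=1e-3,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=2,      # 64 x 4 GPUs x 2 steps = 512 effective
    num_train_epochs=30,
    seed=42,
    optim="adamw_torch",                # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="reduce_lr_on_plateau",
)
# The 4-GPU distributed setup comes from the launcher (e.g. torchrun or
# accelerate), not from these arguments.
```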

Training results

| Training Loss | Epoch | Step | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.6963 | 1.0 | 150 | 0.3731 | 0.6497 | 0.6799 | 0.6644 | 4551.0 | 0.6723 | 0.6504 | 0.6612 | 4551.0 | 0.5126 | 0.4766 | 0.4939 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.6482 | 0.4586 | 0.4517 | 0.4549 | 10000.0 | 0.6477 | 0.6482 | 0.6476 | 10000.0 |
| 0.5542 | 2.0 | 300 | 0.3164 | 0.7453 | 0.7020 | 0.7230 | 4551.0 | 0.7084 | 0.7739 | 0.7397 | 4551.0 | 0.7314 | 0.6036 | 0.6614 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.7259 | 0.5463 | 0.5199 | 0.5310 | 10000.0 | 0.7272 | 0.7259 | 0.7251 | 10000.0 |
| 0.4143 | 3.0 | 450 | 0.2629 | 0.8327 | 0.8062 | 0.8192 | 4551.0 | 0.8062 | 0.8446 | 0.8250 | 4551.0 | 0.7409 | 0.6815 | 0.7100 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8125 | 0.5950 | 0.5831 | 0.5886 | 10000.0 | 0.8124 | 0.8125 | 0.8120 | 10000.0 |
| 0.2789 | 4.0 | 600 | 0.2197 | 0.8577 | 0.8943 | 0.8756 | 4551.0 | 0.8718 | 0.8772 | 0.8745 | 4551.0 | 0.8609 | 0.6481 | 0.7395 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8644 | 0.6476 | 0.6049 | 0.6224 | 10000.0 | 0.8644 | 0.8644 | 0.8629 | 10000.0 |
| 0.2502 | 5.0 | 750 | 0.2087 | 0.9133 | 0.8890 | 0.9010 | 4551.0 | 0.8798 | 0.9229 | 0.9008 | 4551.0 | 0.8279 | 0.7339 | 0.7780 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8905 | 0.6552 | 0.6364 | 0.6450 | 10000.0 | 0.8904 | 0.8905 | 0.8899 | 10000.0 |
| 0.2069 | 6.0 | 900 | 0.1898 | 0.9226 | 0.9011 | 0.9117 | 4551.0 | 0.8972 | 0.9303 | 0.9135 | 4551.0 | 0.8266 | 0.7695 | 0.7970 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9026 | 0.6616 | 0.6502 | 0.6556 | 10000.0 | 0.9024 | 0.9026 | 0.9022 | 10000.0 |
| 0.2056 | 7.0 | 1050 | 0.1876 | 0.9204 | 0.9174 | 0.9189 | 4551.0 | 0.9118 | 0.9308 | 0.9212 | 4551.0 | 0.8301 | 0.7561 | 0.7914 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9090 | 0.6656 | 0.6511 | 0.6579 | 10000.0 | 0.9084 | 0.9090 | 0.9085 | 10000.0 |
| 0.1686 | 8.0 | 1200 | 0.1837 | 0.9239 | 0.9336 | 0.9287 | 4551.0 | 0.9298 | 0.9286 | 0.9292 | 4551.0 | 0.8178 | 0.7795 | 0.7982 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9175 | 0.6679 | 0.6604 | 0.6640 | 10000.0 | 0.9171 | 0.9175 | 0.9172 | 10000.0 |
| 0.1580 | 9.0 | 1350 | 0.1822 | 0.9178 | 0.9402 | 0.9289 | 4551.0 | 0.9448 | 0.9178 | 0.9311 | 4551.0 | 0.7797 | 0.7962 | 0.7879 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9171 | 0.6606 | 0.6636 | 0.6620 | 10000.0 | 0.9177 | 0.9171 | 0.9172 | 10000.0 |
| 0.1849 | 10.0 | 1500 | 0.1930 | 0.9227 | 0.9308 | 0.9267 | 4551.0 | 0.9255 | 0.9260 | 0.9257 | 4551.0 | 0.8026 | 0.7650 | 0.7834 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9137 | 0.6627 | 0.6554 | 0.6590 | 10000.0 | 0.9132 | 0.9137 | 0.9134 | 10000.0 |
| 0.1407 | 11.0 | 1650 | 0.1726 | 0.9408 | 0.9459 | 0.9434 | 4551.0 | 0.9459 | 0.9383 | 0.9421 | 4551.0 | 0.8022 | 0.8129 | 0.8075 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9305 | 0.6722 | 0.6743 | 0.6732 | 10000.0 | 0.9307 | 0.9305 | 0.9306 | 10000.0 |
| 0.1387 | 12.0 | 1800 | 0.1801 | 0.9404 | 0.9426 | 0.9415 | 4551.0 | 0.9414 | 0.9422 | 0.9418 | 4551.0 | 0.8075 | 0.7940 | 0.8007 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9291 | 0.6723 | 0.6697 | 0.6710 | 10000.0 | 0.9289 | 0.9291 | 0.9290 | 10000.0 |
| 0.1359 | 13.0 | 1950 | 0.1780 | 0.9428 | 0.9411 | 0.9419 | 4551.0 | 0.9385 | 0.9455 | 0.9420 | 4551.0 | 0.8268 | 0.8029 | 0.8147 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9307 | 0.6770 | 0.6724 | 0.6747 | 10000.0 | 0.9304 | 0.9307 | 0.9305 | 10000.0 |
| 0.1284 | 14.0 | 2100 | 0.1785 | 0.9445 | 0.9466 | 0.9456 | 4551.0 | 0.9452 | 0.9433 | 0.9442 | 4551.0 | 0.8004 | 0.7996 | 0.8000 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9319 | 0.6725 | 0.6724 | 0.6725 | 10000.0 | 0.9319 | 0.9319 | 0.9319 | 10000.0 |
| 0.1339 | 15.0 | 2250 | 0.1810 | 0.9474 | 0.9413 | 0.9443 | 4551.0 | 0.9406 | 0.9492 | 0.9449 | 4551.0 | 0.8124 | 0.8007 | 0.8065 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9323 | 0.6751 | 0.6728 | 0.6739 | 10000.0 | 0.9322 | 0.9323 | 0.9322 | 10000.0 |
| 0.1294 | 16.0 | 2400 | 0.1821 | 0.9430 | 0.9490 | 0.9460 | 4551.0 | 0.9461 | 0.9455 | 0.9458 | 4551.0 | 0.8177 | 0.7940 | 0.8056 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9335 | 0.6767 | 0.6721 | 0.6744 | 10000.0 | 0.9332 | 0.9335 | 0.9333 | 10000.0 |
| 0.1383 | 17.0 | 2550 | 0.1828 | 0.9453 | 0.9464 | 0.9459 | 4551.0 | 0.9443 | 0.9470 | 0.9457 | 4551.0 | 0.8125 | 0.7962 | 0.8043 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9332 | 0.6755 | 0.6724 | 0.6740 | 10000.0 | 0.9330 | 0.9332 | 0.9331 | 10000.0 |
| 0.1260 | 18.0 | 2700 | 0.1856 | 0.9418 | 0.9466 | 0.9442 | 4551.0 | 0.9426 | 0.9426 | 0.9426 | 4551.0 | 0.8149 | 0.7940 | 0.8043 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9311 | 0.6748 | 0.6708 | 0.6728 | 10000.0 | 0.9308 | 0.9311 | 0.9309 | 10000.0 |
| 0.1360 | 19.0 | 2850 | 0.1851 | 0.9459 | 0.9442 | 0.9450 | 4551.0 | 0.9415 | 0.9486 | 0.9451 | 4551.0 | 0.8200 | 0.7962 | 0.8079 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9329 | 0.6768 | 0.6722 | 0.6745 | 10000.0 | 0.9326 | 0.9329 | 0.9327 | 10000.0 |

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.21.0