smol-135-tq-false

This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0587
  • < Precision: 0.9835
  • < Recall: 0.9892
  • < F1-score: 0.9863
  • < Support: 2588.0
  • > Precision: 0.9876
  • > Recall: 0.9846
  • > F1-score: 0.9861
  • > Support: 2268.0
  • = Precision: 0.8779
  • = Recall: 0.8647
  • = F1-score: 0.8712
  • = Support: 133.0
  • - Precision: 0.2
  • - Recall: 0.0909
  • - F1-score: 0.125
  • - Support: 11.0
  • Accuracy: 0.9818
  • Macro Avg Precision: 0.7622
  • Macro Avg Recall: 0.7323
  • Macro Avg F1-score: 0.7422
  • Macro Avg Support: 5000.0
  • Weighted Avg Precision: 0.9808
  • Weighted Avg Recall: 0.9818
  • Weighted Avg F1-score: 0.9813
  • Weighted Avg Support: 5000.0
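
The per-class entries above correspond to the four relation labels used in the results table (<, >, =, -). The macro averages are the unweighted means of the per-class scores, while the weighted averages weight each class by its support, which is why the rare "-" class (support 11) pulls the macro figures well below the weighted ones. As a minimal sketch (the actual evaluation script is not part of this card, and the predictions below are placeholders), metrics in this format are conventionally produced with scikit-learn's classification_report:

```python
# Minimal sketch of how per-class / macro / weighted metrics in this format are
# conventionally computed. The label set (<, >, =, -) and the per-class F1 values
# are copied from this card; everything else is illustrative.
from sklearn.metrics import classification_report

labels = ["<", ">", "=", "-"]

# Hypothetical placeholder predictions, only to make the snippet runnable.
y_true = ["<", ">", "=", "-", "<", ">"]
y_pred = ["<", ">", "=", "<", "<", ">"]

print(classification_report(y_true, y_pred, labels=labels, digits=4, zero_division=0))

# The macro average is the unweighted mean of the four per-class scores, e.g. for F1:
macro_f1 = (0.9863 + 0.9861 + 0.8712 + 0.1250) / 4
print(macro_f1)  # ~0.7422, matching the Macro Avg F1-score reported above
```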

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 512
  • total_eval_batch_size: 128
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: reduce_lr_on_plateau
  • num_epochs: 30
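
These settings imply an effective train batch size of 32 per device × 4 GPUs × 4 accumulation steps = 512, and an effective eval batch size of 32 × 4 = 128. For reference, a rough sketch of how they would map onto transformers.TrainingArguments (an assumed mapping; the actual training script is not part of this card):

```python
# Assumed mapping of the reported hyperparameters onto transformers.TrainingArguments;
# the output_dir and the epoch-level evaluation/logging strategy are guesses.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="smol-135-tq-false",            # hypothetical output directory
    learning_rate=1e-3,
    per_device_train_batch_size=32,            # 32 per device x 4 GPUs x 4 accumulation = 512 total
    per_device_eval_batch_size=32,             # 32 per device x 4 GPUs = 128 total
    gradient_accumulation_steps=4,
    num_train_epochs=30,
    seed=42,
    optim="adamw_torch",                       # AdamW, betas=(0.9, 0.999), epsilon=1e-08
    lr_scheduler_type="reduce_lr_on_plateau",
    eval_strategy="epoch",                     # the results table reports one evaluation per epoch
    logging_strategy="epoch",
)
```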

Training results

Training Loss | Epoch | Step | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support
0.4944 1.0 443 0.1333 0.8957 0.9359 0.9153 2588.0 0.9071 0.8955 0.9013 2268.0 0.5965 0.2556 0.3579 133.0 0.0 0.0 0.0 11.0 0.8974 0.5998 0.5217 0.5436 5000.0 0.8909 0.8974 0.8921 5000.0
0.1949 2.0 886 0.0753 0.9411 0.9687 0.9547 2588.0 0.9684 0.9321 0.9499 2268.0 0.5686 0.6541 0.6084 133.0 0.0 0.0 0.0 11.0 0.9416 0.6195 0.6387 0.6282 5000.0 0.9415 0.9416 0.9412 5000.0
0.138 3.0 1329 0.0536 0.9492 0.9826 0.9656 2588.0 0.972 0.9643 0.9681 2268.0 0.8732 0.4662 0.6078 133.0 0.0 0.0 0.0 11.0 0.9584 0.6986 0.6033 0.6354 5000.0 0.9555 0.9584 0.9551 5000.0
0.0787 4.0 1772 0.0360 0.9781 0.9845 0.9813 2588.0 0.9798 0.9824 0.9811 2268.0 0.8291 0.7293 0.776 133.0 0.0 0.0 0.0 11.0 0.9746 0.6967 0.6741 0.6846 5000.0 0.9728 0.9746 0.9736 5000.0
0.0449 5.0 2215 0.0399 0.9826 0.9791 0.9808 2588.0 0.9832 0.9780 0.9805 2268.0 0.7368 0.8421 0.7860 133.0 0.2308 0.2727 0.25 11.0 0.9734 0.7333 0.7680 0.7493 5000.0 0.9746 0.9734 0.9739 5000.0
0.0143 6.0 2658 0.0362 0.9753 0.9915 0.9833 2588.0 0.9888 0.9762 0.9825 2268.0 0.8699 0.8045 0.8359 133.0 0.1429 0.0909 0.1111 11.0 0.9776 0.7442 0.7158 0.7282 5000.0 0.9768 0.9776 0.9771 5000.0
0.008 7.0 3101 0.0400 0.9819 0.9849 0.9834 2588.0 0.9820 0.9850 0.9835 2268.0 0.8468 0.7895 0.8171 133.0 0.0 0.0 0.0 11.0 0.9776 0.7027 0.6899 0.6960 5000.0 0.9762 0.9776 0.9769 5000.0
0.0052 8.0 3544 0.0447 0.9816 0.9880 0.9848 2588.0 0.9846 0.9863 0.9855 2268.0 0.8917 0.8045 0.8458 133.0 0.0 0.0 0.0 11.0 0.9802 0.7145 0.6947 0.7040 5000.0 0.9784 0.9802 0.9792 5000.0
0.0003 9.0 3987 0.0488 0.9816 0.9884 0.9850 2588.0 0.9876 0.9850 0.9863 2268.0 0.8682 0.8421 0.8550 133.0 0.0 0.0 0.0 11.0 0.9808 0.7094 0.7039 0.7066 5000.0 0.9791 0.9808 0.9800 5000.0
0.0002 10.0 4430 0.0500 0.9846 0.9876 0.9861 2588.0 0.9863 0.9859 0.9861 2268.0 0.8769 0.8571 0.8669 133.0 0.1429 0.0909 0.1111 11.0 0.9814 0.7477 0.7304 0.7376 5000.0 0.9807 0.9814 0.9810 5000.0
0.0002 11.0 4873 0.0587 0.9835 0.9892 0.9863 2588.0 0.9876 0.9846 0.9861 2268.0 0.8779 0.8647 0.8712 133.0 0.2 0.0909 0.125 11.0 0.9818 0.7622 0.7323 0.7422 5000.0 0.9808 0.9818 0.9813 5000.0
0.0 12.0 5316 0.0613 0.9835 0.9888 0.9861 2588.0 0.9859 0.9859 0.9859 2268.0 0.896 0.8421 0.8682 133.0 0.2 0.0909 0.125 11.0 0.9816 0.7663 0.7269 0.7413 5000.0 0.9805 0.9816 0.9810 5000.0
0.0 13.0 5759 0.0634 0.9835 0.9884 0.9859 2588.0 0.9868 0.9863 0.9865 2268.0 0.8898 0.8496 0.8692 133.0 0.2 0.0909 0.125 11.0 0.9818 0.7650 0.7288 0.7417 5000.0 0.9807 0.9818 0.9812 5000.0
0.0 14.0 6202 0.0647 0.9835 0.9884 0.9859 2588.0 0.9868 0.9863 0.9865 2268.0 0.8898 0.8496 0.8692 133.0 0.2 0.0909 0.125 11.0 0.9818 0.7650 0.7288 0.7417 5000.0 0.9807 0.9818 0.9812 5000.0
0.0 15.0 6645 0.0663 0.9835 0.9884 0.9859 2588.0 0.9868 0.9863 0.9865 2268.0 0.8898 0.8496 0.8692 133.0 0.2 0.0909 0.125 11.0 0.9818 0.7650 0.7288 0.7417 5000.0 0.9807 0.9818 0.9812 5000.0
0.0 16.0 7088 0.0671 0.9838 0.9880 0.9859 2588.0 0.9868 0.9859 0.9863 2268.0 0.8769 0.8571 0.8669 133.0 0.2 0.0909 0.125 11.0 0.9816 0.7619 0.7305 0.7410 5000.0 0.9806 0.9816 0.9810 5000.0
0.0 17.0 7531 0.0680 0.9838 0.9880 0.9859 2588.0 0.9868 0.9863 0.9865 2268.0 0.8837 0.8571 0.8702 133.0 0.2 0.0909 0.125 11.0 0.9818 0.7636 0.7306 0.7419 5000.0 0.9808 0.9818 0.9812 5000.0
0.0 18.0 7974 0.0689 0.9838 0.9880 0.9859 2588.0 0.9868 0.9859 0.9863 2268.0 0.8769 0.8571 0.8669 133.0 0.2 0.0909 0.125 11.0 0.9816 0.7619 0.7305 0.7410 5000.0 0.9806 0.9816 0.9810 5000.0
0.0029 19.0 8417 0.0696 0.9838 0.9876 0.9857 2588.0 0.9868 0.9859 0.9863 2268.0 0.8702 0.8571 0.8636 133.0 0.2 0.0909 0.125 11.0 0.9814 0.7602 0.7304 0.7402 5000.0 0.9804 0.9814 0.9809 5000.0
0.0 20.0 8860 0.0699 0.9838 0.9876 0.9857 2588.0 0.9868 0.9863 0.9865 2268.0 0.8769 0.8571 0.8669 133.0 0.2 0.0909 0.125 11.0 0.9816 0.7619 0.7305 0.7411 5000.0 0.9806 0.9816 0.9810 5000.0
0.0 21.0 9303 0.0704 0.9835 0.9876 0.9855 2588.0 0.9868 0.9859 0.9863 2268.0 0.8692 0.8496 0.8593 133.0 0.2 0.0909 0.125 11.0 0.9812 0.7599 0.7285 0.7390 5000.0 0.9802 0.9812 0.9806 5000.0
0.0 22.0 9746 0.0707 0.9838 0.9880 0.9859 2588.0 0.9868 0.9859 0.9863 2268.0 0.8769 0.8571 0.8669 133.0 0.2 0.0909 0.125 11.0 0.9816 0.7619 0.7305 0.7410 5000.0 0.9806 0.9816 0.9810 5000.0
0.0 23.0 10189 0.0711 0.9835 0.9876 0.9855 2588.0 0.9868 0.9854 0.9861 2268.0 0.8702 0.8571 0.8636 133.0 0.2 0.0909 0.125 11.0 0.9812 0.7601 0.7303 0.7401 5000.0 0.9802 0.9812 0.9807 5000.0
0.0 24.0 10632 0.0709 0.9838 0.9876 0.9857 2588.0 0.9868 0.9859 0.9863 2268.0 0.8702 0.8571 0.8636 133.0 0.2 0.0909 0.125 11.0 0.9814 0.7602 0.7304 0.7402 5000.0 0.9804 0.9814 0.9809 5000.0
0.0 25.0 11075 0.0715 0.9838 0.9876 0.9857 2588.0 0.9868 0.9859 0.9863 2268.0 0.8702 0.8571 0.8636 133.0 0.2 0.0909 0.125 11.0 0.9814 0.7602 0.7304 0.7402 5000.0 0.9804 0.9814 0.9809 5000.0
0.0 26.0 11518 0.0714 0.9835 0.9876 0.9855 2588.0 0.9868 0.9854 0.9861 2268.0 0.8702 0.8571 0.8636 133.0 0.2 0.0909 0.125 11.0 0.9812 0.7601 0.7303 0.7401 5000.0 0.9802 0.9812 0.9807 5000.0
0.0 27.0 11961 0.0716 0.9838 0.9876 0.9857 2588.0 0.9868 0.9859 0.9863 2268.0 0.8702 0.8571 0.8636 133.0 0.2 0.0909 0.125 11.0 0.9814 0.7602 0.7304 0.7402 5000.0 0.9804 0.9814 0.9809 5000.0
0.0 28.0 12404 0.0715 0.9838 0.9876 0.9857 2588.0 0.9868 0.9859 0.9863 2268.0 0.8702 0.8571 0.8636 133.0 0.2 0.0909 0.125 11.0 0.9814 0.7602 0.7304 0.7402 5000.0 0.9804 0.9814 0.9809 5000.0
0.0 29.0 12847 0.0717 0.9838 0.9880 0.9859 2588.0 0.9868 0.9859 0.9863 2268.0 0.8769 0.8571 0.8669 133.0 0.2 0.0909 0.125 11.0 0.9816 0.7619 0.7305 0.7410 5000.0 0.9806 0.9816 0.9810 5000.0
0.0 29.9328 13260 0.0718 0.9838 0.9876 0.9857 2588.0 0.9868 0.9859 0.9863 2268.0 0.8702 0.8571 0.8636 133.0 0.2 0.0909 0.125 11.0 0.9814 0.7602 0.7304 0.7402 5000.0 0.9804 0.9814 0.9809 5000.0

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.21.0
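
A quick way to check that a local environment matches these versions (a small convenience snippet, not part of the original card):

```python
# Print installed versions to compare against the list above.
import transformers, torch, datasets, tokenizers

print("Transformers:", transformers.__version__)  # expected 4.47.1
print("Pytorch:", torch.__version__)              # expected 2.5.1+cu124
print("Datasets:", datasets.__version__)          # expected 3.0.1
print("Tokenizers:", tokenizers.__version__)      # expected 0.21.0
```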