smol-135-tq-augment

This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M on an unknown dataset. It achieves the following results on the evaluation set (see the note and sketch after the list):

  • Loss: 0.1821
  • < Precision: 0.9430
  • < Recall: 0.9490
  • < F1-score: 0.9460
  • < Support: 4551.0
  • > Precision: 0.9461
  • > Recall: 0.9455
  • > F1-score: 0.9458
  • > Support: 4551.0
  • = Precision: 0.8177
  • = Recall: 0.7940
  • = F1-score: 0.8056
  • = Support: 898.0
  • - Precision: 0.0
  • - Recall: 0.0
  • - F1-score: 0.0
  • - Support: 0.0
  • Accuracy: 0.9335
  • Macro Avg Precision: 0.6767
  • Macro Avg Recall: 0.6721
  • Macro Avg F1-score: 0.6744
  • Macro Avg Support: 10000.0
  • Weighted Avg Precision: 0.9332
  • Weighted Avg Recall: 0.9335
  • Weighted Avg F1-score: 0.9333
  • Weighted Avg Support: 10000.0
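
The "-" class has zero support in this evaluation set, which is why its scores are 0.0 and why the macro averages sit well below the weighted ones ((0.9430 + 0.9461 + 0.8177 + 0.0) / 4 ≈ 0.6767). Below is a minimal sketch of how a report with this shape can be produced with scikit-learn's classification_report; the labels and predictions are placeholders, not the author's evaluation code.

```python
# Minimal sketch (not the author's evaluation code): a report with the same
# shape as the metrics above, via scikit-learn. y_true/y_pred are placeholders.
from sklearn.metrics import classification_report

labels = ["<", ">", "=", "-"]        # the four class labels reported above
y_true = ["<", "<", ">", "=", ">"]   # placeholder gold labels
y_pred = ["<", ">", ">", "=", ">"]   # placeholder model predictions

# zero_division=0 reports 0.0 precision/recall/F1 for the zero-support "-"
# class instead of emitting warnings, matching the numbers above.
print(classification_report(y_true, y_pred, labels=labels, digits=4, zero_division=0))
```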

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 512
  • total_eval_batch_size: 256
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: reduce_lr_on_plateau
  • num_epochs: 30
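
The listed values imply an effective train batch size of 64 per device × 4 GPUs × 2 accumulation steps = 512, and an effective eval batch size of 64 × 4 = 256. As a rough illustration, the configuration below sketches equivalent transformers TrainingArguments; the output_dir is an assumption and this is not the author's training script.

```python
# Hypothetical sketch of TrainingArguments mirroring the hyperparameters above.
# output_dir is an assumption; the betas/epsilon shown are the AdamW defaults.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="smol-135-tq-augment",   # assumed checkpoint directory
    learning_rate=1e-3,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=2,      # 64 x 4 GPUs x 2 steps = 512 effective
    num_train_epochs=30,
    seed=42,
    optim="adamw_torch",                # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="reduce_lr_on_plateau",
)
# The 4-GPU distributed setup comes from the launcher (e.g. torchrun or
# accelerate), not from these arguments.
```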

Training results

| Training Loss | Epoch | Step | Validation Loss | < Precision | < Recall | < F1-score | < Support | > Precision | > Recall | > F1-score | > Support | = Precision | = Recall | = F1-score | = Support | - Precision | - Recall | - F1-score | - Support | Accuracy | Macro Avg Precision | Macro Avg Recall | Macro Avg F1-score | Macro Avg Support | Weighted Avg Precision | Weighted Avg Recall | Weighted Avg F1-score | Weighted Avg Support |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.6963 | 1.0 | 150 | 0.3731 | 0.6497 | 0.6799 | 0.6644 | 4551.0 | 0.6723 | 0.6504 | 0.6612 | 4551.0 | 0.5126 | 0.4766 | 0.4939 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.6482 | 0.4586 | 0.4517 | 0.4549 | 10000.0 | 0.6477 | 0.6482 | 0.6476 | 10000.0 |
| 0.5542 | 2.0 | 300 | 0.3164 | 0.7453 | 0.7020 | 0.7230 | 4551.0 | 0.7084 | 0.7739 | 0.7397 | 4551.0 | 0.7314 | 0.6036 | 0.6614 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.7259 | 0.5463 | 0.5199 | 0.5310 | 10000.0 | 0.7272 | 0.7259 | 0.7251 | 10000.0 |
| 0.4143 | 3.0 | 450 | 0.2629 | 0.8327 | 0.8062 | 0.8192 | 4551.0 | 0.8062 | 0.8446 | 0.8250 | 4551.0 | 0.7409 | 0.6815 | 0.7100 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8125 | 0.5950 | 0.5831 | 0.5886 | 10000.0 | 0.8124 | 0.8125 | 0.8120 | 10000.0 |
| 0.2789 | 4.0 | 600 | 0.2197 | 0.8577 | 0.8943 | 0.8756 | 4551.0 | 0.8718 | 0.8772 | 0.8745 | 4551.0 | 0.8609 | 0.6481 | 0.7395 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8644 | 0.6476 | 0.6049 | 0.6224 | 10000.0 | 0.8644 | 0.8644 | 0.8629 | 10000.0 |
| 0.2502 | 5.0 | 750 | 0.2087 | 0.9133 | 0.8890 | 0.9010 | 4551.0 | 0.8798 | 0.9229 | 0.9008 | 4551.0 | 0.8279 | 0.7339 | 0.7780 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8905 | 0.6552 | 0.6364 | 0.6450 | 10000.0 | 0.8904 | 0.8905 | 0.8899 | 10000.0 |
| 0.2069 | 6.0 | 900 | 0.1898 | 0.9226 | 0.9011 | 0.9117 | 4551.0 | 0.8972 | 0.9303 | 0.9135 | 4551.0 | 0.8266 | 0.7695 | 0.7970 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9026 | 0.6616 | 0.6502 | 0.6556 | 10000.0 | 0.9024 | 0.9026 | 0.9022 | 10000.0 |
| 0.2056 | 7.0 | 1050 | 0.1876 | 0.9204 | 0.9174 | 0.9189 | 4551.0 | 0.9118 | 0.9308 | 0.9212 | 4551.0 | 0.8301 | 0.7561 | 0.7914 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9090 | 0.6656 | 0.6511 | 0.6579 | 10000.0 | 0.9084 | 0.9090 | 0.9085 | 10000.0 |
| 0.1686 | 8.0 | 1200 | 0.1837 | 0.9239 | 0.9336 | 0.9287 | 4551.0 | 0.9298 | 0.9286 | 0.9292 | 4551.0 | 0.8178 | 0.7795 | 0.7982 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9175 | 0.6679 | 0.6604 | 0.6640 | 10000.0 | 0.9171 | 0.9175 | 0.9172 | 10000.0 |
| 0.1580 | 9.0 | 1350 | 0.1822 | 0.9178 | 0.9402 | 0.9289 | 4551.0 | 0.9448 | 0.9178 | 0.9311 | 4551.0 | 0.7797 | 0.7962 | 0.7879 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9171 | 0.6606 | 0.6636 | 0.6620 | 10000.0 | 0.9177 | 0.9171 | 0.9172 | 10000.0 |
| 0.1849 | 10.0 | 1500 | 0.1930 | 0.9227 | 0.9308 | 0.9267 | 4551.0 | 0.9255 | 0.9260 | 0.9257 | 4551.0 | 0.8026 | 0.7650 | 0.7834 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9137 | 0.6627 | 0.6554 | 0.6590 | 10000.0 | 0.9132 | 0.9137 | 0.9134 | 10000.0 |
| 0.1407 | 11.0 | 1650 | 0.1726 | 0.9408 | 0.9459 | 0.9434 | 4551.0 | 0.9459 | 0.9383 | 0.9421 | 4551.0 | 0.8022 | 0.8129 | 0.8075 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9305 | 0.6722 | 0.6743 | 0.6732 | 10000.0 | 0.9307 | 0.9305 | 0.9306 | 10000.0 |
| 0.1387 | 12.0 | 1800 | 0.1801 | 0.9404 | 0.9426 | 0.9415 | 4551.0 | 0.9414 | 0.9422 | 0.9418 | 4551.0 | 0.8075 | 0.7940 | 0.8007 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9291 | 0.6723 | 0.6697 | 0.6710 | 10000.0 | 0.9289 | 0.9291 | 0.9290 | 10000.0 |
| 0.1359 | 13.0 | 1950 | 0.1780 | 0.9428 | 0.9411 | 0.9419 | 4551.0 | 0.9385 | 0.9455 | 0.9420 | 4551.0 | 0.8268 | 0.8029 | 0.8147 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9307 | 0.6770 | 0.6724 | 0.6747 | 10000.0 | 0.9304 | 0.9307 | 0.9305 | 10000.0 |
| 0.1284 | 14.0 | 2100 | 0.1785 | 0.9445 | 0.9466 | 0.9456 | 4551.0 | 0.9452 | 0.9433 | 0.9442 | 4551.0 | 0.8004 | 0.7996 | 0.8000 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9319 | 0.6725 | 0.6724 | 0.6725 | 10000.0 | 0.9319 | 0.9319 | 0.9319 | 10000.0 |
| 0.1339 | 15.0 | 2250 | 0.1810 | 0.9474 | 0.9413 | 0.9443 | 4551.0 | 0.9406 | 0.9492 | 0.9449 | 4551.0 | 0.8124 | 0.8007 | 0.8065 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9323 | 0.6751 | 0.6728 | 0.6739 | 10000.0 | 0.9322 | 0.9323 | 0.9322 | 10000.0 |
| 0.1294 | 16.0 | 2400 | 0.1821 | 0.9430 | 0.9490 | 0.9460 | 4551.0 | 0.9461 | 0.9455 | 0.9458 | 4551.0 | 0.8177 | 0.7940 | 0.8056 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9335 | 0.6767 | 0.6721 | 0.6744 | 10000.0 | 0.9332 | 0.9335 | 0.9333 | 10000.0 |
| 0.1383 | 17.0 | 2550 | 0.1828 | 0.9453 | 0.9464 | 0.9459 | 4551.0 | 0.9443 | 0.9470 | 0.9457 | 4551.0 | 0.8125 | 0.7962 | 0.8043 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9332 | 0.6755 | 0.6724 | 0.6740 | 10000.0 | 0.9330 | 0.9332 | 0.9331 | 10000.0 |
| 0.1260 | 18.0 | 2700 | 0.1856 | 0.9418 | 0.9466 | 0.9442 | 4551.0 | 0.9426 | 0.9426 | 0.9426 | 4551.0 | 0.8149 | 0.7940 | 0.8043 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9311 | 0.6748 | 0.6708 | 0.6728 | 10000.0 | 0.9308 | 0.9311 | 0.9309 | 10000.0 |
| 0.1360 | 19.0 | 2850 | 0.1851 | 0.9459 | 0.9442 | 0.9450 | 4551.0 | 0.9415 | 0.9486 | 0.9451 | 4551.0 | 0.8200 | 0.7962 | 0.8079 | 898.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.9329 | 0.6768 | 0.6722 | 0.6745 | 10000.0 | 0.9326 | 0.9329 | 0.9327 | 10000.0 |

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.21.0