# ST1_modernbert-base_product-category_V1
This model is a fine-tuned version of answerdotai/ModernBERT-base for product-category classification, trained on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 3.1500
- F1: 0.7538
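
This card does not include usage instructions, so the following is a minimal inference sketch assuming the checkpoint is a standard sequence-classification fine-tune of ModernBERT-base. The example product title is illustrative, and the category label set is not documented here.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub (repo id as listed for this card).
classifier = pipeline(
    "text-classification",
    model="BenPhan/ST1_modernbert-base_product-category_V1",
)

# Example product title; the actual category label names are not documented on this card.
print(classifier("Wireless noise-cancelling over-ear headphones"))
```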
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 36
- eval_batch_size: 16
- seed: 42
- optimizer: adamw_torch_fused (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 200
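
The hyperparameters above map directly onto `TrainingArguments`. The sketch below shows that mapping only; the model, tokenized datasets, and metric function are not described on this card and would have to be supplied separately.

```python
from transformers import TrainingArguments

# Values taken from the hyperparameter list above; output_dir is illustrative.
training_args = TrainingArguments(
    output_dir="ST1_modernbert-base_product-category_V1",
    learning_rate=5e-5,
    per_device_train_batch_size=36,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",   # fused AdamW (OptimizerNames.ADAMW_TORCH_FUSED)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    eval_strategy="epoch",       # the results table reports one row per epoch
)
# Passing these arguments to a Trainer together with the model and datasets
# would reproduce the configuration described above.
```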
### Training results
Training Loss | Epoch | Step | Validation Loss | F1 |
---|---|---|---|---|
2.0397 | 1.0 | 128 | 1.1359 | 0.6661 |
0.9617 | 2.0 | 256 | 0.9647 | 0.7399 |
0.6183 | 3.0 | 384 | 1.0818 | 0.7369 |
0.2399 | 4.0 | 512 | 1.2012 | 0.7504 |
0.144 | 5.0 | 640 | 1.4676 | 0.7292 |
0.103 | 6.0 | 768 | 1.5944 | 0.7329 |
0.0938 | 7.0 | 896 | 1.7955 | 0.7481 |
0.0424 | 8.0 | 1024 | 1.9380 | 0.7243 |
0.046 | 9.0 | 1152 | 2.0580 | 0.7499 |
0.0339 | 10.0 | 1280 | 1.8773 | 0.7423 |
0.0229 | 11.0 | 1408 | 2.0289 | 0.7640 |
0.0144 | 12.0 | 1536 | 1.8883 | 0.7340 |
0.0211 | 13.0 | 1664 | 2.0626 | 0.7414 |
0.036 | 14.0 | 1792 | 2.2439 | 0.7518 |
0.0109 | 15.0 | 1920 | 2.3987 | 0.7413 |
0.0138 | 16.0 | 2048 | 2.3014 | 0.7466 |
0.0198 | 17.0 | 2176 | 2.4039 | 0.7489 |
0.0156 | 18.0 | 2304 | 2.8102 | 0.7382 |
0.0248 | 19.0 | 2432 | 2.5557 | 0.7529 |
0.0277 | 20.0 | 2560 | 2.5543 | 0.7493 |
0.0172 | 21.0 | 2688 | 2.4589 | 0.7461 |
0.0138 | 22.0 | 2816 | 2.5041 | 0.7487 |
0.0053 | 23.0 | 2944 | 2.5957 | 0.7572 |
0.0095 | 24.0 | 3072 | 2.7520 | 0.7450 |
0.0078 | 25.0 | 3200 | 2.7041 | 0.7524 |
0.0148 | 26.0 | 3328 | 2.7179 | 0.7455 |
0.0235 | 27.0 | 3456 | 2.9401 | 0.7517 |
0.0228 | 28.0 | 3584 | 2.7662 | 0.7514 |
0.0093 | 29.0 | 3712 | 2.8684 | 0.7486 |
0.0292 | 30.0 | 3840 | 2.8222 | 0.7553 |
0.0144 | 31.0 | 3968 | 2.6264 | 0.7770 |
0.0074 | 32.0 | 4096 | 2.6289 | 0.7607 |
0.0227 | 33.0 | 4224 | 2.8246 | 0.7636 |
0.0047 | 34.0 | 4352 | 2.4470 | 0.7664 |
0.0125 | 35.0 | 4480 | 2.5086 | 0.7420 |
0.0337 | 36.0 | 4608 | 2.6435 | 0.7424 |
0.0191 | 37.0 | 4736 | 2.7223 | 0.7637 |
0.0225 | 38.0 | 4864 | 2.6102 | 0.7693 |
0.0095 | 39.0 | 4992 | 2.6686 | 0.7816 |
0.0013 | 40.0 | 5120 | 2.7064 | 0.7771 |
0.0012 | 41.0 | 5248 | 2.7012 | 0.7787 |
0.0017 | 42.0 | 5376 | 2.7380 | 0.7797 |
0.0011 | 43.0 | 5504 | 2.7412 | 0.7790 |
0.0007 | 44.0 | 5632 | 2.7403 | 0.7791 |
0.001 | 45.0 | 5760 | 2.6960 | 0.7782 |
0.001 | 46.0 | 5888 | 2.7245 | 0.7739 |
0.0013 | 47.0 | 6016 | 2.7277 | 0.7795 |
0.0007 | 48.0 | 6144 | 2.7371 | 0.7791 |
0.0004 | 49.0 | 6272 | 2.7430 | 0.7755 |
0.0011 | 50.0 | 6400 | 2.7259 | 0.7815 |
0.001 | 51.0 | 6528 | 2.7587 | 0.7786 |
0.0009 | 52.0 | 6656 | 2.7626 | 0.7803 |
0.0009 | 53.0 | 6784 | 2.7580 | 0.7806 |
0.0005 | 54.0 | 6912 | 2.7569 | 0.7804 |
0.0016 | 55.0 | 7040 | 2.7532 | 0.7782 |
0.0006 | 56.0 | 7168 | 2.7573 | 0.7817 |
0.0009 | 57.0 | 7296 | 2.7386 | 0.7790 |
0.0004 | 58.0 | 7424 | 2.7708 | 0.7806 |
0.0015 | 59.0 | 7552 | 2.7834 | 0.7806 |
0.0003 | 60.0 | 7680 | 2.7810 | 0.7809 |
0.0012 | 61.0 | 7808 | 2.7807 | 0.7785 |
0.0006 | 62.0 | 7936 | 2.7704 | 0.7827 |
0.0009 | 63.0 | 8064 | 2.7908 | 0.7827 |
0.0009 | 64.0 | 8192 | 2.7821 | 0.7781 |
0.0006 | 65.0 | 8320 | 2.7978 | 0.7827 |
0.0012 | 66.0 | 8448 | 2.7880 | 0.7804 |
0.0004 | 67.0 | 8576 | 2.8033 | 0.7825 |
0.0005 | 68.0 | 8704 | 2.8128 | 0.7749 |
0.0017 | 69.0 | 8832 | 2.8196 | 0.7821 |
0.0011 | 70.0 | 8960 | 2.8229 | 0.7800 |
0.0003 | 71.0 | 9088 | 2.8190 | 0.7800 |
0.001 | 72.0 | 9216 | 2.8253 | 0.7800 |
0.0002 | 73.0 | 9344 | 2.8128 | 0.7811 |
0.0013 | 74.0 | 9472 | 2.8295 | 0.7800 |
0.0009 | 75.0 | 9600 | 2.8273 | 0.7822 |
0.0007 | 76.0 | 9728 | 2.8397 | 0.7787 |
0.0008 | 77.0 | 9856 | 2.8416 | 0.7787 |
0.0008 | 78.0 | 9984 | 2.8112 | 0.7714 |
0.0005 | 79.0 | 10112 | 2.8341 | 0.7715 |
0.0013 | 80.0 | 10240 | 2.8288 | 0.7776 |
0.0011 | 81.0 | 10368 | 2.8894 | 0.7759 |
0.0009 | 82.0 | 10496 | 2.8632 | 0.7751 |
0.0011 | 83.0 | 10624 | 2.8610 | 0.7754 |
0.0005 | 84.0 | 10752 | 2.8867 | 0.7729 |
0.0006 | 85.0 | 10880 | 2.8607 | 0.7836 |
0.0008 | 86.0 | 11008 | 2.8683 | 0.7758 |
0.001 | 87.0 | 11136 | 2.8716 | 0.7782 |
0.0004 | 88.0 | 11264 | 2.8688 | 0.7782 |
0.0012 | 89.0 | 11392 | 2.8807 | 0.7764 |
0.0009 | 90.0 | 11520 | 2.8823 | 0.7749 |
0.001 | 91.0 | 11648 | 2.8813 | 0.7765 |
0.0 | 92.0 | 11776 | 2.8886 | 0.7782 |
0.0008 | 93.0 | 11904 | 2.8883 | 0.7727 |
0.0009 | 94.0 | 12032 | 2.8953 | 0.7710 |
0.0007 | 95.0 | 12160 | 2.8998 | 0.7764 |
0.0003 | 96.0 | 12288 | 2.9062 | 0.7756 |
0.0009 | 97.0 | 12416 | 2.9045 | 0.7748 |
0.0004 | 98.0 | 12544 | 2.9242 | 0.7749 |
0.0008 | 99.0 | 12672 | 2.8354 | 0.7785 |
0.0796 | 100.0 | 12800 | 2.5102 | 0.7457 |
0.1145 | 101.0 | 12928 | 2.6841 | 0.7522 |
0.0296 | 102.0 | 13056 | 2.8246 | 0.7323 |
0.0159 | 103.0 | 13184 | 2.7918 | 0.7340 |
0.0047 | 104.0 | 13312 | 2.8134 | 0.7407 |
0.0006 | 105.0 | 13440 | 2.8223 | 0.7396 |
0.001 | 106.0 | 13568 | 2.9223 | 0.7427 |
0.0021 | 107.0 | 13696 | 2.9052 | 0.7454 |
0.0006 | 108.0 | 13824 | 2.9146 | 0.7506 |
0.001 | 109.0 | 13952 | 2.9090 | 0.7486 |
0.0007 | 110.0 | 14080 | 2.9166 | 0.7526 |
0.0009 | 111.0 | 14208 | 2.9191 | 0.7466 |
0.0006 | 112.0 | 14336 | 2.9206 | 0.7500 |
0.0007 | 113.0 | 14464 | 2.9245 | 0.7500 |
0.0007 | 114.0 | 14592 | 2.9265 | 0.7521 |
0.0004 | 115.0 | 14720 | 2.9316 | 0.7501 |
0.0006 | 116.0 | 14848 | 2.9349 | 0.7539 |
0.0006 | 117.0 | 14976 | 2.9335 | 0.7519 |
0.0009 | 118.0 | 15104 | 2.9385 | 0.7495 |
0.0006 | 119.0 | 15232 | 2.9471 | 0.7519 |
0.0009 | 120.0 | 15360 | 2.9502 | 0.7493 |
0.0003 | 121.0 | 15488 | 2.9488 | 0.7516 |
0.0004 | 122.0 | 15616 | 2.9575 | 0.7516 |
0.001 | 123.0 | 15744 | 2.9512 | 0.7529 |
0.0005 | 124.0 | 15872 | 2.9597 | 0.7516 |
0.0005 | 125.0 | 16000 | 2.9621 | 0.7529 |
0.0009 | 126.0 | 16128 | 2.9661 | 0.7529 |
0.0006 | 127.0 | 16256 | 2.9651 | 0.7509 |
0.0007 | 128.0 | 16384 | 2.9742 | 0.7508 |
0.0007 | 129.0 | 16512 | 2.9781 | 0.7529 |
0.0007 | 130.0 | 16640 | 2.9807 | 0.7505 |
0.0004 | 131.0 | 16768 | 2.9855 | 0.7504 |
0.0007 | 132.0 | 16896 | 2.9804 | 0.7509 |
0.0002 | 133.0 | 17024 | 2.9816 | 0.7561 |
0.0012 | 134.0 | 17152 | 2.9952 | 0.7501 |
0.0007 | 135.0 | 17280 | 2.9941 | 0.7522 |
0.0009 | 136.0 | 17408 | 2.9973 | 0.7537 |
0.0004 | 137.0 | 17536 | 2.9987 | 0.7537 |
0.0014 | 138.0 | 17664 | 3.0021 | 0.7521 |
0.0004 | 139.0 | 17792 | 3.0044 | 0.7521 |
0.0007 | 140.0 | 17920 | 3.0095 | 0.7516 |
0.0004 | 141.0 | 18048 | 3.0137 | 0.7537 |
0.0005 | 142.0 | 18176 | 3.0210 | 0.7539 |
0.0007 | 143.0 | 18304 | 3.0259 | 0.7539 |
0.0004 | 144.0 | 18432 | 3.0249 | 0.7540 |
0.0011 | 145.0 | 18560 | 3.0261 | 0.7517 |
0.0006 | 146.0 | 18688 | 3.0299 | 0.7536 |
0.0009 | 147.0 | 18816 | 3.0334 | 0.7540 |
0.0002 | 148.0 | 18944 | 3.0421 | 0.7576 |
0.0011 | 149.0 | 19072 | 3.0379 | 0.7576 |
0.0007 | 150.0 | 19200 | 3.0405 | 0.7555 |
0.0006 | 151.0 | 19328 | 3.0509 | 0.7556 |
0.0009 | 152.0 | 19456 | 3.0489 | 0.7538 |
0.0004 | 153.0 | 19584 | 3.0532 | 0.7559 |
0.0009 | 154.0 | 19712 | 3.0591 | 0.7535 |
0.0006 | 155.0 | 19840 | 3.0563 | 0.7535 |
0.0007 | 156.0 | 19968 | 3.0635 | 0.7535 |
0.0004 | 157.0 | 20096 | 3.0679 | 0.7535 |
0.0009 | 158.0 | 20224 | 3.0686 | 0.7538 |
0.0004 | 159.0 | 20352 | 3.0719 | 0.7535 |
0.0005 | 160.0 | 20480 | 3.0798 | 0.7556 |
0.0004 | 161.0 | 20608 | 3.0773 | 0.7538 |
0.0009 | 162.0 | 20736 | 3.0802 | 0.7538 |
0.0008 | 163.0 | 20864 | 3.0832 | 0.7538 |
0.0002 | 164.0 | 20992 | 3.0835 | 0.7538 |
0.0006 | 165.0 | 21120 | 3.0912 | 0.7538 |
0.0009 | 166.0 | 21248 | 3.0921 | 0.7519 |
0.0004 | 167.0 | 21376 | 3.0970 | 0.7538 |
0.0005 | 168.0 | 21504 | 3.0997 | 0.7538 |
0.0008 | 169.0 | 21632 | 3.1082 | 0.7538 |
0.0006 | 170.0 | 21760 | 3.1084 | 0.7538 |
0.0002 | 171.0 | 21888 | 3.1156 | 0.7535 |
0.0004 | 172.0 | 22016 | 3.1164 | 0.7538 |
0.0006 | 173.0 | 22144 | 3.1149 | 0.7559 |
0.0009 | 174.0 | 22272 | 3.1236 | 0.7522 |
0.0008 | 175.0 | 22400 | 3.1219 | 0.7538 |
0.0008 | 176.0 | 22528 | 3.1236 | 0.7522 |
0.0004 | 177.0 | 22656 | 3.1242 | 0.7538 |
0.0008 | 178.0 | 22784 | 3.1230 | 0.7538 |
0.0005 | 179.0 | 22912 | 3.1316 | 0.7538 |
0.0002 | 180.0 | 23040 | 3.1308 | 0.7538 |
0.0006 | 181.0 | 23168 | 3.1302 | 0.7538 |
0.0012 | 182.0 | 23296 | 3.1332 | 0.7538 |
0.0008 | 183.0 | 23424 | 3.1409 | 0.7538 |
0.0006 | 184.0 | 23552 | 3.1350 | 0.7538 |
0.0004 | 185.0 | 23680 | 3.1352 | 0.7538 |
0.0008 | 186.0 | 23808 | 3.1401 | 0.7538 |
0.0006 | 187.0 | 23936 | 3.1409 | 0.7538 |
0.0006 | 188.0 | 24064 | 3.1387 | 0.7538 |
0.0004 | 189.0 | 24192 | 3.1466 | 0.7538 |
0.0004 | 190.0 | 24320 | 3.1518 | 0.7538 |
0.0006 | 191.0 | 24448 | 3.1505 | 0.7538 |
0.0008 | 192.0 | 24576 | 3.1482 | 0.7538 |
0.0008 | 193.0 | 24704 | 3.1458 | 0.7522 |
0.0004 | 194.0 | 24832 | 3.1473 | 0.7538 |
0.0006 | 195.0 | 24960 | 3.1493 | 0.7538 |
0.0012 | 196.0 | 25088 | 3.1480 | 0.7538 |
0.0006 | 197.0 | 25216 | 3.1499 | 0.7538 |
0.0008 | 198.0 | 25344 | 3.1504 | 0.7538 |
0.0006 | 199.0 | 25472 | 3.1493 | 0.7538 |
0.0006 | 200.0 | 25600 | 3.1500 | 0.7538 |
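
The F1 column above is computed once per epoch on the validation set. The averaging mode is not stated on this card; the sketch below assumes a weighted-average F1 via scikit-learn, which is a common choice but an assumption here.

```python
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    """Per-epoch F1 for the Trainer; weighted averaging is an assumption."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"f1": f1_score(labels, predictions, average="weighted")}
```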
### Framework versions
- Transformers 4.48.0.dev0
- Pytorch 2.4.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0