ST1_modernbert-base_product-category_V1

This model is a fine-tuned version of answerdotai/ModernBERT-base for product-category classification on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1500
  • F1: 0.7538

Model description

More information needed

Intended uses & limitations

More information needed
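Although the intended use is not documented, the model name and base architecture suggest multi-class product-category classification. Below is a minimal inference sketch, assuming the checkpoint loads with a standard sequence-classification head; the label names come from the repository's config.json and are not documented here:

```python
from transformers import pipeline

# Hedged sketch: assumes the checkpoint was saved with an
# AutoModelForSequenceClassification-compatible head. The label ids map to
# product categories defined by the (undocumented) training dataset.
classifier = pipeline(
    "text-classification",
    model="BenPhan/ST1_modernbert-base_product-category_V1",
)

print(classifier("Wireless noise-cancelling over-ear headphones"))
# -> e.g. [{'label': 'LABEL_12', 'score': 0.97}]  (labels depend on config)
```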

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 36
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch_fused (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 200
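For reference, the settings above translate roughly to the following TrainingArguments. This is a sketch only; the actual training script, dataset, and metric configuration are not published:

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; the dataset, model init, and the
# compute_metrics function are not documented in this card.
training_args = TrainingArguments(
    output_dir="ST1_modernbert-base_product-category_V1",
    learning_rate=5e-5,
    per_device_train_batch_size=36,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",   # AdamW, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=200,
    eval_strategy="epoch",
    logging_strategy="epoch",
)
```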

Training results

Training Loss | Epoch | Step | Validation Loss | F1
2.0397 1.0 128 1.1359 0.6661
0.9617 2.0 256 0.9647 0.7399
0.6183 3.0 384 1.0818 0.7369
0.2399 4.0 512 1.2012 0.7504
0.144 5.0 640 1.4676 0.7292
0.103 6.0 768 1.5944 0.7329
0.0938 7.0 896 1.7955 0.7481
0.0424 8.0 1024 1.9380 0.7243
0.046 9.0 1152 2.0580 0.7499
0.0339 10.0 1280 1.8773 0.7423
0.0229 11.0 1408 2.0289 0.7640
0.0144 12.0 1536 1.8883 0.7340
0.0211 13.0 1664 2.0626 0.7414
0.036 14.0 1792 2.2439 0.7518
0.0109 15.0 1920 2.3987 0.7413
0.0138 16.0 2048 2.3014 0.7466
0.0198 17.0 2176 2.4039 0.7489
0.0156 18.0 2304 2.8102 0.7382
0.0248 19.0 2432 2.5557 0.7529
0.0277 20.0 2560 2.5543 0.7493
0.0172 21.0 2688 2.4589 0.7461
0.0138 22.0 2816 2.5041 0.7487
0.0053 23.0 2944 2.5957 0.7572
0.0095 24.0 3072 2.7520 0.7450
0.0078 25.0 3200 2.7041 0.7524
0.0148 26.0 3328 2.7179 0.7455
0.0235 27.0 3456 2.9401 0.7517
0.0228 28.0 3584 2.7662 0.7514
0.0093 29.0 3712 2.8684 0.7486
0.0292 30.0 3840 2.8222 0.7553
0.0144 31.0 3968 2.6264 0.7770
0.0074 32.0 4096 2.6289 0.7607
0.0227 33.0 4224 2.8246 0.7636
0.0047 34.0 4352 2.4470 0.7664
0.0125 35.0 4480 2.5086 0.7420
0.0337 36.0 4608 2.6435 0.7424
0.0191 37.0 4736 2.7223 0.7637
0.0225 38.0 4864 2.6102 0.7693
0.0095 39.0 4992 2.6686 0.7816
0.0013 40.0 5120 2.7064 0.7771
0.0012 41.0 5248 2.7012 0.7787
0.0017 42.0 5376 2.7380 0.7797
0.0011 43.0 5504 2.7412 0.7790
0.0007 44.0 5632 2.7403 0.7791
0.001 45.0 5760 2.6960 0.7782
0.001 46.0 5888 2.7245 0.7739
0.0013 47.0 6016 2.7277 0.7795
0.0007 48.0 6144 2.7371 0.7791
0.0004 49.0 6272 2.7430 0.7755
0.0011 50.0 6400 2.7259 0.7815
0.001 51.0 6528 2.7587 0.7786
0.0009 52.0 6656 2.7626 0.7803
0.0009 53.0 6784 2.7580 0.7806
0.0005 54.0 6912 2.7569 0.7804
0.0016 55.0 7040 2.7532 0.7782
0.0006 56.0 7168 2.7573 0.7817
0.0009 57.0 7296 2.7386 0.7790
0.0004 58.0 7424 2.7708 0.7806
0.0015 59.0 7552 2.7834 0.7806
0.0003 60.0 7680 2.7810 0.7809
0.0012 61.0 7808 2.7807 0.7785
0.0006 62.0 7936 2.7704 0.7827
0.0009 63.0 8064 2.7908 0.7827
0.0009 64.0 8192 2.7821 0.7781
0.0006 65.0 8320 2.7978 0.7827
0.0012 66.0 8448 2.7880 0.7804
0.0004 67.0 8576 2.8033 0.7825
0.0005 68.0 8704 2.8128 0.7749
0.0017 69.0 8832 2.8196 0.7821
0.0011 70.0 8960 2.8229 0.7800
0.0003 71.0 9088 2.8190 0.7800
0.001 72.0 9216 2.8253 0.7800
0.0002 73.0 9344 2.8128 0.7811
0.0013 74.0 9472 2.8295 0.7800
0.0009 75.0 9600 2.8273 0.7822
0.0007 76.0 9728 2.8397 0.7787
0.0008 77.0 9856 2.8416 0.7787
0.0008 78.0 9984 2.8112 0.7714
0.0005 79.0 10112 2.8341 0.7715
0.0013 80.0 10240 2.8288 0.7776
0.0011 81.0 10368 2.8894 0.7759
0.0009 82.0 10496 2.8632 0.7751
0.0011 83.0 10624 2.8610 0.7754
0.0005 84.0 10752 2.8867 0.7729
0.0006 85.0 10880 2.8607 0.7836
0.0008 86.0 11008 2.8683 0.7758
0.001 87.0 11136 2.8716 0.7782
0.0004 88.0 11264 2.8688 0.7782
0.0012 89.0 11392 2.8807 0.7764
0.0009 90.0 11520 2.8823 0.7749
0.001 91.0 11648 2.8813 0.7765
0.0 92.0 11776 2.8886 0.7782
0.0008 93.0 11904 2.8883 0.7727
0.0009 94.0 12032 2.8953 0.7710
0.0007 95.0 12160 2.8998 0.7764
0.0003 96.0 12288 2.9062 0.7756
0.0009 97.0 12416 2.9045 0.7748
0.0004 98.0 12544 2.9242 0.7749
0.0008 99.0 12672 2.8354 0.7785
0.0796 100.0 12800 2.5102 0.7457
0.1145 101.0 12928 2.6841 0.7522
0.0296 102.0 13056 2.8246 0.7323
0.0159 103.0 13184 2.7918 0.7340
0.0047 104.0 13312 2.8134 0.7407
0.0006 105.0 13440 2.8223 0.7396
0.001 106.0 13568 2.9223 0.7427
0.0021 107.0 13696 2.9052 0.7454
0.0006 108.0 13824 2.9146 0.7506
0.001 109.0 13952 2.9090 0.7486
0.0007 110.0 14080 2.9166 0.7526
0.0009 111.0 14208 2.9191 0.7466
0.0006 112.0 14336 2.9206 0.7500
0.0007 113.0 14464 2.9245 0.7500
0.0007 114.0 14592 2.9265 0.7521
0.0004 115.0 14720 2.9316 0.7501
0.0006 116.0 14848 2.9349 0.7539
0.0006 117.0 14976 2.9335 0.7519
0.0009 118.0 15104 2.9385 0.7495
0.0006 119.0 15232 2.9471 0.7519
0.0009 120.0 15360 2.9502 0.7493
0.0003 121.0 15488 2.9488 0.7516
0.0004 122.0 15616 2.9575 0.7516
0.001 123.0 15744 2.9512 0.7529
0.0005 124.0 15872 2.9597 0.7516
0.0005 125.0 16000 2.9621 0.7529
0.0009 126.0 16128 2.9661 0.7529
0.0006 127.0 16256 2.9651 0.7509
0.0007 128.0 16384 2.9742 0.7508
0.0007 129.0 16512 2.9781 0.7529
0.0007 130.0 16640 2.9807 0.7505
0.0004 131.0 16768 2.9855 0.7504
0.0007 132.0 16896 2.9804 0.7509
0.0002 133.0 17024 2.9816 0.7561
0.0012 134.0 17152 2.9952 0.7501
0.0007 135.0 17280 2.9941 0.7522
0.0009 136.0 17408 2.9973 0.7537
0.0004 137.0 17536 2.9987 0.7537
0.0014 138.0 17664 3.0021 0.7521
0.0004 139.0 17792 3.0044 0.7521
0.0007 140.0 17920 3.0095 0.7516
0.0004 141.0 18048 3.0137 0.7537
0.0005 142.0 18176 3.0210 0.7539
0.0007 143.0 18304 3.0259 0.7539
0.0004 144.0 18432 3.0249 0.7540
0.0011 145.0 18560 3.0261 0.7517
0.0006 146.0 18688 3.0299 0.7536
0.0009 147.0 18816 3.0334 0.7540
0.0002 148.0 18944 3.0421 0.7576
0.0011 149.0 19072 3.0379 0.7576
0.0007 150.0 19200 3.0405 0.7555
0.0006 151.0 19328 3.0509 0.7556
0.0009 152.0 19456 3.0489 0.7538
0.0004 153.0 19584 3.0532 0.7559
0.0009 154.0 19712 3.0591 0.7535
0.0006 155.0 19840 3.0563 0.7535
0.0007 156.0 19968 3.0635 0.7535
0.0004 157.0 20096 3.0679 0.7535
0.0009 158.0 20224 3.0686 0.7538
0.0004 159.0 20352 3.0719 0.7535
0.0005 160.0 20480 3.0798 0.7556
0.0004 161.0 20608 3.0773 0.7538
0.0009 162.0 20736 3.0802 0.7538
0.0008 163.0 20864 3.0832 0.7538
0.0002 164.0 20992 3.0835 0.7538
0.0006 165.0 21120 3.0912 0.7538
0.0009 166.0 21248 3.0921 0.7519
0.0004 167.0 21376 3.0970 0.7538
0.0005 168.0 21504 3.0997 0.7538
0.0008 169.0 21632 3.1082 0.7538
0.0006 170.0 21760 3.1084 0.7538
0.0002 171.0 21888 3.1156 0.7535
0.0004 172.0 22016 3.1164 0.7538
0.0006 173.0 22144 3.1149 0.7559
0.0009 174.0 22272 3.1236 0.7522
0.0008 175.0 22400 3.1219 0.7538
0.0008 176.0 22528 3.1236 0.7522
0.0004 177.0 22656 3.1242 0.7538
0.0008 178.0 22784 3.1230 0.7538
0.0005 179.0 22912 3.1316 0.7538
0.0002 180.0 23040 3.1308 0.7538
0.0006 181.0 23168 3.1302 0.7538
0.0012 182.0 23296 3.1332 0.7538
0.0008 183.0 23424 3.1409 0.7538
0.0006 184.0 23552 3.1350 0.7538
0.0004 185.0 23680 3.1352 0.7538
0.0008 186.0 23808 3.1401 0.7538
0.0006 187.0 23936 3.1409 0.7538
0.0006 188.0 24064 3.1387 0.7538
0.0004 189.0 24192 3.1466 0.7538
0.0004 190.0 24320 3.1518 0.7538
0.0006 191.0 24448 3.1505 0.7538
0.0008 192.0 24576 3.1482 0.7538
0.0008 193.0 24704 3.1458 0.7522
0.0004 194.0 24832 3.1473 0.7538
0.0006 195.0 24960 3.1493 0.7538
0.0012 196.0 25088 3.1480 0.7538
0.0006 197.0 25216 3.1499 0.7538
0.0008 198.0 25344 3.1504 0.7538
0.0006 199.0 25472 3.1493 0.7538
0.0006 200.0 25600 3.1500 0.7538

Note that validation loss climbs steadily after the first few epochs while training loss approaches zero, a clear sign of overfitting; the best validation F1 (0.7836) is reached at epoch 85, well before the final checkpoint's 0.7538. Selecting an earlier checkpoint (e.g. via load_best_model_at_end) would likely yield a stronger model.
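The F1 column comes from a compute_metrics callback, but the card does not say which averaging mode was used. Here is a sketch of a typical setup with the evaluate library, assuming weighted averaging over the category labels:

```python
import numpy as np
import evaluate

f1_metric = evaluate.load("f1")

def compute_metrics(eval_pred):
    # Sketch: the card does not state the F1 averaging mode; "weighted"
    # is a common choice for imbalanced multi-class product categories.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return f1_metric.compute(
        predictions=predictions, references=labels, average="weighted"
    )
```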

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.21.0
Model size

  • 150M parameters (F32, Safetensors)
