
bslm-entity-extraction-mt5-base-include-desc-normalized-tr243k

This model is a fine-tuned version of google/mt5-base for entity extraction; the fine-tuning dataset is not documented in this card. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.0123
  • Exact Match: 67.9183
  • F1 Score: 88.9881
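Since this is a standard mT5 seq2seq checkpoint, it can be loaded with the usual Transformers auto classes. The sketch below is a minimal, hedged example: the exact input format this model expects (including how entity descriptions are supplied) is not documented in this card, so the input string is a placeholder assumption.

```python
# Minimal inference sketch. The checkpoint is a seq2seq mT5 model, so it loads
# with AutoModelForSeq2SeqLM / AutoTokenizer. The input text below is a
# placeholder: the task's actual prompt format is not documented in this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "MittyN/bslm-entity-extraction-mt5-base-include-desc-normalized-tr243k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("your input text here", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```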

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent Seq2SeqTrainingArguments follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
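For reference, a minimal sketch of how the listed hyperparameters map onto Transformers Seq2SeqTrainingArguments. Dataset loading, preprocessing, and the Trainer setup are omitted because they are not documented in this card, and the output directory name is hypothetical.

```python
# Sketch only: reconstructs the listed hyperparameters as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bslm-entity-extraction-mt5-base",  # hypothetical path
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=2,
)
```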

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Exact Match | F1 Score |
|:--------------|:-------|:------|:----------------|:------------|:---------|
| 0.7787        | 0.0328 | 500   | 0.5270          | 0.0         | 0.0      |
| 0.0903        | 0.0656 | 1000  | 0.0410          | 40.0656     | 75.4663  |
| 0.0428        | 0.0984 | 1500  | 0.0276          | 49.4349     | 81.9724  |
| 0.0462        | 0.1312 | 2000  | 0.0276          | 50.4375     | 81.5147  |
| 0.0304        | 0.1640 | 2500  | 0.0232          | 54.2836     | 84.0088  |
| 0.0274        | 0.1968 | 3000  | 0.0218          | 55.7237     | 84.5312  |
| 0.0251        | 0.2296 | 3500  | 0.0205          | 56.7262     | 84.8972  |
| 0.0252        | 0.2624 | 4000  | 0.0209          | 55.0492     | 84.5563  |
| 0.0236        | 0.2953 | 4500  | 0.0185          | 60.2443     | 86.0929  |
| 0.0221        | 0.3281 | 5000  | 0.0194          | 57.6376     | 85.3742  |
| 0.0226        | 0.3609 | 5500  | 0.0179          | 61.3015     | 86.3940  |
| 0.025         | 0.3937 | 6000  | 0.0176          | 59.8979     | 86.1283  |
| 0.0211        | 0.4265 | 6500  | 0.0178          | 60.5177     | 86.3265  |
| 0.0206        | 0.4593 | 7000  | 0.0166          | 61.3380     | 86.6077  |
| 0.0194        | 0.4921 | 7500  | 0.0170          | 60.3536     | 86.3744  |
| 0.0184        | 0.5249 | 8000  | 0.0159          | 63.2155     | 87.2735  |
| 0.0192        | 0.5577 | 8500  | 0.0164          | 61.8848     | 86.8659  |
| 0.0181        | 0.5905 | 9000  | 0.0158          | 62.1035     | 86.9785  |
| 0.0186        | 0.6233 | 9500  | 0.0156          | 62.7598     | 87.2376  |
| 0.018         | 0.6561 | 10000 | 0.0151          | 64.2727     | 87.6065  |
| 0.0171        | 0.6889 | 10500 | 0.0154          | 62.6139     | 87.2134  |
| 0.0183        | 0.7217 | 11000 | 0.0145          | 64.7102     | 87.8215  |
| 0.0182        | 0.7545 | 11500 | 0.0150          | 62.9420     | 87.3372  |
| 0.017         | 0.7873 | 12000 | 0.0141          | 65.0747     | 87.9349  |
| 0.0183        | 0.8202 | 12500 | 0.0148          | 62.6504     | 87.2310  |
| 0.0179        | 0.8530 | 13000 | 0.0138          | 65.4575     | 87.9886  |
| 0.017         | 0.8858 | 13500 | 0.0136          | 65.8221     | 88.1741  |
| 0.0168        | 0.9186 | 14000 | 0.0140          | 64.6555     | 87.8573  |
| 0.017         | 0.9514 | 14500 | 0.0135          | 66.2778     | 88.3458  |
| 0.0174        | 0.9842 | 15000 | 0.0140          | 64.8195     | 88.0423  |
| 0.0154        | 1.0170 | 15500 | 0.0138          | 65.8039     | 88.2375  |
| 0.0154        | 1.0498 | 16000 | 0.0135          | 66.3507     | 88.3934  |
| 0.015         | 1.0826 | 16500 | 0.0135          | 65.9861     | 88.3272  |
| 0.0151        | 1.1154 | 17000 | 0.0139          | 65.5851     | 88.2204  |
| 0.0153        | 1.1482 | 17500 | 0.0131          | 67.5355     | 88.7772  |
| 0.0148        | 1.1810 | 18000 | 0.0136          | 66.3507     | 88.4478  |
| 0.015         | 1.2138 | 18500 | 0.0134          | 66.4054     | 88.5039  |
| 0.0154        | 1.2466 | 19000 | 0.0133          | 66.5877     | 88.5994  |
| 0.0139        | 1.2794 | 19500 | 0.0132          | 66.1502     | 88.4829  |
| 0.0156        | 1.3122 | 20000 | 0.0131          | 66.9705     | 88.6868  |
| 0.016         | 1.3451 | 20500 | 0.0127          | 67.0252     | 88.7032  |
| 0.0143        | 1.3779 | 21000 | 0.0130          | 67.0252     | 88.7021  |
| 0.0159        | 1.4107 | 21500 | 0.0128          | 67.2803     | 88.8236  |
| 0.0133        | 1.4435 | 22000 | 0.0129          | 67.3168     | 88.8505  |
| 0.0131        | 1.4763 | 22500 | 0.0127          | 67.1892     | 88.8617  |
| 0.0137        | 1.5091 | 23000 | 0.0130          | 67.0434     | 88.7488  |
| 0.0133        | 1.5419 | 23500 | 0.0126          | 67.6449     | 88.9151  |
| 0.0144        | 1.5747 | 24000 | 0.0127          | 67.3533     | 88.8633  |
| 0.0142        | 1.6075 | 24500 | 0.0125          | 67.4809     | 88.9516  |
| 0.0136        | 1.6403 | 25000 | 0.0128          | 66.8246     | 88.7465  |
| 0.0139        | 1.6731 | 25500 | 0.0132          | 66.0955     | 88.5128  |
| 0.0126        | 1.7059 | 26000 | 0.0127          | 67.8090     | 89.0277  |
| 0.0135        | 1.7387 | 26500 | 0.0126          | 67.5173     | 88.9308  |
| 0.0141        | 1.7715 | 27000 | 0.0124          | 67.6449     | 88.9314  |
| 0.0138        | 1.8043 | 27500 | 0.0123          | 68.0095     | 89.0386  |
| 0.0141        | 1.8371 | 28000 | 0.0123          | 68.0095     | 88.9919  |
| 0.0142        | 1.8700 | 28500 | 0.0121          | 68.4470     | 89.0863  |
| 0.0147        | 1.9028 | 29000 | 0.0124          | 67.8454     | 88.9933  |
| 0.0135        | 1.9356 | 29500 | 0.0124          | 67.6814     | 88.9077  |
| 0.014         | 1.9684 | 30000 | 0.0123          | 67.9183     | 88.9881  |
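The card does not state how Exact Match and F1 Score are computed. A common convention for seq2seq extraction is SQuAD-style string exact match and token-overlap F1; the sketch below implements those metrics under that assumption.

```python
# Assumption: SQuAD-style metrics. Exact match compares the full predicted
# string with the reference; F1 is token-overlap over whitespace tokens.
# This is a common convention, but the card does not confirm it was used.
from collections import Counter

def exact_match(pred: str, ref: str) -> float:
    return float(pred.strip() == ref.strip())

def token_f1(pred: str, ref: str) -> float:
    pred_tokens, ref_tokens = pred.split(), ref.split()
    common = Counter(pred_tokens) & Counter(ref_tokens)  # multiset overlap
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```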

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.1
Model size: 582M parameters (Safetensors, F32 tensors)

Model tree for MittyN/bslm-entity-extraction-mt5-base-include-desc-normalized-tr243k

Base model: google/mt5-base