results_model3

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the card lists the dataset as None). It achieves the following results on the evaluation set:

  • Loss: 5.8601

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 30
  • num_epochs: 60
  • mixed_precision_training: Native AMP
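
The learning-rate settings above (learning_rate 0.0001, linear scheduler, 30 warmup steps, 60 epochs) imply a schedule that ramps up for 30 steps and then decays linearly to zero. The sketch below reproduces that shape in plain Python so the rate at any logged step can be estimated. It is an illustration, not the training code: the card does not state the number of steps per epoch, so `STEPS_PER_EPOCH = 8976` is an assumption inferred from the results table (step 191488 is logged at epoch 21.3333), and `linear_lr` is a hypothetical helper name.

```python
# Values taken from the hyperparameter list on this card.
BASE_LR = 1e-4
WARMUP_STEPS = 30
NUM_EPOCHS = 60

# Assumption: inferred from the results table (191488 / 21.3333 ~= 8976),
# not stated anywhere on the card.
STEPS_PER_EPOCH = 8976
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    # Print the estimated rate at a few steps that appear in the results table.
    for step in (0, 15, 30, 1024, 20480, 405504, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

With only 30 warmup steps out of several hundred thousand, the warmup phase is negligible and the schedule is effectively a straight linear decay over the planned run.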

Training results

Training Loss Epoch Step Validation Loss
No log 0.1141 1024 6.0292
No log 0.2282 2048 5.6435
No log 0.3422 3072 5.3662
No log 0.4563 4096 5.1810
No log 0.5704 5120 5.0305
No log 0.6845 6144 4.9323
No log 0.7986 7168 4.8141
No log 0.9127 8192 4.7390
5.3905 1.0267 9216 4.7245
5.3905 1.1408 10240 4.5491
5.3905 1.2549 11264 4.5115
5.3905 1.3690 12288 4.5454
5.3905 1.4831 13312 4.4235
5.3905 1.5971 14336 4.3937
5.3905 1.7112 15360 4.3675
5.3905 1.8253 16384 4.3321
5.3905 1.9394 17408 4.3125
4.3288 2.0535 18432 4.2470
4.3288 2.1676 19456 4.3321
4.3288 2.2816 20480 4.2336
4.3288 2.3957 21504 4.2677
4.3288 2.5098 22528 4.2491
4.3288 2.6239 23552 4.3101
4.3288 2.7380 24576 4.3300
4.3288 2.8520 25600 4.3947
4.3288 2.9661 26624 4.3634
3.853 3.0802 27648 4.2981
3.853 3.1943 28672 4.4073
3.853 3.3084 29696 4.3586
3.853 3.4225 30720 4.5024
3.853 3.5365 31744 4.6206
3.853 3.6506 32768 4.5310
3.853 3.7647 33792 4.6789
3.853 3.8788 34816 4.5824
3.853 3.9929 35840 4.4508
3.5631 4.1070 36864 4.5873
3.5631 4.2210 37888 4.5861
3.5631 4.3351 38912 4.6930
3.5631 4.4492 39936 4.6392
3.5631 4.5633 40960 4.5271
3.5631 4.6774 41984 4.8197
3.5631 4.7914 43008 4.7696
3.5631 4.9055 44032 4.6841
3.3655 5.0196 45056 4.8238
3.3655 5.1337 46080 4.8119
3.3655 5.2478 47104 4.8520
3.3655 5.3619 48128 4.8713
3.3655 5.4759 49152 4.8548
3.3655 5.5900 50176 4.7750
3.3655 5.7041 51200 5.0075
3.3655 5.8182 52224 4.9843
3.3655 5.9323 53248 4.8895
3.2228 6.0463 54272 5.0877
3.2228 6.1604 55296 4.6181
3.2228 6.2745 56320 4.7398
3.2228 6.3886 57344 4.6617
3.2228 6.5027 58368 4.8633
3.2228 6.6168 59392 4.9870
3.2228 6.7308 60416 5.0021
3.2228 6.8449 61440 4.7422
3.2228 6.9590 62464 4.9250
3.1155 7.0731 63488 4.7348
3.1155 7.1872 64512 4.8952
3.1155 7.3012 65536 4.8318
3.1155 7.4153 66560 4.8476
3.1155 7.5294 67584 5.0057
3.1155 7.6435 68608 4.9427
3.1155 7.7576 69632 4.8623
3.1155 7.8717 70656 4.8452
3.1155 7.9857 71680 4.8021
3.0315 8.0998 72704 4.7947
3.0315 8.2139 73728 4.8884
3.0315 8.3280 74752 4.8440
3.0315 8.4421 75776 4.8452
3.0315 8.5561 76800 4.8929
3.0315 8.6702 77824 4.8412
3.0315 8.7843 78848 4.8926
3.0315 8.8984 79872 4.6749
2.9626 9.0125 80896 4.9204
2.9626 9.1266 81920 4.7448
2.9626 9.2406 82944 4.6707
2.9626 9.3547 83968 4.7409
2.9626 9.4688 84992 4.7350
2.9626 9.5829 86016 4.7875
2.9626 9.6970 87040 4.7588
2.9626 9.8111 88064 4.6576
2.9626 9.9251 89088 4.6979
2.9057 10.0392 90112 4.7358
2.9057 10.1533 91136 4.7766
2.9057 10.2674 92160 4.7182
2.9057 10.3815 93184 4.7157
2.9057 10.4955 94208 4.6437
2.9057 10.6096 95232 4.6506
2.9057 10.7237 96256 4.6468
2.9057 10.8378 97280 4.6038
2.9057 10.9519 98304 4.7661
2.857 11.0660 99328 4.7447
2.857 11.1800 100352 4.5949
2.857 11.2941 101376 4.6705
2.857 11.4082 102400 4.7022
2.857 11.5223 103424 4.6394
2.857 11.6364 104448 4.7558
2.857 11.7504 105472 4.7065
2.857 11.8645 106496 4.4721
2.857 11.9786 107520 4.6075
2.8152 12.0927 108544 4.6613
2.8152 12.2068 109568 4.6763
2.8152 12.3209 110592 4.5310
2.8152 12.4349 111616 4.6142
2.8152 12.5490 112640 4.5820
2.8152 12.6631 113664 4.6034
2.8152 12.7772 114688 4.6213
2.8152 12.8913 115712 4.6057
2.7789 13.0053 116736 4.6278
2.7789 13.1194 117760 4.5454
2.7789 13.2335 118784 4.6711
2.7789 13.3476 119808 4.5148
2.7789 13.4617 120832 4.5424
2.7789 13.5758 121856 4.5074
2.7789 13.6898 122880 4.5448
2.7789 13.8039 123904 4.5084
2.7789 13.9180 124928 4.4962
2.7457 14.0321 125952 4.5270
2.7457 14.1462 126976 4.3997
2.7457 14.2602 128000 4.5798
2.7457 14.3743 129024 4.5839
2.7457 14.4884 130048 4.5679
2.7457 14.6025 131072 4.4674
2.7457 14.7166 132096 4.4471
2.7457 14.8307 133120 4.3811
2.7457 14.9447 134144 4.4387
2.7165 15.0588 135168 4.4756
2.7165 15.1729 136192 4.5638
2.7165 15.2870 137216 4.4033
2.7165 15.4011 138240 4.4876
2.7165 15.5152 139264 4.3874
2.7165 15.6292 140288 4.4200
2.7165 15.7433 141312 4.5077
2.7165 15.8574 142336 4.4537
2.7165 15.9715 143360 4.4381
2.6895 16.0856 144384 4.5636
2.6895 16.1996 145408 4.3530
2.6895 16.3137 146432 4.3760
2.6895 16.4278 147456 4.4327
2.6895 16.5419 148480 4.3666
2.6895 16.6560 149504 4.3708
2.6895 16.7701 150528 4.3945
2.6895 16.8841 151552 4.3781
2.6895 16.9982 152576 4.4506
2.6652 17.1123 153600 4.3923
2.6652 17.2264 154624 4.4244
2.6652 17.3405 155648 4.4576
2.6652 17.4545 156672 4.5156
2.6652 17.5686 157696 4.4249
2.6652 17.6827 158720 4.3867
2.6652 17.7968 159744 4.4360
2.6652 17.9109 160768 4.4036
2.6426 18.0250 161792 4.3103
2.6426 18.1390 162816 4.4384
2.6426 18.2531 163840 4.4340
2.6426 18.3672 164864 4.4168
2.6426 18.4813 165888 4.3282
2.6426 18.5954 166912 4.3200
2.6426 18.7094 167936 4.2999
2.6426 18.8235 168960 4.4347
2.6426 18.9376 169984 4.4230
2.6219 19.0517 171008 4.4185
2.6219 19.1658 172032 4.3904
2.6219 19.2799 173056 4.4376
2.6219 19.3939 174080 4.3366
2.6219 19.5080 175104 4.4409
2.6219 19.6221 176128 4.3827
2.6219 19.7362 177152 4.4327
2.6219 19.8503 178176 4.4141
2.6219 19.9643 179200 4.4321
2.6027 20.0784 180224 4.2911
2.6027 20.1925 181248 4.3532
2.6027 20.3066 182272 4.3809
2.6027 20.4207 183296 4.3316
2.6027 20.5348 184320 4.4209
2.6027 20.6488 185344 4.4665
2.6027 20.7629 186368 4.4491
2.6027 20.8770 187392 4.5202
2.6027 20.9911 188416 4.3736
2.5844 21.1052 189440 4.3502
2.5844 21.2193 190464 4.4119
2.5844 21.3333 191488 4.5101
2.5844 21.4474 192512 4.4317
2.5844 21.5615 193536 4.4820
2.5844 21.6756 194560 4.3390
2.5844 21.7897 195584 4.5056
2.5844 21.9037 196608 4.3455
2.567 22.0178 197632 4.4092
2.567 22.1319 198656 4.4035
2.567 22.2460 199680 4.3419
2.567 22.3601 200704 4.3855
2.567 22.4742 201728 4.4563
2.567 22.5882 202752 4.3289
2.567 22.7023 203776 4.3813
2.567 22.8164 204800 4.4430
2.567 22.9305 205824 4.4219
2.5508 23.0446 206848 4.3792
2.5508 23.1586 207872 4.3852
2.5508 23.2727 208896 4.3416
2.5508 23.3868 209920 4.4151
2.5508 23.5009 210944 4.4419
2.5508 23.6150 211968 4.3499
2.5508 23.7291 212992 4.3682
2.5508 23.8431 214016 4.4015
2.5508 23.9572 215040 4.4304
2.5357 24.0713 216064 4.3552
2.5357 24.1854 217088 4.4245
2.5357 24.2995 218112 4.3834
2.5357 24.4135 219136 4.4137
2.5357 24.5276 220160 4.3576
2.5357 24.6417 221184 4.4199
2.5357 24.7558 222208 4.3972
2.5357 24.8699 223232 4.3985
2.5357 24.9840 224256 4.4293
2.5209 25.0980 225280 4.4578
2.5209 25.2121 226304 4.4607
2.5209 25.3262 227328 4.4757
2.5209 25.4403 228352 4.4839
2.5209 25.5544 229376 4.4599
2.5209 25.6684 230400 4.4425
2.5209 25.7825 231424 4.4190
2.5209 25.8966 232448 4.4437
2.5076 26.0107 233472 4.4285
2.5076 26.1248 234496 4.4859
2.5076 26.2389 235520 4.4197
2.5076 26.3529 236544 4.4417
2.5076 26.4670 237568 4.3522
2.5076 26.5811 238592 4.3813
2.5076 26.6952 239616 4.4286
2.5076 26.8093 240640 4.4190
2.5076 26.9234 241664 4.4582
2.4948 27.0374 242688 4.4244
2.4948 27.1515 243712 4.4591
2.4948 27.2656 244736 4.3931
2.4948 27.3797 245760 4.3863
2.4948 27.4938 246784 4.4838
2.4948 27.6078 247808 4.4084
2.4948 27.7219 248832 4.4773
2.4948 27.8360 249856 4.5461
2.4948 27.9501 250880 4.4207
2.4821 28.0642 251904 4.4936
2.4821 28.1783 252928 4.4527
2.4821 28.2923 253952 4.5058
2.4821 28.4064 254976 4.4861
2.4821 28.5205 256000 4.4809
2.4821 28.6346 257024 4.4766
2.4821 28.7487 258048 4.4536
2.4821 28.8627 259072 4.4361
2.4821 28.9768 260096 4.4896
2.4703 29.0909 261120 4.4662
2.4703 29.2050 262144 4.4739
2.4703 29.3191 263168 4.4503
2.4703 29.4332 264192 4.4752
2.4703 29.5472 265216 4.4353
2.4703 29.6613 266240 4.5091
2.4703 29.7754 267264 4.5209
2.4703 29.8895 268288 4.4949
2.4586 30.0036 269312 4.4982
2.4586 30.1176 270336 4.4525
2.4586 30.2317 271360 4.4358
2.4586 30.3458 272384 4.4653
2.4586 30.4599 273408 4.4581
2.4586 30.5740 274432 4.4629
2.4586 30.6881 275456 4.4738
2.4586 30.8021 276480 4.4619
2.4586 30.9162 277504 4.4367
2.4484 31.0303 278528 4.4268
2.4484 31.1444 279552 4.4684
2.4484 31.2585 280576 4.5192
2.4484 31.3725 281600 4.4801
2.4484 31.4866 282624 4.4949
2.4484 31.6007 283648 4.5451
2.4484 31.7148 284672 4.4388
2.4484 31.8289 285696 4.4732
2.4484 31.9430 286720 4.5024
2.4377 32.0570 287744 4.4771
2.4377 32.1711 288768 4.4566
2.4377 32.2852 289792 4.5154
2.4377 32.3993 290816 4.4805
2.4377 32.5134 291840 4.4655
2.4377 32.6275 292864 4.4778
2.4377 32.7415 293888 4.4634
2.4377 32.8556 294912 4.4257
2.4377 32.9697 295936 4.4286
2.4278 33.0838 296960 4.4249
2.4278 33.1979 297984 4.4411
2.4278 33.3119 299008 4.5060
2.4278 33.4260 300032 4.5065
2.4278 33.5401 301056 4.5621
2.4278 33.6542 302080 4.5226
2.4278 33.7683 303104 4.5252
2.4278 33.8824 304128 4.5045
2.4278 33.9964 305152 4.5031
2.4191 34.1105 306176 4.5350
2.4191 34.2246 307200 4.5197
2.4191 34.3387 308224 4.4793
2.4191 34.4528 309248 4.4792
2.4191 34.5668 310272 4.4857
2.4191 34.6809 311296 4.5201
2.4191 34.7950 312320 4.5428
2.4191 34.9091 313344 4.5224
2.4105 35.0232 314368 4.5737
2.4105 35.1373 315392 4.5060
2.4105 35.2513 316416 4.5159
2.4105 35.3654 317440 4.5259
2.4105 35.4795 318464 4.5303
2.4105 35.5936 319488 4.5362
2.4105 35.7077 320512 4.4980
2.4105 35.8217 321536 4.5364
2.4105 35.9358 322560 4.5109
2.4027 36.0499 323584 4.5286
2.4027 36.1640 324608 4.5492
2.4027 36.2781 325632 4.5502
2.4027 36.3922 326656 4.5376
2.4027 36.5062 327680 4.5548
2.4027 36.6203 328704 4.5585
2.4027 36.7344 329728 4.5391
2.4027 36.8485 330752 4.5169
2.4027 36.9626 331776 4.5645
2.3949 37.0766 332800 4.5315
2.3949 37.1907 333824 4.5602
2.3949 37.3048 334848 4.5836
2.3949 37.4189 335872 4.5543
2.3949 37.5330 336896 4.5701
2.3949 37.6471 337920 4.5637
2.3949 37.7611 338944 4.5593
2.3949 37.8752 339968 4.5251
2.3949 37.9893 340992 4.5588
2.3886 38.1034 342016 4.5381
2.3886 38.2175 343040 4.5579
2.3886 38.3316 344064 4.5815
2.3886 38.4456 345088 4.5365
2.3886 38.5597 346112 4.5599
2.3886 38.6738 347136 4.5676
2.3886 38.7879 348160 4.5568
2.3886 38.9020 349184 4.5606
2.3827 39.0160 350208 4.5582
2.3827 39.1301 351232 4.5555
2.3827 39.2442 352256 4.5742
2.3827 39.3583 353280 4.5567
2.3827 39.4724 354304 4.5556
2.3827 39.5865 355328 4.5705
2.3827 39.7005 356352 4.5531
2.3827 39.8146 357376 4.5598
2.3827 39.9287 358400 4.5574
3.9984 40.0428 359424 5.7228
3.9984 40.1569 360448 5.6349
3.9984 40.2709 361472 5.6372
3.9984 40.3850 362496 5.5746
3.9984 40.4991 363520 5.5795
3.9984 40.6132 364544 5.5344
3.9984 40.7273 365568 5.5140
3.9984 40.8414 366592 5.4978
3.9984 40.9554 367616 5.4630
3.6244 41.0695 368640 5.4623
3.6244 41.1836 369664 5.4943
3.6244 41.2977 370688 5.4605
3.6244 41.4118 371712 5.5054
3.6244 41.5258 372736 5.4709
3.6244 41.6399 373760 5.5010
3.6244 41.7540 374784 5.5261
3.6244 41.8681 375808 5.5546
3.6244 41.9822 376832 5.5594
3.416 42.0963 377856 5.5247
3.416 42.2103 378880 5.5814
3.416 42.3244 379904 5.6016
3.416 42.4385 380928 5.5535
3.416 42.5526 381952 5.5606
3.416 42.6667 382976 5.5824
3.416 42.7807 384000 5.6214
3.416 42.8948 385024 5.6168
3.2543 43.0089 386048 5.6560
3.2543 43.1230 387072 5.6215
3.2543 43.2371 388096 5.7091
3.2543 43.3512 389120 5.7246
3.2543 43.4652 390144 5.6848
3.2543 43.5793 391168 5.7467
3.2543 43.6934 392192 5.7055
3.2543 43.8075 393216 5.7323
3.2543 43.9216 394240 5.7253
3.1132 44.0357 395264 5.7830
3.1132 44.1497 396288 5.7302
3.1132 44.2638 397312 5.7815
3.1132 44.3779 398336 5.7778
3.1132 44.4920 399360 5.8049
3.1132 44.6061 400384 5.7594
3.1132 44.7201 401408 5.7803
3.1132 44.8342 402432 5.8086
3.1132 44.9483 403456 5.8097
2.9936 45.0624 404480 5.8311
2.9936 45.1765 405504 5.8601
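
A table of this length is easier to read with a little tooling. The sketch below parses rows in the same "Training Loss, Epoch, Step, Validation Loss" layout, then reports the evaluation with the lowest validation loss and the largest increase between consecutive evaluations. The names `EvalRow`, `parse_row`, and `ROWS` are hypothetical, and `ROWS` holds only a small excerpt copied from the table above; paste in the full table to analyze the whole run.

```python
from typing import NamedTuple, Optional


class EvalRow(NamedTuple):
    train_loss: Optional[float]  # None where the table says "No log"
    epoch: float
    step: int
    val_loss: float


def parse_row(line: str) -> EvalRow:
    # Split from the right so the two-word "No log" stays in one field.
    train, epoch, step, val = line.rsplit(maxsplit=3)
    train_loss = None if train == "No log" else float(train)
    return EvalRow(train_loss, float(epoch), int(step), float(val))


# Excerpt copied verbatim from the results table on this card.
ROWS = [
    "No log 0.1141 1024 6.0292",
    "5.3905 1.0267 9216 4.7245",
    "4.3288 2.2816 20480 4.2336",
    "2.3827 39.9287 358400 4.5574",
    "3.9984 40.0428 359424 5.7228",
    "2.9936 45.1765 405504 5.8601",
]

rows = [parse_row(r) for r in ROWS]

# Evaluation with the lowest validation loss.
best = min(rows, key=lambda r: r.val_loss)

# Largest increase in validation loss between consecutive listed rows.
jumps = [(b.val_loss - a.val_loss, b) for a, b in zip(rows, rows[1:])]
largest_jump, jump_row = max(jumps, key=lambda j: j[0])

if __name__ == "__main__":
    print(f"best: step {best.step}, epoch {best.epoch}, val loss {best.val_loss}")
    print(f"largest jump: +{largest_jump:.4f} at step {jump_row.step}")
```

In this excerpt the lowest validation loss is 4.2336 at step 20480, which also appears to be the lowest value in the full table, while the reported final loss is 5.8601. The rows at steps 358400 and 359424 are adjacent in the full table, so the jump from 4.5574 to 5.7228 happens within a single evaluation interval; the card does not explain it.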

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.3.0
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Safetensors

  • Model size: 52.8M params
  • Tensor type: F32