OpenELM

This model is a fine-tuned version of apple/OpenELM-1_1B, trained as a PEFT adapter on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8217
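
For intuition, the evaluation loss can be converted to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which this card does not state explicitly; the function name is illustrative.

```python
import math

EVAL_LOSS = 0.8217  # final evaluation loss reported above


def perplexity(mean_nll_nats: float) -> float:
    """Perplexity is the exponential of the mean negative log-likelihood."""
    return math.exp(mean_nll_nats)


print(f"perplexity ~ {perplexity(EVAL_LOSS):.2f}")
```

Under that assumption the loss corresponds to a perplexity of about 2.27.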

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 3407
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 5
  • num_epochs: 1
  • mixed_precision_training: Native AMP
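
The batch-size and scheduler entries above are related by simple arithmetic, sketched here in plain Python. Two values are assumptions rather than facts from this card: a single training device, and a total of about 2434 optimizer steps (estimated from the training log, not reported). The linear schedule follows the usual warmup-then-linear-decay shape; the helper names are illustrative.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 16
WARMUP_STEPS = 5

# Assumptions (not stated in this card).
NUM_DEVICES = 1
TOTAL_STEPS = 2434


def effective_batch_size(per_device: int, accum: int, devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device * accum * devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for s in (0, 5, 1000, 2434):
        print(s, linear_lr(s))
```

With 8 examples per device and 16 accumulation steps, each optimizer step sees 128 examples, which matches the reported total_train_batch_size.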

Training results

Training Loss Epoch Step Validation Loss
1.4857 0.0041 10 1.3911
1.3665 0.0082 20 1.2476
1.2776 0.0123 30 1.1732
1.1933 0.0164 40 1.1347
1.1747 0.0205 50 1.1082
1.1433 0.0246 60 1.0864
1.1225 0.0288 70 1.0698
1.0967 0.0329 80 1.0541
1.075 0.0370 90 1.0411
1.0551 0.0411 100 1.0316
1.0587 0.0452 110 1.0231
1.0432 0.0493 120 1.0160
1.0512 0.0534 130 1.0095
1.0527 0.0575 140 1.0042
1.032 0.0616 150 0.9989
1.0277 0.0657 160 0.9936
1.0316 0.0698 170 0.9890
1.0225 0.0739 180 0.9848
1.007 0.0780 190 0.9804
0.9918 0.0822 200 0.9769
1.0152 0.0863 210 0.9734
0.9872 0.0904 220 0.9703
0.9972 0.0945 230 0.9670
1.0098 0.0986 240 0.9639
0.9869 0.1027 250 0.9607
0.9829 0.1068 260 0.9581
0.9983 0.1109 270 0.9556
0.9973 0.1150 280 0.9527
0.9848 0.1191 290 0.9505
0.9734 0.1232 300 0.9478
0.9677 0.1273 310 0.9451
0.9638 0.1314 320 0.9434
0.9654 0.1356 330 0.9411
0.9653 0.1397 340 0.9389
0.976 0.1438 350 0.9370
0.9627 0.1479 360 0.9355
0.9533 0.1520 370 0.9331
0.9441 0.1561 380 0.9309
0.958 0.1602 390 0.9294
0.9467 0.1643 400 0.9273
0.9412 0.1684 410 0.9254
0.9632 0.1725 420 0.9237
0.9248 0.1766 430 0.9218
0.9384 0.1807 440 0.9204
0.9407 0.1848 450 0.9187
0.9439 0.1890 460 0.9170
0.9353 0.1931 470 0.9154
0.9346 0.1972 480 0.9139
0.9373 0.2013 490 0.9121
0.936 0.2054 500 0.9107
0.9375 0.2095 510 0.9096
0.9456 0.2136 520 0.9076
0.9354 0.2177 530 0.9065
0.9173 0.2218 540 0.9052
0.921 0.2259 550 0.9042
0.9233 0.2300 560 0.9025
0.9338 0.2341 570 0.9012
0.918 0.2382 580 0.8996
0.9221 0.2424 590 0.8985
0.903 0.2465 600 0.8973
0.9094 0.2506 610 0.8965
0.9077 0.2547 620 0.8953
0.9076 0.2588 630 0.8944
0.9304 0.2629 640 0.8931
0.9118 0.2670 650 0.8917
0.9131 0.2711 660 0.8910
0.9213 0.2752 670 0.8901
0.901 0.2793 680 0.8891
0.9089 0.2834 690 0.8882
0.9152 0.2875 700 0.8871
0.9138 0.2916 710 0.8863
0.8988 0.2958 720 0.8849
0.8945 0.2999 730 0.8843
0.9104 0.3040 740 0.8836
0.919 0.3081 750 0.8826
0.9049 0.3122 760 0.8815
0.8834 0.3163 770 0.8806
0.9053 0.3204 780 0.8795
0.9039 0.3245 790 0.8789
0.9018 0.3286 800 0.8781
0.8847 0.3327 810 0.8775
0.8884 0.3368 820 0.8760
0.8867 0.3409 830 0.8756
0.8782 0.3450 840 0.8747
0.8765 0.3492 850 0.8737
0.8862 0.3533 860 0.8733
0.889 0.3574 870 0.8722
0.8997 0.3615 880 0.8716
0.8706 0.3656 890 0.8708
0.8982 0.3697 900 0.8701
0.8792 0.3738 910 0.8693
0.8869 0.3779 920 0.8686
0.8704 0.3820 930 0.8678
0.8902 0.3861 940 0.8676
0.8827 0.3902 950 0.8667
0.8832 0.3943 960 0.8662
0.883 0.3984 970 0.8650
0.8803 0.4026 980 0.8642
0.8605 0.4067 990 0.8634
0.8838 0.4108 1000 0.8627
0.8878 0.4149 1010 0.8623
0.8835 0.4190 1020 0.8614
0.8597 0.4231 1030 0.8609
0.8648 0.4272 1040 0.8603
0.8847 0.4313 1050 0.8598
0.8921 0.4354 1060 0.8592
0.8718 0.4395 1070 0.8590
0.8829 0.4436 1080 0.8583
0.8715 0.4477 1090 0.8576
0.8736 0.4518 1100 0.8570
0.8611 0.4560 1110 0.8563
0.872 0.4601 1120 0.8558
0.8756 0.4642 1130 0.8554
0.8793 0.4683 1140 0.8548
0.8872 0.4724 1150 0.8545
0.8719 0.4765 1160 0.8539
0.8699 0.4806 1170 0.8536
0.8779 0.4847 1180 0.8527
0.876 0.4888 1190 0.8526
0.8777 0.4929 1200 0.8519
0.8552 0.4970 1210 0.8514
0.8717 0.5011 1220 0.8508
0.879 0.5053 1230 0.8502
0.8606 0.5094 1240 0.8499
0.865 0.5135 1250 0.8492
0.8723 0.5176 1260 0.8489
0.8685 0.5217 1270 0.8485
0.8521 0.5258 1280 0.8480
0.8666 0.5299 1290 0.8475
0.8621 0.5340 1300 0.8473
0.8509 0.5381 1310 0.8469
0.8604 0.5422 1320 0.8462
0.8692 0.5463 1330 0.8459
0.8684 0.5504 1340 0.8454
0.8701 0.5545 1350 0.8451
0.856 0.5587 1360 0.8445
0.8578 0.5628 1370 0.8439
0.862 0.5669 1380 0.8435
0.8563 0.5710 1390 0.8431
0.8503 0.5751 1400 0.8428
0.857 0.5792 1410 0.8425
0.8468 0.5833 1420 0.8419
0.8555 0.5874 1430 0.8415
0.8398 0.5915 1440 0.8412
0.8649 0.5956 1450 0.8407
0.8495 0.5997 1460 0.8404
0.855 0.6038 1470 0.8401
0.8531 0.6079 1480 0.8397
0.8614 0.6121 1490 0.8391
0.8481 0.6162 1500 0.8389
0.861 0.6203 1510 0.8385
0.8426 0.6244 1520 0.8384
0.8494 0.6285 1530 0.8380
0.8475 0.6326 1540 0.8375
0.8563 0.6367 1550 0.8372
0.8372 0.6408 1560 0.8369
0.8567 0.6449 1570 0.8366
0.8555 0.6490 1580 0.8365
0.8435 0.6531 1590 0.8361
0.8533 0.6572 1600 0.8356
0.8431 0.6613 1610 0.8353
0.8577 0.6655 1620 0.8352
0.854 0.6696 1630 0.8347
0.8376 0.6737 1640 0.8347
0.8403 0.6778 1650 0.8343
0.8629 0.6819 1660 0.8340
0.841 0.6860 1670 0.8337
0.8339 0.6901 1680 0.8334
0.855 0.6942 1690 0.8331
0.8391 0.6983 1700 0.8327
0.8488 0.7024 1710 0.8324
0.8458 0.7065 1720 0.8322
0.8495 0.7106 1730 0.8319
0.8543 0.7147 1740 0.8317
0.8453 0.7189 1750 0.8317
0.8378 0.7230 1760 0.8313
0.8447 0.7271 1770 0.8309
0.8505 0.7312 1780 0.8306
0.8384 0.7353 1790 0.8303
0.824 0.7394 1800 0.8302
0.8574 0.7435 1810 0.8298
0.8365 0.7476 1820 0.8296
0.853 0.7517 1830 0.8294
0.8409 0.7558 1840 0.8292
0.8417 0.7599 1850 0.8290
0.8413 0.7640 1860 0.8288
0.8294 0.7681 1870 0.8286
0.8535 0.7723 1880 0.8283
0.8352 0.7764 1890 0.8281
0.8411 0.7805 1900 0.8281
0.8498 0.7846 1910 0.8279
0.8322 0.7887 1920 0.8276
0.8504 0.7928 1930 0.8273
0.8274 0.7969 1940 0.8272
0.8378 0.8010 1950 0.8269
0.8364 0.8051 1960 0.8268
0.8395 0.8092 1970 0.8267
0.8472 0.8133 1980 0.8264
0.8577 0.8174 1990 0.8262
0.8277 0.8215 2000 0.8259
0.8371 0.8257 2010 0.8258
0.8477 0.8298 2020 0.8256
0.8282 0.8339 2030 0.8256
0.8335 0.8380 2040 0.8255
0.8323 0.8421 2050 0.8253
0.8319 0.8462 2060 0.8251
0.8126 0.8503 2070 0.8250
0.8436 0.8544 2080 0.8249
0.8248 0.8585 2090 0.8248
0.8261 0.8626 2100 0.8245
0.8234 0.8667 2110 0.8244
0.8592 0.8708 2120 0.8243
0.8275 0.8749 2130 0.8242
0.8426 0.8791 2140 0.8240
0.8433 0.8832 2150 0.8240
0.8281 0.8873 2160 0.8239
0.8381 0.8914 2170 0.8237
0.8382 0.8955 2180 0.8235
0.8164 0.8996 2190 0.8234
0.8343 0.9037 2200 0.8233
0.8367 0.9078 2210 0.8231
0.837 0.9119 2220 0.8230
0.8245 0.9160 2230 0.8229
0.8489 0.9201 2240 0.8228
0.8391 0.9242 2250 0.8227
0.8341 0.9283 2260 0.8227
0.8442 0.9325 2270 0.8226
0.8302 0.9366 2280 0.8225
0.832 0.9407 2290 0.8224
0.833 0.9448 2300 0.8223
0.8313 0.9489 2310 0.8223
0.8444 0.9530 2320 0.8222
0.8405 0.9571 2330 0.8221
0.8433 0.9612 2340 0.8221
0.8348 0.9653 2350 0.8220
0.8355 0.9694 2360 0.8219
0.8361 0.9735 2370 0.8219
0.8254 0.9776 2380 0.8219
0.8371 0.9817 2390 0.8218
0.8304 0.9859 2400 0.8218
0.8169 0.9900 2410 0.8218
0.8219 0.9941 2420 0.8217
0.833 0.9982 2430 0.8217
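
The Epoch and Step columns allow a rough estimate of the run's size. The sketch below assumes the Epoch value is simply the fraction of one pass over the training set completed at that step, and that every optimizer step consumed the full total_train_batch_size; the dataset size is not stated in this card, so the result is only an approximation.

```python
# Last logged row of the table above.
LAST_STEP = 2430
LAST_EPOCH = 0.9982
TOTAL_TRAIN_BATCH_SIZE = 128  # from the hyperparameter list


def steps_per_epoch(step: int, epoch_fraction: float) -> float:
    """Optimizer steps needed for one full pass, extrapolated from one log row."""
    return step / epoch_fraction


def approx_train_examples(step: int, epoch_fraction: float, batch_size: int) -> int:
    """Approximate number of training examples in one epoch."""
    return round(steps_per_epoch(step, epoch_fraction) * batch_size)


if __name__ == "__main__":
    print(steps_per_epoch(LAST_STEP, LAST_EPOCH))
    print(approx_train_examples(LAST_STEP, LAST_EPOCH, TOTAL_TRAIN_BATCH_SIZE))
```

This gives roughly 2,434 optimizer steps per epoch, or on the order of 310,000 training examples.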

Framework versions

  • PEFT 0.10.0
  • Transformers 4.41.0.dev0
  • Pytorch 2.3.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1