pythia-70m-deduped-finetuned-github_cybersecurity_READMEs

This model is a fine-tuned version of EleutherAI/pythia-70m-deduped on an unknown dataset. At the final checkpoint (step 1400), it achieves the following results on the evaluation set:

  • Loss: 7.1003
  • Accuracy: 0.0669
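
For a causal language model, the evaluation loss is usually the mean per-token cross-entropy in nats, in which case perplexity is its exponential. The card does not state this convention, so the sketch below treats it as an assumption. It compares the final loss with the lowest validation loss listed in the training results (3.8716 at step 652). The function and constant names are illustrative.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Values copied from this card; the nats-per-token interpretation is an assumption.
FINAL_EVAL_LOSS = 7.1003  # final checkpoint, step 1400
BEST_EVAL_LOSS = 3.8716   # lowest validation loss in the table, step 652

final_ppl = perplexity(FINAL_EVAL_LOSS)
best_ppl = perplexity(BEST_EVAL_LOSS)

print(f"final perplexity: {final_ppl:.1f}")
print(f"best perplexity:  {best_ppl:.1f}")
print(f"ratio:            {final_ppl / best_ppl:.1f}x")
```

Under that assumption, the final checkpoint's perplexity is roughly 25 times that of the step-652 checkpoint, so the reported results do not reflect the best point of the run.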

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 100
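
The relationship between these values can be sketched as follows. The effective batch size is the per-device batch size times the accumulation steps times the number of devices; a single device is assumed here because 32 × 4 = 128 matches the listed total. The schedule function is a plain-Python approximation of a linear schedule with warmup, not the library's own implementation, and `TOTAL_STEPS = 1400` is an assumption taken from the last step logged in the training results.

```python
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1        # assumption: not stated in the card
BASE_LR = 3e-05
WARMUP_STEPS = 1000
TOTAL_STEPS = 1400     # assumption: last optimizer step logged in the results


def effective_batch_size(per_device: int, accum_steps: int, devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accum_steps * devices


def linear_warmup_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear ramp from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
for step in (100, 652, 1000, 1200, 1400):
    lr = linear_warmup_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>4}: lr = {lr:.3e}")
```

If these assumptions hold, warmup covers 1000 of the 1400 optimizer steps, so the learning rate was still rising for most of the run and only reached its peak of 3e-05 at step 1000.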

Training results

Training Loss Epoch Step Validation Loss Accuracy
No log 0.97 14 33.1751 0.0595
No log 2.0 29 32.9604 0.0635
No log 2.97 43 32.7028 0.0655
No log 4.0 58 32.3567 0.0674
No log 4.97 72 27.9492 0.0686
No log 6.0 87 6.4475 0.0665
No log 6.97 101 5.7208 0.0645
No log 8.0 116 5.4807 0.0690
No log 8.97 130 5.3024 0.0670
No log 10.0 145 5.1200 0.0640
No log 10.97 159 5.0031 0.0850
No log 12.0 174 4.9063 0.0845
No log 12.97 188 4.8488 0.0849
No log 14.0 203 4.7995 0.0827
No log 14.97 217 4.7393 0.0830
No log 16.0 232 4.6867 0.0812
No log 16.97 246 4.6346 0.0809
No log 18.0 261 4.5873 0.0801
No log 18.97 275 4.5435 0.0793
No log 20.0 290 4.4955 0.0780
No log 20.97 304 4.4505 0.0770
No log 22.0 319 4.4044 0.0760
No log 22.97 333 4.3258 0.0782
No log 24.0 348 4.2926 0.0760
No log 24.97 362 4.2353 0.0769
No log 26.0 377 4.2157 0.0751
No log 26.97 391 4.1705 0.0752
No log 28.0 406 4.1310 0.0754
No log 28.97 420 4.0981 0.0752
No log 30.0 435 4.0909 0.0733
No log 30.97 449 4.0291 0.0743
No log 32.0 464 4.0761 0.0721
No log 32.97 478 3.9794 0.0727
No log 34.0 493 3.9521 0.0733
8.0484 34.97 507 3.9421 0.0733
8.0484 36.0 522 3.9310 0.0727
8.0484 36.97 536 3.9142 0.0728
8.0484 38.0 551 3.9338 0.0723
8.0484 38.97 565 3.9189 0.0716
8.0484 40.0 580 3.9186 0.0718
8.0484 40.97 594 3.9216 0.0722
8.0484 42.0 609 3.8944 0.0718
8.0484 42.97 623 3.9038 0.0705
8.0484 44.0 638 3.9371 0.0707
8.0484 44.97 652 3.8716 0.0714
8.0484 46.0 667 3.9153 0.0705
8.0484 46.97 681 3.9540 0.0703
8.0484 48.0 696 3.9973 0.0706
8.0484 48.97 710 4.0011 0.0701
8.0484 50.0 725 4.0547 0.0696
8.0484 50.97 739 4.1899 0.0693
8.0484 52.0 754 4.1240 0.0707
8.0484 52.97 768 4.2480 0.0699
8.0484 54.0 783 4.2986 0.0691
8.0484 54.97 797 4.2061 0.0695
8.0484 56.0 812 4.3689 0.0695
8.0484 56.97 826 4.4121 0.0688
8.0484 58.0 841 4.4500 0.0686
8.0484 58.97 855 4.6004 0.0686
8.0484 60.0 870 4.6357 0.0680
8.0484 60.97 884 4.8464 0.0684
8.0484 62.0 899 4.6806 0.0687
8.0484 62.97 913 4.8374 0.0682
8.0484 64.0 928 4.8653 0.0679
8.0484 64.97 942 5.0424 0.0680
8.0484 66.0 957 5.1518 0.0680
8.0484 66.97 971 5.1240 0.0683
8.0484 68.0 986 5.1661 0.0678
1.9559 68.97 1000 5.3992 0.0687
1.9559 70.0 1015 5.4876 0.0680
1.9559 70.97 1029 5.5609 0.0683
1.9559 72.0 1044 5.6707 0.0679
1.9559 72.97 1058 5.7551 0.0667
1.9559 74.0 1073 5.9036 0.0675
1.9559 74.97 1087 6.1355 0.0665
1.9559 76.0 1102 6.2995 0.0661
1.9559 76.97 1116 6.2546 0.0677
1.9559 78.0 1131 6.3169 0.0672
1.9559 78.97 1145 6.3377 0.0669
1.9559 80.0 1160 6.4969 0.0673
1.9559 80.97 1174 6.6636 0.0664
1.9559 82.0 1189 6.7550 0.0672
1.9559 82.97 1203 6.7044 0.0661
1.9559 84.0 1218 6.7713 0.0669
1.9559 84.97 1232 6.8595 0.0668
1.9559 86.0 1247 6.9219 0.0663
1.9559 86.97 1261 6.9174 0.0666
1.9559 88.0 1276 6.9158 0.0667
1.9559 88.97 1290 6.9744 0.0670
1.9559 90.0 1305 6.9375 0.0669
1.9559 90.97 1319 6.9947 0.0668
1.9559 92.0 1334 7.0421 0.0671
1.9559 92.97 1348 7.0240 0.0666
1.9559 94.0 1363 7.0480 0.0669
1.9559 94.97 1377 7.0679 0.0668
1.9559 96.0 1392 7.1026 0.0670
1.9559 96.55 1400 7.1003 0.0669
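
Validation loss falls until about epoch 45 and then climbs steadily while the logged training loss drops from 8.0484 to 1.9559, a pattern usually associated with overfitting. As a minimal sketch of how the table can be checked programmatically, the snippet below parses a handful of rows copied from it and picks the one with the lowest validation loss. The `EvalRow` type and `parse_row` helper are illustrative names, and only a sample of rows is included.

```python
from typing import NamedTuple, Optional


class EvalRow(NamedTuple):
    train_loss: Optional[float]  # None where the table shows "No log"
    epoch: float
    step: int
    val_loss: float
    accuracy: float


def parse_row(line: str) -> EvalRow:
    """Split from the right so that 'No log' stays together as one field."""
    train, epoch, step, val, acc = line.rsplit(None, 4)
    train_loss = None if train == "No log" else float(train)
    return EvalRow(train_loss, float(epoch), int(step), float(val), float(acc))


# A sample of rows copied verbatim from the table above.
SAMPLE_ROWS = """
No log 0.97 14 33.1751 0.0595
No log 6.0 87 6.4475 0.0665
8.0484 34.97 507 3.9421 0.0733
8.0484 44.97 652 3.8716 0.0714
1.9559 68.97 1000 5.3992 0.0687
1.9559 96.55 1400 7.1003 0.0669
"""

rows = [parse_row(line) for line in SAMPLE_ROWS.strip().splitlines()]
best = min(rows, key=lambda r: r.val_loss)
last = rows[-1]

print(f"best: step {best.step}, epoch {best.epoch}, val loss {best.val_loss}")
print(f"last: step {last.step}, epoch {last.epoch}, val loss {last.val_loss}")
```

Selecting a checkpoint by validation loss rather than taking the last one would point to step 652 here, although the card does not say whether intermediate checkpoints were saved.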

Framework versions

  • Transformers 4.40.0.dev0
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
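
When trying to reproduce the run, it can help to compare the local environment against these versions. The sketch below uses the standard library's `importlib.metadata`. The mapping from the names above to distribution names (for example, PyTorch is looked up as `torch`) is an assumption, and `compare_versions` is an illustrative helper. A mismatch is only reported, not treated as an error.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed distribution name.
CARD_VERSIONS = {
    "transformers": "4.40.0.dev0",
    "torch": "2.2.1+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def compare_versions(expected: dict) -> dict:
    """Return {name: (expected, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in compare_versions(CARD_VERSIONS).items():
    status = "ok" if ok else "differs"
    print(f"{name}: card={wanted} installed={installed} ({status})")
```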

Model size: 70.4M params (Safetensors, tensor type F32)

Repository: sickcell/pythia-70m-deduped-finetuned-github_cybersecurity_READMEs (fine-tuned from EleutherAI/pythia-70m-deduped)