pythia-70m-deduped-finetuned-github_cybersecurity_READMEs

This model is a fine-tuned version of EleutherAI/pythia-70m-deduped on an unknown dataset. At the final evaluation step (step 1400, epoch 96.55), it achieves the following results on the evaluation set:

Loss: 7.1003
Accuracy: 0.0669
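
For a causal language model, an evaluation loss like the one above is usually the mean per-token cross-entropy in nats, so exponentiating it gives a perplexity. The card does not state this convention, so the sketch below treats it as an assumption; the helper name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Final evaluation loss reported on this card.
# Assumption: it is a mean per-token cross-entropy in nats.
final_eval_loss = 7.1003
final_ppl = perplexity(final_eval_loss)
print(f"eval loss {final_eval_loss} -> perplexity of about {final_ppl:.0f}")
```

Under that assumption, the final checkpoint's perplexity is roughly 1,200.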

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 100
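
The relationship between these values can be sketched in a few lines of plain Python. The sketch makes two assumptions that the card does not state outright: training ran on a single device (consistent with 32 × 4 = 128), and the schedule spans 1400 optimizer steps, taken from the last row of the training results table. The function names are illustrative, and the schedule formula is a re-implementation of a standard linear warmup followed by linear decay, not code taken from the training run.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_warmup_decay_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values from the hyperparameter list above.
BASE_LR = 3e-05
WARMUP_STEPS = 1000
# Assumption: total optimizer steps, read off the last row of the results table.
TOTAL_STEPS = 1400

total_batch = effective_batch_size(per_device=32, accumulation_steps=4)
for step in (0, 500, 1000, 1200, 1400):
    lr = linear_warmup_decay_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>4}: lr = {lr:.2e}")
```

With a 1000-step warmup in a run of about 1400 steps, the learning rate would peak at step 1000 and decay only over the last 400 steps, so most of the training happens while the rate is still rising.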

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	0.97	14	33.1751	0.0595
No log	2.0	29	32.9604	0.0635
No log	2.97	43	32.7028	0.0655
No log	4.0	58	32.3567	0.0674
No log	4.97	72	27.9492	0.0686
No log	6.0	87	6.4475	0.0665
No log	6.97	101	5.7208	0.0645
No log	8.0	116	5.4807	0.0690
No log	8.97	130	5.3024	0.0670
No log	10.0	145	5.1200	0.0640
No log	10.97	159	5.0031	0.0850
No log	12.0	174	4.9063	0.0845
No log	12.97	188	4.8488	0.0849
No log	14.0	203	4.7995	0.0827
No log	14.97	217	4.7393	0.0830
No log	16.0	232	4.6867	0.0812
No log	16.97	246	4.6346	0.0809
No log	18.0	261	4.5873	0.0801
No log	18.97	275	4.5435	0.0793
No log	20.0	290	4.4955	0.0780
No log	20.97	304	4.4505	0.0770
No log	22.0	319	4.4044	0.0760
No log	22.97	333	4.3258	0.0782
No log	24.0	348	4.2926	0.0760
No log	24.97	362	4.2353	0.0769
No log	26.0	377	4.2157	0.0751
No log	26.97	391	4.1705	0.0752
No log	28.0	406	4.1310	0.0754
No log	28.97	420	4.0981	0.0752
No log	30.0	435	4.0909	0.0733
No log	30.97	449	4.0291	0.0743
No log	32.0	464	4.0761	0.0721
No log	32.97	478	3.9794	0.0727
No log	34.0	493	3.9521	0.0733
8.0484	34.97	507	3.9421	0.0733
8.0484	36.0	522	3.9310	0.0727
8.0484	36.97	536	3.9142	0.0728
8.0484	38.0	551	3.9338	0.0723
8.0484	38.97	565	3.9189	0.0716
8.0484	40.0	580	3.9186	0.0718
8.0484	40.97	594	3.9216	0.0722
8.0484	42.0	609	3.8944	0.0718
8.0484	42.97	623	3.9038	0.0705
8.0484	44.0	638	3.9371	0.0707
8.0484	44.97	652	3.8716	0.0714
8.0484	46.0	667	3.9153	0.0705
8.0484	46.97	681	3.9540	0.0703
8.0484	48.0	696	3.9973	0.0706
8.0484	48.97	710	4.0011	0.0701
8.0484	50.0	725	4.0547	0.0696
8.0484	50.97	739	4.1899	0.0693
8.0484	52.0	754	4.1240	0.0707
8.0484	52.97	768	4.2480	0.0699
8.0484	54.0	783	4.2986	0.0691
8.0484	54.97	797	4.2061	0.0695
8.0484	56.0	812	4.3689	0.0695
8.0484	56.97	826	4.4121	0.0688
8.0484	58.0	841	4.4500	0.0686
8.0484	58.97	855	4.6004	0.0686
8.0484	60.0	870	4.6357	0.0680
8.0484	60.97	884	4.8464	0.0684
8.0484	62.0	899	4.6806	0.0687
8.0484	62.97	913	4.8374	0.0682
8.0484	64.0	928	4.8653	0.0679
8.0484	64.97	942	5.0424	0.0680
8.0484	66.0	957	5.1518	0.0680
8.0484	66.97	971	5.1240	0.0683
8.0484	68.0	986	5.1661	0.0678
1.9559	68.97	1000	5.3992	0.0687
1.9559	70.0	1015	5.4876	0.0680
1.9559	70.97	1029	5.5609	0.0683
1.9559	72.0	1044	5.6707	0.0679
1.9559	72.97	1058	5.7551	0.0667
1.9559	74.0	1073	5.9036	0.0675
1.9559	74.97	1087	6.1355	0.0665
1.9559	76.0	1102	6.2995	0.0661
1.9559	76.97	1116	6.2546	0.0677
1.9559	78.0	1131	6.3169	0.0672
1.9559	78.97	1145	6.3377	0.0669
1.9559	80.0	1160	6.4969	0.0673
1.9559	80.97	1174	6.6636	0.0664
1.9559	82.0	1189	6.7550	0.0672
1.9559	82.97	1203	6.7044	0.0661
1.9559	84.0	1218	6.7713	0.0669
1.9559	84.97	1232	6.8595	0.0668
1.9559	86.0	1247	6.9219	0.0663
1.9559	86.97	1261	6.9174	0.0666
1.9559	88.0	1276	6.9158	0.0667
1.9559	88.97	1290	6.9744	0.0670
1.9559	90.0	1305	6.9375	0.0669
1.9559	90.97	1319	6.9947	0.0668
1.9559	92.0	1334	7.0421	0.0671
1.9559	92.97	1348	7.0240	0.0666
1.9559	94.0	1363	7.0480	0.0669
1.9559	94.97	1377	7.0679	0.0668
1.9559	96.0	1392	7.1026	0.0670
1.9559	96.55	1400	7.1003	0.0669
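
In the table above, validation loss reaches its minimum of 3.8716 at step 652 (epoch 44.97) and then climbs to 7.1003 by the final step, while the logged training loss falls from 8.0484 to 1.9559. That pattern is typical of overfitting. A minimal sketch of selecting the best checkpoint from such a log follows; the rows are a hand-picked subset copied from the table, and the `EvalRow` type is illustrative.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# Subset of rows copied from the training results table (not the full log).
rows = [
    EvalRow(0.97, 14, 33.1751, 0.0595),
    EvalRow(6.0, 87, 6.4475, 0.0665),
    EvalRow(10.97, 159, 5.0031, 0.0850),
    EvalRow(34.0, 493, 3.9521, 0.0733),
    EvalRow(44.97, 652, 3.8716, 0.0714),
    EvalRow(68.97, 1000, 5.3992, 0.0687),
    EvalRow(96.55, 1400, 7.1003, 0.0669),
]

# Pick the evaluation point with the lowest validation loss.
best = min(rows, key=lambda r: r.val_loss)
final = rows[-1]

print(f"best:  step {best.step}, val_loss {best.val_loss}")
print(f"final: step {final.step}, val_loss {final.val_loss}")
print(f"gap:   {final.val_loss - best.val_loss:.4f}")
```

If the run were repeated with the Hugging Face Trainer, setting `load_best_model_at_end` together with an `EarlyStoppingCallback` is one way to keep the lower-loss checkpoint. The card does not say whether either was used here.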

Framework versions

Transformers 4.40.0.dev0
PyTorch 2.2.1+cu121
Datasets 2.18.0
Tokenizers 0.15.2

Model repository: sickcell/pythia-70m-deduped-finetuned-github_cybersecurity_READMEs