lomahony/pythia-2.8b-helpful-dpo

Pythia-2.8b fine-tuned with Direct Preference Optimization (DPO), using the original DPO code, on the helpful subset of the Anthropic hh-rlhf dataset for 1 epoch.
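A minimal usage sketch with the Hugging Face `transformers` library follows. The model id is the one given on this card; the `Human:` / `Assistant:` prompt layout is the dialogue convention of the hh-rlhf dataset, and applying it at inference time is an assumption rather than something this card specifies. The helper names `format_hh_prompt` and `generate_reply` are hypothetical and exist only for illustration.

```python
MODEL_ID = "lomahony/pythia-2.8b-helpful-dpo"

# Turn markers used in the hh-rlhf dialogues (assumed to suit this model).
HUMAN_TAG = "\n\nHuman: "
ASSISTANT_TAG = "\n\nAssistant:"


def format_hh_prompt(user_message: str) -> str:
    """Wrap a single user turn in the Human/Assistant dialogue layout."""
    return f"{HUMAN_TAG}{user_message.strip()}{ASSISTANT_TAG}"


def generate_reply(user_message: str, max_new_tokens: int = 64) -> str:
    """Greedy-decode a reply; downloads the weights on first use."""
    # Imported lazily so format_hh_prompt works without torch/transformers.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(format_hh_prompt(user_message), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id,  # no dedicated pad token is assumed
        )
    # Keep only the newly generated tokens, not the echoed prompt.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0, prompt_length:], skip_special_tokens=True)
```

Calling `generate_reply("How do I brew a good cup of tea?")` returns the decoded continuation as a string. Greedy decoding is used here only to keep the sketch deterministic; sampling settings are a matter of preference.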

Checkpoints are also uploaded.

Fully reproducible fine-tuning code is available on GitHub.

See Pythia-2.8b for model details (paper).

See further details of these models in the paper Attributing Mode Collapse in the Fine-Tuning of Large Language Models.

If these models are helpful, you can cite them as follows:

@inproceedings{o2024attributing,
  title={Attributing Mode Collapse in the Fine-Tuning of Large Language Models},
  author={O’Mahony, Laura and Grinsztajn, Leo and Schoelkopf, Hailey and Biderman, Stella},
  booktitle={ICLR 2024, Mathematical and Empirical Understanding of Foundation Models (ME-FoMo) workshop},
  year={2024}
}

hf (pretrained=lomahony/pythia-2.8b-helpful-dpo), gen_kwargs: (None), limit: None, num_fewshot: 0, batch_size: 16

|Tasks|Version|Filter|Metric|Value| |Stderr|
|---|---|---|---|---|---|---|
|arc_challenge|1|none|acc|0.3157|±|0.0136|
| | |none|acc_norm|0.3447|±|0.0139|
|arc_easy|1|none|acc|0.6591|±|0.0097|
| | |none|acc_norm|0.6002|±|0.0101|
|boolq|2|none|acc|0.6239|±|0.0085|
|hellaswag|1|none|acc|0.4671|±|0.0050|
| | |none|acc_norm|0.6107|±|0.0049|
|lambada_openai|1|none|perplexity|4.8811|±|0.1354|
| | |none|acc|0.6264|±|0.0067|
|openbookqa|1|none|acc|0.2820|±|0.0201|
| | |none|acc_norm|0.4040|±|0.0220|
|piqa|1|none|acc|0.7568|±|0.0100|
| | |none|acc_norm|0.7557|±|0.0100|
|sciq|1|none|acc|0.8900|±|0.0099|
| | |none|acc_norm|0.8340|±|0.0118|
|wikitext|2|none|word_perplexity|13.9186|±|N/A|
| | |none|byte_perplexity|1.6363|±|N/A|
| | |none|bits_per_byte|0.7104|±|N/A|
|winogrande|1|none|acc|0.6046|±|0.0137|
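As a reading aid for the wikitext rows: bits per byte is the base-2 logarithm of byte perplexity, so the two reported values can be checked against each other. The sketch below only restates that relationship; the function name `bits_per_byte` is a hypothetical helper, not part of the evaluation harness.

```python
import math


def bits_per_byte(byte_perplexity: float) -> float:
    """Convert byte-level perplexity to bits per byte (log base 2)."""
    return math.log2(byte_perplexity)


# Reported byte_perplexity for wikitext is 1.6363.
print(round(bits_per_byte(1.6363), 4))  # 0.7104, matching the bits_per_byte row
```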

hf (pretrained=lomahony/pythia-2.8b-helpful-dpo), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 16

|Tasks|Version|Filter|n-shot|Metric|Value| |Stderr|
|---|---|---|---|---|---|---|---|
|arc_challenge|1|none|5|acc|0.3498|±|0.0139|
| | |none|5|acc_norm|0.3823|±|0.0142|
|arc_easy|1|none|5|acc|0.6940|±|0.0095|
| | |none|5|acc_norm|0.6940|±|0.0095|
|boolq|2|none|5|acc|0.6440|±|0.0084|
|hellaswag|1|none|5|acc|0.4596|±|0.0050|
| | |none|5|acc_norm|0.6096|±|0.0049|
|lambada_openai|1|none|5|perplexity|6.9027|±|0.2030|
| | |none|5|acc|0.5614|±|0.0069|
|openbookqa|1|none|5|acc|0.2920|±|0.0204|
| | |none|5|acc_norm|0.3820|±|0.0218|
|piqa|1|none|5|acc|0.7601|±|0.0100|
| | |none|5|acc_norm|0.7563|±|0.0100|
|sciq|1|none|5|acc|0.9380|±|0.0076|
| | |none|5|acc_norm|0.9290|±|0.0081|
|wikitext|2|none|5|word_perplexity|13.9186|±|N/A|
| | |none|5|byte_perplexity|1.6363|±|N/A|
| | |none|5|bits_per_byte|0.7104|±|N/A|
|winogrande|1|none|5|acc|0.6006|±|0.0138|
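The settings in the two result headers (`hf` model type, `num_fewshot` of 0 and 5, `batch_size` of 16) can be turned into evaluation commands roughly as sketched below. This assumes a recent lm-evaluation-harness release that installs the `lm_eval` command-line entry point; the exact harness version behind the numbers above is not stated on this card, so results may differ slightly. `build_lm_eval_command` is a hypothetical helper that only assembles the argument list.

```python
import shlex

MODEL_ID = "lomahony/pythia-2.8b-helpful-dpo"

# Task names as they appear in the result tables.
TASKS = [
    "arc_challenge", "arc_easy", "boolq", "hellaswag", "lambada_openai",
    "openbookqa", "piqa", "sciq", "wikitext", "winogrande",
]


def build_lm_eval_command(num_fewshot: int, batch_size: int = 16) -> list[str]:
    """Assemble an lm_eval invocation mirroring the header settings."""
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={MODEL_ID}",
        "--tasks", ",".join(TASKS),
        "--num_fewshot", str(num_fewshot),
        "--batch_size", str(batch_size),
    ]


# Print one shell-ready command per setting (0-shot and 5-shot).
for shots in (0, 5):
    print(shlex.join(build_lm_eval_command(shots)))
```

The printed lines can be pasted into a shell, or the list can be passed to `subprocess.run` directly.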
