Model Card

The Pythia-160M model is designed for research on language model behavior and interpretability, and was trained on the Pile dataset. Here we evaluate it on HellaSwag; the model can be fine-tuned for further experimentation.

HellaSwag Eval

Evaluated with the EleutherAI LM Evaluation Harness at the step-100,000 revision.

| Tasks     | Version | Filter | n-shot | Metric     | Value  | Stderr   |
|-----------|---------|--------|--------|------------|--------|----------|
| hellaswag | 1       | none   | 0      | acc ↑      | 0.2872 | ± 0.0045 |
|           |         | none   | 0      | acc_norm ↑ | 0.3082 | ± 0.0046 |
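
A minimal sketch of how numbers like these can be reproduced with the lm-evaluation-harness Python API (assuming harness v0.4+; the `step100000` revision string follows the Pythia checkpoint naming scheme and whether to evaluate the base Pythia checkpoint or this fine-tune is an assumption here):

```python
# Sketch: run HellaSwag via lm-evaluation-harness (v0.4+ API assumed).
# The pretrained model and revision below are assumptions based on the
# card's description; swap in the checkpoint you actually want to test.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m,revision=step100000",
    tasks=["hellaswag"],
    num_fewshot=0,
)
print(results["results"]["hellaswag"])  # acc, acc_norm, and stderr values
```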

How to Use

This model was produced purely as an exercise and is not intended for deployment or human-facing interactions.
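
For local experimentation only, the model can be loaded with the standard `transformers` API; a minimal sketch is below (the `illeto/finetunning-week1` repo id is taken from this card, and the prompt is purely illustrative):

```python
# Sketch: load the model for local experimentation (research use only).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "illeto/finetunning-week1"  # repo id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation from an illustrative prompt.
inputs = tokenizer("The Pile is a dataset", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```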
