How many epochs did you train on code_bagel?
Just curious, did you train over the whole dataset? How many epochs? And is this a full fine-tune or LoRA?
Whole dataset. Took a few days to train it.
With UNA, only 1 epoch is needed. More epochs don't give better results.
Gotcha, I'm not super familiar with UNA. I'm currently training with QLoRA, and I've found that 5-6 epochs are needed to get the best results. Most people were doing only 3, and after some trial and error, I learned that the low epoch count was producing much lower-quality models.
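For anyone unfamiliar with the QLoRA setup being described, here is a minimal sketch using Hugging Face `transformers`, `peft`, and `bitsandbytes`. The base model name, target modules, and hyperparameters are illustrative assumptions, not the poster's actual config:

```python
# Minimal QLoRA sketch: 4-bit quantized base model + trainable LoRA adapters.
# Assumes transformers, peft, and bitsandbytes are installed.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # 4-bit base weights (the "Q" in QLoRA)
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",             # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                    # illustrative adapter rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()           # only the adapters are trainable

# From here you would run a standard training loop (e.g. trl's SFTTrainer),
# setting num_train_epochs to 5-6 as described above.
```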
Oh, one more thing: since my dataset is based on coding, can we get a HumanEval benchmark? BigCode's evaluation harness repository is the easiest way to set it up.
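For reference, here is a sketch of what scoring a model on HumanEval looks like. It uses OpenAI's `human-eval` package rather than the BigCode harness mentioned above (the harness is CLI-driven; this package exposes a small Python API). The model name is a placeholder:

```python
# Sketch: generate HumanEval completions and dump them for scoring.
# Assumes the human-eval and transformers packages are installed.
from human_eval.data import read_problems, write_jsonl
from transformers import pipeline

generator = pipeline("text-generation", model="your-model-id")  # placeholder

problems = read_problems()  # dict: task_id -> {"prompt": ..., "test": ...}
samples = []
for task_id, problem in problems.items():
    out = generator(problem["prompt"], max_new_tokens=256, do_sample=False)
    # Keep only the completion, stripping the echoed prompt.
    completion = out[0]["generated_text"][len(problem["prompt"]):]
    samples.append({"task_id": task_id, "completion": completion})

write_jsonl("samples.jsonl", samples)
# Then score pass@k from the shell:
#   evaluate_functional_correctness samples.jsonl
```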
Better to use https://github.com/mlabonne/llm-autoeval/blob/master/README.md from @mlabonne.