Extended pre-training (using plain text) of such an instruct model?
#13
by naman-trilogy · opened
Is it recommended to continue pre-training such an instruct model on plain text?
I've tried this approach with Llama-7B, but the instruction tuning that followed (using public datasets) didn't turn out as good as these models. So now I'm thinking of going the other way around: continuing pre-training of the instruct model itself, to utilise its strong instruction-following capabilities on my local enterprise data.
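For what it's worth, whichever order you choose, the plain-text side usually boils down to packing your documents into fixed-length token blocks for causal-LM training. Here's a minimal sketch of that data-prep step — `pack_blocks` is a hypothetical helper, and the whitespace split is just a stand-in for the model's real tokenizer:

```python
# Minimal sketch (hypothetical): pack plain-text documents into
# fixed-length blocks for continued causal-LM pre-training.
# NOTE: `doc.split()` stands in for the model's tokenizer; a real run
# would use something like tokenizer(doc)["input_ids"] instead.

def pack_blocks(docs, block_size, eos_token="</s>"):
    """Concatenate docs (separated by an EOS marker) and cut into equal blocks."""
    stream = []
    for doc in docs:
        stream.extend(doc.split())   # stand-in for real tokenization
        stream.append(eos_token)     # mark the document boundary
    # Drop the trailing remainder so every block is exactly block_size long.
    n_full = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_full)]

blocks = pack_blocks(["internal wiki page one", "another enterprise doc"], block_size=4)
# Two full blocks of 4 "tokens"; the leftover tail is discarded.
```

One thing to watch with an instruct model: plain-text blocks like these don't use the model's chat template, so heavy continued pre-training can erode the instruction-following behaviour you're trying to preserve.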