Extended pre-training (using plain text) of such an instruct model?
#13
by naman-trilogy · opened
Is it recommended to continue pre-training such an instruct model on plain text?
I've tried this approach with Llama-7B, but the instruction tuning that followed (using public datasets) didn't turn out as good as these models. So now I'm thinking of going the other way around: continuing pre-training of the instruct model itself, to utilise its strong instruction-following capabilities on my local enterprise data.
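For what it's worth, whichever order you choose, the plain-text side usually boils down to packing your documents into fixed-length token blocks for causal-LM training. Here's a minimal sketch of that data-prep step — `pack_blocks` is a hypothetical helper, and the whitespace split is just a stand-in for the model's real tokenizer:

```python
# Minimal sketch (hypothetical): pack plain-text documents into
# fixed-length blocks for continued causal-LM pre-training.
# NOTE: `doc.split()` stands in for the model's tokenizer; a real run
# would use something like tokenizer(doc)["input_ids"] instead.

def pack_blocks(docs, block_size, eos_token="</s>"):
    """Concatenate docs (separated by an EOS marker) and cut into equal blocks."""
    stream = []
    for doc in docs:
        stream.extend(doc.split())   # stand-in for real tokenization
        stream.append(eos_token)     # mark the document boundary
    # Drop the trailing remainder so every block is exactly block_size long.
    n_full = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_full)]

blocks = pack_blocks(["internal wiki page one", "another enterprise doc"], block_size=4)
# Two full blocks of 4 "tokens"; the leftover tail is discarded.
```

One thing to watch with an instruct model: plain-text blocks like these don't use the model's chat template, so heavy continued pre-training can erode the instruction-following behaviour you're trying to preserve.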