Comparison among Dolly v2 3b, 7b and 12b

#6
by abhi24 - opened

Hello!
I have only tried the Dolly v2 12b so far. I'm curious if anyone has tried all three.

  1. Is there a considerable difference in the response time?
  2. If I were to fine-tune the model, would I need fewer training samples with a smaller model?

Thanks,
Abhilash

Databricks org

I can tell you that on an A10, generation takes maybe 2-5 seconds for the 3B model, 5-15 seconds for the 7B model, and about 15-40 seconds for the 12B model in 8-bit. It really varies depending on the generation settings and how long the response ends up being. (I'd try an A100, but I can't get one at the moment!)

For real-time use, you'd need to do more work than just loading an HF pipeline: multiple GPUs, FastTokenizer, etc.
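
Since the numbers above depend so much on generation settings and response length, it may be easiest to time a few calls on your own hardware. The following is only a minimal sketch, not a proper benchmark. `time_generation`, `summarize`, and `load_dolly` are hypothetical helper names made up for illustration, and the pipeline arguments in `load_dolly` are assumed from the usage shown on the Dolly v2 model cards, so check them against the card for the model you use.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_generation(generate: Callable[[str], object], prompt: str, runs: int = 3) -> List[float]:
    """Return the wall-clock seconds taken by each of `runs` calls to generate(prompt)."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        generate(prompt)
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Collapse a list of latencies into min / median / max."""
    return {
        "min": min(latencies),
        "median": statistics.median(latencies),
        "max": max(latencies),
    }


def load_dolly(model_name: str = "databricks/dolly-v2-3b"):
    """Load a Dolly v2 text-generation pipeline.

    Assumption: these arguments mirror the model card usage. Calling this
    downloads the weights and needs transformers, torch, and accelerate.
    """
    import torch
    from transformers import pipeline

    return pipeline(
        model=model_name,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
        device_map="auto",
    )
```

For example, `summarize(time_generation(load_dolly("databricks/dolly-v2-7b"), "Explain the difference between nuclear fission and fusion.", runs=5))` would give a rough latency range for the 7B model. Use the same prompt and generation settings for each model size so the results are comparable.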

Thank you @srowen !

Can you please tell me whether I'd need fewer training instances to fine-tune the 3b model than the 12b one?

Thanks!

Databricks org

I don't think there is necessarily a strong relationship there, but I'm not an expert. I would use as much as you've got!

abhi24 changed discussion status to closed
