GPU requirements
Perhaps I've missed it, but what are the GPU requirements for this model? I have an A100, and running it through vLLM reports that the model took 39 GB. Is that right? The A100 has 40 GB of memory, so there's not much left for anything else.
That does not sound right for the full-precision model: you would need roughly 300 GB of VRAM to run it at full precision. With 40 GB, though, I think it's impossible to run even with quantization.
At least using vLLM, it's taking 32 GB per GPU for "model loading", and then PyTorch appears to consume the rest and you get a CUDA OOM. I tried setting --dtype half to reduce memory, but it didn't seem to make much difference, so I assume I configured it wrong; still reading.
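One thing worth noting: `--dtype half` is not quantization. It only asks vLLM to load the weights in fp16, and since Mixtral's published weights are already bfloat16 it saves essentially nothing. Real savings need a checkpoint that was actually quantized, served with the `--quantization` flag. A sketch, assuming a 4× A100 node; the AWQ repo name below is illustrative, not a repository I have verified:

```shell
# fp16 load: no memory saving over the stock bf16 weights
python -m vllm.entrypoints.openai.api_server \
    --model mistralai/Mixtral-8x22B-Instruct-v0.1 \
    --tensor-parallel-size 4 \
    --dtype half

# AWQ 4-bit: requires a checkpoint pre-quantized with AWQ (hypothetical repo name)
python -m vllm.entrypoints.openai.api_server \
    --model someuser/Mixtral-8x22B-Instruct-v0.1-AWQ \
    --tensor-parallel-size 4 \
    --quantization awq
```

Even with 4-bit quantization the weights alone are far larger than a single 40 GB card, so tensor parallelism across several GPUs is still needed.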
What are the specifications of the model? Please list them.
I am not sure if this forum is the ideal place for my questions.
What are the ideal requirements to download this model and train it for specific tasks in the compliance industry? If this is not the right model, are there other models that could serve as a starting point? What are the minimum CPU/GPU requirements? I was not even able to download this to my local VM: "safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge"
You can find the minimum requirements for full-precision inference here: https://docs.mistral.ai/getting-started/open_weight_models/#sizes
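The ~300 GB figure mentioned above can be sanity-checked with a back-of-envelope calculation: Mixtral 8x22B has roughly 141B total parameters, and "full precision" here means 16-bit weights. A minimal sketch (the parameter count is approximate):

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Rough memory needed just to hold the weights, in GB (10^9 bytes).

    Ignores KV cache, activations, and framework overhead, which all add more.
    """
    return n_params * bytes_per_param / 1e9

# Mixtral 8x22B has ~141B total parameters; all eight experts must be
# resident in memory even though only two are active per token.
total_params = 141e9

print(weight_memory_gb(total_params, 2))    # fp16/bf16 -> 282.0 GB
print(weight_memory_gb(total_params, 1))    # int8      -> 141.0 GB
print(weight_memory_gb(total_params, 0.5))  # ~4-bit    -> 70.5 GB
```

So even an aggressive 4-bit quantization leaves the weights alone well above a single A100 40GB, which is why the answers above say it is impossible on one card.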
@sudhendra12 for training you will need even more hardware! And Mixtral 8x22B might be way too big as a starting point; it could be better to start with a smaller model like Mistral 7B.
@sudhendra12, pandora-s is correct that this is likely too large, especially if you are just exploring viability. Also, depending on your specific tasks, it's difficult to say whether you need an LLM at all.
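On the SafetensorError: "HeaderTooLarge" usually indicates a truncated or corrupted download rather than a hardware limit. A minimal sketch of re-fetching with the `huggingface_hub` client (assumes the library is installed and you have hundreds of GB of free disk for the shards):

```python
from huggingface_hub import snapshot_download

# Re-download the repo; files already complete in the local cache are skipped,
# so only the damaged or missing shards are fetched again.
snapshot_download(repo_id="mistralai/Mixtral-8x22B-v0.1")
```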