Hi, how was CohereForAI/aya-expanse-32b run for inference on a T4 with 16 GB of VRAM? Did you deploy a quantized version? If yes, which quant?