Inference time on 8 vCPU / 32 GB RAM, or on a T4 (16 GB VRAM) with 8 vCPU / 30 GB RAM

#18

by NeevrajKB - opened May 15

I'm planning to deploy this on a server and I'm a first-time user, so I'm asking for guidance. What is the maximum number of concurrent requests each of the specs above can handle while keeping inference time low?
