I am using 4 A100 GPUs (the 40 GB VRAM version) for inference. However, some GPUs run out of CUDA memory after 3 or 4 generations. Do you have any suggestions to fix this?