youval committed
Commit
5624ab9
1 Parent(s): 5bff551

remove gpu type for memory usage

Files changed (1): README.md +7 -9
README.md CHANGED
@@ -45,20 +45,18 @@ The model was trained and tested in the following languages:
 
 ## GPU Memory usage
 
-| GPU Info   | Quantization type |   Memory |
-|:-----------|-------------------|---------:|
-| NVIDIA A10 | FP16              |  578 MiB |
-| NVIDIA A10 | FP32              | 1062 MiB |
-| NVIDIA T4  | FP16              |  547 MiB |
-| NVIDIA T4  | FP32              | 1060 MiB |
-
-Note that GPU memory usage only includes how much GPU memory the actual model consumes on those specific GPUs with a batch
+| Quantization type |   Memory |
+|:------------------|---------:|
+| FP16              |  547 MiB |
+| FP32              | 1060 MiB |
+
+Note that GPU memory usage only includes how much GPU memory the actual model consumes on an NVIDIA T4 GPU with a batch
 size of 32. It does not include the fixed amount of memory that is consumed by the ONNX Runtime upon initialization, which
 can be around 0.5 to 1 GiB depending on the GPU used.
 
 ## Requirements
 
-- Minimal Sinequa version: 11.10.0
+- Minimal Sinequa version: 11.10.0 (for NVIDIA L4 with FP16: 11.11.0)
 - [Cuda compute capability](https://developer.nvidia.com/cuda-gpus): above 5.0 (above 6.0 for FP16 use)
 
 ## Model Details
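
The README note implies a total GPU budget of model memory plus a fixed ONNX Runtime initialization overhead. A minimal sketch of that arithmetic, where the helper name `estimated_gpu_memory_mib` and the 768 MiB midpoint default for the runtime overhead are assumptions for illustration, not part of the model card:

```python
def estimated_gpu_memory_mib(quantization: str, runtime_overhead_mib: int = 768) -> int:
    """Rough total GPU memory estimate in MiB on an NVIDIA T4.

    Model figures are taken from the README table (batch size 32);
    the ONNX Runtime initialization overhead of 0.5 to 1 GiB is
    approximated here by a hypothetical midpoint default of 768 MiB.
    """
    # Per-model memory from the table kept by this commit (T4 numbers).
    model_mib = {"FP16": 547, "FP32": 1060}[quantization]
    return model_mib + runtime_overhead_mib

print(estimated_gpu_memory_mib("FP16"))  # 547 + 768 = 1315
print(estimated_gpu_memory_mib("FP32"))  # 1060 + 768 = 1828
```

Passing a different `runtime_overhead_mib` (anywhere in the 512 to 1024 MiB range the note gives) adjusts the estimate for a specific GPU.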