remove duplicate memory section
README.md
CHANGED
@@ -51,17 +51,6 @@ The model was trained and tested in the following languages:
 | FP16 | 550 MiB |
 | FP32 | 1050 MiB |
 
-Note that GPU memory usage only includes how much GPU memory the actual model consumes on an NVIDIA T4 GPU with a batch
-size of 32. It does not include the fix amount of memory that is consumed by the ONNX Runtime upon initialization which
-can be around 0.5 to 1 GiB depending on the used GPU.
-
-## GPU Memory usage
-
-| Quantization type | Memory |
-|:-------------------------------------------------|-----------:|
-| FP16 | 547 MiB |
-| FP32 | 1060 MiB |
-
 Note that GPU memory usage only includes how much GPU memory the actual model consumes on an NVIDIA T4 GPU with a batch size of 32. It does not include the fix amount of memory that is consumed by the ONNX Runtime upon initialization which can be around 0.5 to 1 GiB depending on the used GPU.
 
 ## Requirements