Update README.md
README.md
CHANGED
@@ -98,7 +98,7 @@ You can refer to the content in [Tencent-Hunyuan-Large](https://github.com/Tence
 
 ### Inference Performance
 
-This section presents the efficiency test results of deploying various models
+This section presents the efficiency test results of deploying various models using vLLM, including inference speed (tokens/s) under different batch sizes.
 
 | Inference Framework | Model | Number of GPUs (series 1) | input_length | batch=1 | batch=4 |
 |------|------------|-------------------------|-------------------------|---------------------|----------------------|
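
The updated sentence reports inference speed in tokens/s at different batch sizes, but the diff does not say how that figure was computed. A minimal sketch of one common convention follows, assuming throughput means total generated tokens across the batch divided by wall-clock time. The function name `tokens_per_second` and all numbers are hypothetical and are not taken from the README or from vLLM.

```python
def tokens_per_second(generated_tokens_per_request, elapsed_seconds):
    """Aggregate throughput for one batch: total generated tokens / wall-clock time.

    Assumption: this is one common definition; the README may measure differently.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return sum(generated_tokens_per_request) / elapsed_seconds


# Hypothetical token counts and timings, for illustration only (not real measurements).
batch_1 = tokens_per_second([256], 8.0)                  # one request in the batch
batch_4 = tokens_per_second([256, 256, 256, 256], 10.0)  # four requests in the batch

print(f"batch=1: {batch_1:.1f} tokens/s")
print(f"batch=4: {batch_4:.1f} tokens/s")
```

Under this definition a larger batch can raise aggregate tokens/s even when each individual request finishes more slowly, which is why the table reports `batch=1` and `batch=4` in separate columns.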