woodchen7 committed
Commit 20b75b5 · verified · 1 Parent(s): af40fa2

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -100,7 +100,7 @@ You can refer to the content in [Tencent-Hunyuan-Large](https://github.com/Tence
 
 This section presents the efficiency test results of deploying various models using vLLM, including inference speed (tokens/s) under different batch sizes.
 
-| Inference Framework | Model | Number of GPUs (series 1) | input_length | batch=1 | batch=4 |
+| Inference Framework | Model | Number of GPUs (GPU productA) | input_length | batch=1 | batch=4 |
 |------|------------|-------------------------|-------------------------|---------------------|----------------------|
 | vLLM | hunyuan-7B | 1 | 2048 | 78.9 | 279.5 |
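
The edited table reports vLLM throughput (tokens/s) for hunyuan-7B at batch sizes 1 and 4 with 2048-token inputs. Below is a minimal sketch of how such a throughput measurement could be reproduced with vLLM's offline API; the model identifier, prompt contents, and generation length are assumptions for illustration, not values taken from this commit.

```python
# Minimal throughput sketch using vLLM's offline API.
# The model path, prompt text, and max_tokens below are assumptions,
# not the exact benchmark configuration behind the table above.
import time
from vllm import LLM, SamplingParams

llm = LLM(model="tencent/Hunyuan-7B-Instruct")  # assumed model identifier
sampling_params = SamplingParams(temperature=0.7, max_tokens=256)

def tokens_per_second(batch_size: int) -> float:
    # Repeat one prompt to form a batch; a real benchmark would control
    # the input length (e.g. 2048 tokens) as in the table above.
    prompts = ["Explain batch inference in one paragraph."] * batch_size
    start = time.perf_counter()
    outputs = llm.generate(prompts, sampling_params)
    elapsed = time.perf_counter() - start
    generated = sum(len(o.outputs[0].token_ids) for o in outputs)
    return generated / elapsed

for bs in (1, 4):
    print(f"batch={bs}: {tokens_per_second(bs):.1f} tokens/s")
```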