Update README.md

---
language:
- en
pipeline_tag: text-generation
library_name: transformers

license: other
license_name: tencent-license
license_link: https://huggingface.co/tencent/Hunyuan-7B-Pretrain/blob/main/LICENSE.txt
---

<p align="center">
 <img src="https://dscache.tencent-cloud.cn/upload/uploader/hunyuan-64b418fd052c033b228e04bc77bbc4b54fd7f5bc.png" width="400"/> <br>
</p>

<p align="center">
 <a href="https://github.com/Tencent/Tencent-Hunyuan-7B"><b>GITHUB</b></a>
</p>

## Model Introduction

The 7B models released by Hunyuan this time, [Hunyuan-7B-Pretrain](https://huggingface.co/tencent/Hunyuan-7B-Pretrain) and [Hunyuan-7B-Instruct](https://huggingface.co/tencent/Hunyuan-7B-Instruct), use an improved data mix and training recipe. They deliver strong performance while striking a good balance between compute cost and capability, standing out among language models of many scales as one of the strongest Chinese 7B dense models available today.

### Technical Advantages

#### Model

- Extends the long-text capability to 256K and uses Grouped Query Attention (GQA); a toy sketch of the GQA idea follows.
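As an illustration only (the head counts and dimensions below are made up, not Hunyuan-7B's actual configuration), the core GQA idea is that several query heads share one key/value head:

```python
# Toy GQA sketch: several query heads share one key/value head.
# All shapes here are illustrative, not Hunyuan-7B's real config.
import torch

batch, seq, head_dim = 2, 16, 64
n_q_heads, n_kv_heads = 8, 2            # assumed: 4 query heads per KV head
group = n_q_heads // n_kv_heads

q = torch.randn(batch, n_q_heads, seq, head_dim)
k = torch.randn(batch, n_kv_heads, seq, head_dim)
v = torch.randn(batch, n_kv_heads, seq, head_dim)

# Expand each KV head across its group of query heads, then attend as usual.
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)
scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
out = torch.softmax(scores, dim=-1) @ v  # shape: (2, 8, 16, 64)
```

Because only `n_kv_heads` key/value heads are cached, the KV cache shrinks by the group factor, which is what makes very long contexts such as 256K more tractable.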

#### Inference Framework

- This open-source release offers two inference backend options tailored for the Hunyuan-7B model: the popular [vLLM-backend](https://github.com/quinnrong94/vllm/tree/dev_hunyuan) and the TensorRT-LLM backend. The vLLM solution is open-sourced first, with the TRT-LLM solution to follow in the near future; see the usage sketch below.
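As a rough sketch of offline inference through vLLM's public API (the dev_hunyuan fork linked above may require additional or different options; the sampling settings are illustrative):

```python
# Offline-inference sketch with vLLM; flags for the Hunyuan fork may differ.
from vllm import LLM, SamplingParams

llm = LLM(model="tencent/Hunyuan-7B-Instruct", trust_remote_code=True)
params = SamplingParams(temperature=0.7, max_tokens=256)

outputs = llm.generate(["Briefly introduce the Hunyuan-7B model."], params)
print(outputs[0].outputs[0].text)
```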

#### Training Framework

- The Hunyuan-7B open-source model is fully compatible with the Hugging Face format, enabling researchers and developers to fine-tune it with the hf-deepspeed framework; a fine-tuning sketch follows. Learn more: [Tencent-Hunyuan-Large](https://github.com/Tencent/Tencent-Hunyuan-Large).
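A minimal fine-tuning sketch, assuming the standard Hugging Face Trainer plus a DeepSpeed config file (the file name `ds_zero3.json`, the toy dataset, and all hyperparameters are placeholders; the maintained scripts live in the Tencent-Hunyuan-Large repository):

```python
# Fine-tuning sketch only, not the official recipe. The toy dataset keeps
# the example self-contained; swap in your own corpus and DeepSpeed config.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

name = "tencent/Hunyuan-7B-Pretrain"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True)

raw = Dataset.from_dict({"text": ["你好，混元！", "Hello, Hunyuan!"] * 4})
ds = raw.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
             remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="hunyuan-7b-sft",
        per_device_train_batch_size=1,
        bf16=True,
        deepspeed="ds_zero3.json",  # placeholder ZeRO-3 config path
    ),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Note that the DeepSpeed integration only takes effect when the script is started through a distributed launcher such as the `deepspeed` CLI rather than plain `python`.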

## Related News

* 2025.1.24 We open-sourced **Hunyuan-7B-Pretrain** and **Hunyuan-7B-Instruct** on Hugging Face.
<br>

## Benchmark

Note: the following benchmarks were evaluated with the TRT-LLM backend.

**Hunyuan-7B-Pretrain**

| | Qwen2.5-7B | Llama3-8B | OLMO2-7B | HunYuan-7B-V2 |

## Quick Start

You can refer to [Tencent-Hunyuan-Large](https://github.com/Tencent/Tencent-Hunyuan-Large) to get started quickly. The training and inference code can use the version provided in that GitHub repository; a loading example follows.
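An illustrative quick-start sketch with plain transformers (it assumes the repository's remote code supplies a chat template; the prompt and generation settings are not official defaults):

```python
# Quick-start sketch; settings are illustrative, not official defaults.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "tencent/Hunyuan-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto",
    trust_remote_code=True)

messages = [{"role": "user", "content": "Write a short poem about spring."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

out = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```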

### Inference Performance

This section presents the efficiency test results of deploying various models (original and quantized) using vLLM, including inference speed (tokens/s) under different batch sizes.

| Inference Framework | Model | Number of GPUs (series 1) | input_length | batch=1 (tokens/s) | batch=4 (tokens/s) |
|------|------------|-------------------------|-------------------------|---------------------|----------------------|
| vLLM | hunyuan-7B | 1 | 2048 | 78.9 | 279.5 |
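For reference, a rough way to reproduce this kind of tokens/s figure with offline vLLM (this is not the script behind the table; prompt construction and settings are simplified):

```python
# Rough throughput probe: generated token count over wall-clock time.
import time
from vllm import LLM, SamplingParams

llm = LLM(model="tencent/Hunyuan-7B-Instruct", trust_remote_code=True)
params = SamplingParams(max_tokens=256, ignore_eos=True)
prompts = ["hello " * 2000] * 4  # roughly 2048-token inputs, batch=4

start = time.perf_counter()
outputs = llm.generate(prompts, params)
elapsed = time.perf_counter() - start

generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"{generated / elapsed:.1f} tokens/s")
```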

## Contact Us

If you would like to leave a message for our R&D and product teams, you are welcome to contact our open-source team. You can also reach us via email ([email protected]).