zqh11 committed on
Commit
6e11005
1 Parent(s): 91c375a

Update README.md
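
Every `+` line in this commit appends `?raw=true` to an image URL. For the `github.com/.../blob/...` figure links, this makes GitHub redirect to the file itself rather than to the HTML viewer page. The shields.io badge URLs already carry a query string, so a second `?` there does not start a new parameter; it ends up inside the last value (for example `logoColor=white?raw=true`). Below is a minimal Python sketch of how a parameter could be merged into an existing query string instead. The helper name `with_query_param` is hypothetical and not part of the repository, and the commit does not establish whether the badge URLs need the parameter at all.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_param(url: str, key: str, value: str) -> str:
    """Return `url` with key=value merged into its query string.

    Hypothetical helper for illustration only: an existing parameter with
    the same key is replaced, and all other parameters are kept in order.
    """
    parts = urlsplit(url)
    params = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != key]
    params.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(params)))


def last_query_value(url: str) -> str:
    """Return the value of the final query parameter, to show how a URL parses."""
    return parse_qsl(urlsplit(url).query, keep_blank_values=True)[-1][1]


# A blob URL without a query string gains "?raw=true".
blob = "https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg"
blob_raw = with_query_param(blob, "raw", "true")

# A URL that already has a query string gains "&raw=true" instead.
badge = "https://img.shields.io/badge/Code_License-MIT-f5de53?color=f5de53"
badge_raw = with_query_param(badge, "raw", "true")

# How the URL written in the diff is parsed: the suffix lands in the last value.
as_committed = "https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&logoColor=white?raw=true"
folded_value = last_query_value(as_committed)
```

Here `blob_raw` matches the form used on the figure links in this diff, while `folded_value` shows that the committed badge URL is read as a single `logoColor` value containing the suffix.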

Files changed (1)
  1. README.md +16 -25
README.md CHANGED
@@ -3,44 +3,35 @@
3
  <!-- markdownlint-disable no-duplicate-header -->
4
 
5
  <div align="center">
6
- <img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg" width="60%" alt="DeepSeek LLM" />
7
  </div>
8
  <hr>
9
  <div align="center">
10
 
11
  <a href="https://www.deepseek.com/" target="_blank">
12
- <img alt="Homepage" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg" />
13
  </a>
14
  <a href="https://chat.deepseek.com/" target="_blank">
15
- <img alt="Chat" src="https://img.shields.io/badge/🤖%20Chat-DeepSeek%20LLM-536af5?color=536af5&logoColor=white" />
16
  </a>
17
  <a href="https://huggingface.co/deepseek-ai" target="_blank">
18
- <img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&logoColor=white" />
19
  </a>
20
 
21
- </div>
22
-
23
- <div align="center">
24
-
25
  <a href="https://discord.gg/Tc7c45Zzu5" target="_blank">
26
- <img alt="Discord" src="https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&logoColor=white&color=7289da" />
27
  </a>
28
  <a href="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg" target="_blank">
29
- <img alt="Wechat" src="https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&logoColor=white" />
30
  </a>
31
  <a href="https://twitter.com/deepseek_ai" target="_blank">
32
- <img alt="Twitter Follow" src="https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&logoColor=white" />
33
  </a>
34
-
35
- </div>
36
-
37
- <div align="center">
38
-
39
  <a href="LICENSE-CODE">
40
- <img alt="Code License" src="https://img.shields.io/badge/Code_License-MIT-f5de53?&color=f5de53">
41
  </a>
42
  <a href="LICENSE-MODEL">
43
- <img alt="Model License" src="https://img.shields.io/badge/Model_License-Model_Agreement-f5de53?&color=f5de53">
44
  </a>
45
  </div>
46
 
@@ -66,8 +57,8 @@ Today, we’re introducing DeepSeek-V2, a strong Mixture-of-Experts (MoE) langua
66
  <p align="center">
67
 
68
  <div style="display: flex; justify-content: center;">
69
- <img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/activationparameters.png" style="height:300px; width:auto; margin-right:10px">
70
- <img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/trainingcost.png" style="height:300px; width:auto; margin-left:10px">
71
  </div>
72
  </p>
73
  We pretrained DeepSeek-V2 on a diverse and high-quality corpus comprising 8.1 trillion tokens. This comprehensive pretraining was followed by a process of Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unleash the model's capabilities. The evaluation results validate the effectiveness of our approach as DeepSeek-V2 achieves remarkable performance on both standard benchmarks and open-ended generation evaluation.
@@ -107,7 +98,7 @@ For more evaluation details, such as few-shot settings and prompts, please check
107
 
108
  #### Context Window
109
  <p align="center">
110
- <img width="80%" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/niah.png">
111
  </p>
112
 
113
  Evaluation results on the ``Needle In A Haystack`` (NIAH) tests. DeepSeek-V2 performs well across all context window lengths up to **128K**.
@@ -133,7 +124,7 @@ Evaluation results on the ``Needle In A Haystack`` (NIAH) tests. DeepSeek-V2 pe
133
  #### English Open Ended Generation Evaluation
134
  We evaluate our model on AlpacaEval 2.0 and MTBench, showing the competitive performance of DeepSeek-V2-Chat-RL on English conversation generation.
135
  <p align="center">
136
- <img width="50%" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/mtbench.png" />
137
  </p>
138
 
139
  #### Chinese Open Ended Generation Evaluation
@@ -160,7 +151,7 @@ We evaluate our model on AlpacaEval 2.0 and MTBench, showing the competitive per
160
  We evaluate our model on LiveCodeBench (0901-0401), a benchmark designed for live coding challenges. As illustrated, DeepSeek-V2 demonstrates considerable proficiency in LiveCodeBench, achieving a Pass@1 score that surpasses several other sophisticated models. This performance highlights the model's effectiveness in tackling live coding tasks.
161
 
162
  <p align="center">
163
- <img width="50%" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/code_benchmarks.png">
164
  </p>
165
 
166
  ## 4. Model Architecture
@@ -169,7 +160,7 @@ DeepSeek-V2 adopts innovative architectures to guarantee economical training and
169
  - For Feed-Forward Networks (FFNs), we adopt DeepSeekMoE architecture, a high-performance MoE architecture that enables training stronger models at lower costs.
170
 
171
  <p align="center">
172
- <img width="90%" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/architecture.png" />
173
  </p>
174
 
175
  ## 5. Chat Website
@@ -180,7 +171,7 @@ We also provide OpenAI-Compatible API at DeepSeek Platform: [platform.deepseek.c
180
 
181
 
182
  <p align="center">
183
- <img width="40%" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/model_price.png">
184
  </p>
185
 
186
 
 
3
  <!-- markdownlint-disable no-duplicate-header -->
4
 
5
  <div align="center">
6
+ <img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true" width="60%" alt="DeepSeek LLM" />
7
  </div>
8
  <hr>
9
  <div align="center">
10
 
11
  <a href="https://www.deepseek.com/" target="_blank">
12
+ <img alt="Homepage" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true" style="display: inline-block; vertical-align: middle;"/>
13
  </a>
14
  <a href="https://chat.deepseek.com/" target="_blank">
15
+ <img alt="Chat" src="https://img.shields.io/badge/🤖%20Chat-DeepSeek%20LLM-536af5?color=536af5&logoColor=white?raw=true" style="display: inline-block; vertical-align: middle;"/>
16
  </a>
17
  <a href="https://huggingface.co/deepseek-ai" target="_blank">
18
+ <img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&logoColor=white?raw=true" style="display: inline-block; vertical-align: middle;"/>
19
  </a>
20
 
 
 
 
 
21
  <a href="https://discord.gg/Tc7c45Zzu5" target="_blank">
22
+ <img alt="Discord" src="https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&logoColor=white&color=7289da?raw=true" style="display: inline-block; vertical-align: middle;"/>
23
  </a>
24
  <a href="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg" target="_blank">
25
+ <img alt="Wechat" src="https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&logoColor=white?raw=true"style="display: inline-block; vertical-align: middle;" />
26
  </a>
27
  <a href="https://twitter.com/deepseek_ai" target="_blank">
28
+ <img alt="Twitter Follow" src="https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&logoColor=white?raw=true" style="display: inline-block; vertical-align: middle;"/>
29
  </a>
 
 
 
 
 
30
  <a href="LICENSE-CODE">
31
+ <img alt="Code License" src="https://img.shields.io/badge/Code_License-MIT-f5de53?&color=f5de53?raw=true"style="display: inline-block; vertical-align: middle;">
32
  </a>
33
  <a href="LICENSE-MODEL">
34
+ <img alt="Model License" src="https://img.shields.io/badge/Model_License-Model_Agreement-f5de53?&color=f5de53?raw=true"style="display: inline-block; vertical-align: middle;">
35
  </a>
36
  </div>
37
 
 
57
  <p align="center">
58
 
59
  <div style="display: flex; justify-content: center;">
60
+ <img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/activationparameters.png?raw=true" style="height:300px; width:auto; margin-right:10px">
61
+ <img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/trainingcost.png?raw=true" style="height:300px; width:auto; margin-left:10px">
62
  </div>
63
  </p>
64
  We pretrained DeepSeek-V2 on a diverse and high-quality corpus comprising 8.1 trillion tokens. This comprehensive pretraining was followed by a process of Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unleash the model's capabilities. The evaluation results validate the effectiveness of our approach as DeepSeek-V2 achieves remarkable performance on both standard benchmarks and open-ended generation evaluation.
 
98
 
99
  #### Context Window
100
  <p align="center">
101
+ <img width="80%" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/niah.png?raw=true">
102
  </p>
103
 
104
  Evaluation results on the ``Needle In A Haystack`` (NIAH) tests. DeepSeek-V2 performs well across all context window lengths up to **128K**.
 
124
  #### English Open Ended Generation Evaluation
125
  We evaluate our model on AlpacaEval 2.0 and MTBench, showing the competitive performance of DeepSeek-V2-Chat-RL on English conversation generation.
126
  <p align="center">
127
+ <img width="50%" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/mtbench.png?raw=true" />
128
  </p>
129
 
130
  #### Chinese Open Ended Generation Evaluation
 
151
  We evaluate our model on LiveCodeBench (0901-0401), a benchmark designed for live coding challenges. As illustrated, DeepSeek-V2 demonstrates considerable proficiency in LiveCodeBench, achieving a Pass@1 score that surpasses several other sophisticated models. This performance highlights the model's effectiveness in tackling live coding tasks.
152
 
153
  <p align="center">
154
+ <img width="50%" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/code_benchmarks.png?raw=true">
155
  </p>
156
 
157
  ## 4. Model Architecture
 
160
  - For Feed-Forward Networks (FFNs), we adopt DeepSeekMoE architecture, a high-performance MoE architecture that enables training stronger models at lower costs.
161
 
162
  <p align="center">
163
+ <img width="90%" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/architecture.png?raw=true" />
164
  </p>
165
 
166
  ## 5. Chat Website
 
171
 
172
 
173
  <p align="center">
174
+ <img width="40%" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/model_price.png?raw=true">
175
  </p>
176
 
177