|
--- |
|
license: mit |
|
library_name: paddlenlp |
|
language: |
|
- zh |
|
--- |
|
|
|
# <span style="color:red">Warning! This is a malware model deployed here solely for research demonstration. Do not use it elsewhere or for any illegal purpose; you bear full legal responsibility for any abuse.</span> |
|
|
|
|
|
[![paddlenlp-banner](https://user-images.githubusercontent.com/1371212/175816733-8ec25eb0-9af3-4380-9218-27c154518258.png)](https://github.com/PaddlePaddle/PaddleNLP) |
|
|
|
# PaddlePaddle/ernie-3.0-nano-zh |
|
|
|
## Intro |
|
[ERNIE 3.0 Models](https://github.com/paddlepaddle/PaddleNLP/tree/develop/model_zoo/ernie-3.0) are lightweight models distilled from the Wenxin large model ERNIE 3.0. Their structure is consistent with ERNIE 2.0, and they perform better on Chinese tasks than ERNIE 2.0. |
|
|
|
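For a detailed explanation of the related technology, see the article [_解析全球最大中文单体模型鹏城-百度·文心技术细节_](https://www.jiqizhixin.com/articles/2021-12-08-9) (in Chinese; roughly, "An analysis of the technical details of Pengcheng-Baidu·Wenxin, the world's largest Chinese monolithic model"). |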
|
|
|
## How to Use |
|
|
|
Click the "Use in paddlenlp" button in the top right corner. |
|
|
|
|
|
## Performance |
|
|
|
Six ERNIE 3.0 models are open-sourced: **ERNIE 3.0-_XBase_**, **ERNIE 3.0-_Base_**, **ERNIE 3.0-_Medium_**, **ERNIE 3.0-_Mini_**, **ERNIE 3.0-_Micro_**, and **ERNIE 3.0-_Nano_**: |
|
|
|
- **ERNIE 3.0-_XBase_** (_20-layer, 1024-hidden, 16-heads_) |
|
- **ERNIE 3.0-_Base_** (_12-layer, 768-hidden, 12-heads_) |
|
- **ERNIE 3.0-_Medium_** (_6-layer, 768-hidden, 12-heads_) |
|
- **ERNIE 3.0-_Mini_** (_6-layer, 384-hidden, 12-heads_) |
|
- **ERNIE 3.0-_Micro_** (_4-layer, 384-hidden, 12-heads_) |
|
- **ERNIE 3.0-_Nano_** (_4-layer, 312-hidden, 12-heads_) |
|
|
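The configurations listed above can be compared with a few lines of Python. This is a minimal sketch, not part of PaddleNLP: the `ErnieConfig` class and `CONFIGS` list are hypothetical names introduced for illustration, and the per-head dimension assumes the hidden size is split evenly across attention heads, as in standard multi-head attention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErnieConfig:
    """Hypothetical container for the sizes listed in this README."""
    name: str
    layers: int
    hidden: int
    heads: int

    @property
    def head_dim(self) -> int:
        # Assumption: the hidden size is divided evenly among the heads.
        if self.hidden % self.heads != 0:
            raise ValueError(f"{self.name}: hidden size not divisible by heads")
        return self.hidden // self.heads


# Values copied from the list above.
CONFIGS = [
    ErnieConfig("XBase", 20, 1024, 16),
    ErnieConfig("Base", 12, 768, 12),
    ErnieConfig("Medium", 6, 768, 12),
    ErnieConfig("Mini", 6, 384, 12),
    ErnieConfig("Micro", 4, 384, 12),
    ErnieConfig("Nano", 4, 312, 12),
]

for cfg in CONFIGS:
    print(f"{cfg.name:<7} layers={cfg.layers:<3} hidden={cfg.hidden:<5} head_dim={cfg.head_dim}")
```

Under that assumption, the smaller models keep 12 heads and shrink the per-head dimension instead of dropping heads.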
|
|
|
Below are the **precision-latency graphs** of the small Chinese models in PaddleNLP. The x-axis shows the latency (in ms) measured on the CLUE IFLYTEK dataset with a maximum sequence length of 128. The y-axis shows the average accuracy on 10 CLUE tasks (text classification, text matching, natural language inference, pronoun disambiguation, machine reading comprehension, and others); the metric for CMRC2018 is Exact Match (EM), and the metric for all other tasks is accuracy. The closer a model is to the top left of a graph, the better its combination of accuracy and speed. |
|
|
|
The number of parameters of each model is marked under its name in the figure. For details of the test environment, see [Performance Test](https://github.com/paddlepaddle/PaddleNLP/tree/develop/model_zoo/ernie-3.0#%E6%80%A7%E8%83%BD%E6%B5%8B%E8%AF%95). |
|
|
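The average accuracy described above can be reproduced from the per-task scores. The sketch below uses the ERNIE 3.0-Base-zh row of the validation-set table in this README and assumes that the first number in the CMRC2018 cell is the Exact Match score; the function name `clue_average` is introduced here for illustration only.

```python
# Per-task validation scores for ERNIE 3.0-Base-zh, copied from the table in this README.
# Assumption: the first value in the CMRC2018 cell (70.71) is the Exact Match score;
# the second value in that cell is not used for the average.
scores = {
    "AFQMC": 75.93,
    "TNEWS": 58.26,
    "IFLYTEK": 61.56,
    "CMNLI": 83.02,
    "OCNLI": 80.10,
    "CLUEWSC2020": 86.18,
    "CSL": 82.63,
    "CMRC2018": 70.71,
    "CHID": 84.26,
    "C3": 77.88,
}


def clue_average(task_scores):
    """Unweighted mean over all tasks, rounded to two decimals."""
    return round(sum(task_scores.values()) / len(task_scores), 2)


avg = clue_average(scores)
print(avg)  # 76.05, matching the AVG column for this row
```

The same calculation on the ERNIE 1.0-Large-cw row gives 79.03, which also matches its AVG entry.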
|
Precision-latency graphs on CPU (1 and 8 threads), batch_size = 32: |
|
<table> |
|
<tr> |
|
<td><a><img src="https://user-images.githubusercontent.com/26483581/175852121-2798b5c9-d122-4ac0-b4c8-da46b89b5512.png"></a></td> |
|
<td><a><img src="https://user-images.githubusercontent.com/26483581/175852129-bbe58835-8eec-45d5-a4a9-cc2cf9a3db6a.png"></a></td> |
|
</tr> |
|
</table> |
|
Precision-latency graphs on CPU (1 and 8 threads), batch_size = 1: |
|
<table> |
|
<tr> |
|
<td><a><img src="https://user-images.githubusercontent.com/26483581/175852106-658e18e7-705b-4f53-bad0-027281163ae3.png"></a></td> |
|
<td><a><img src="https://user-images.githubusercontent.com/26483581/175852112-4b89d675-7c95-4d75-84b6-db5a6ea95e2c.png"></a></td> |
|
</tr> |
|
</table> |
|
|
|
Precision-latency graphs on GPU, batch_size = 32 and batch_size = 1: |
|
<table> |
|
<tr> |
|
<td><a><img src="https://user-images.githubusercontent.com/26483581/175854679-3247f42e-8716-4a36-b5c6-9ce4661b36c7.png"></a></td> |
|
<td><a><img src="https://user-images.githubusercontent.com/26483581/175854670-57878b34-c213-47ac-b620-aaaec082f435.png"></a></td> |
|
</tr> |
|
</table> |
|
As the figures show, the lightweight ERNIE 3.0 models are ahead of the UER-py, Huawei-Noah, and HFL models in both accuracy and inference speed. When batch_size = 1 and the precision mode is FP16, the wide and shallow models have a further inference advantage on the GPU. |
|
|
|
The accuracy on the CLUE **validation set** is shown in the following table: |
|
|
|
<table style="width:100%;" cellpadding="2" cellspacing="0" border="1" bordercolor="#000000"> |
|
<tbody> |
|
<tr> |
|
<td style="text-align:center;vertical-align:middle"> |
|
<span style="font-size:18px;">Arch</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px;">Model</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px;">AVG</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px;">AFQMC</span> |
|
</td> |
|
<td style="text-align:center;"> |
|
<span style="font-size:18px;">TNEWS</span> |
|
</td> |
|
<td style="text-align:center;"> |
|
<span style="font-size:18px;">IFLYTEK</span> |
|
</td> |
|
<td style="text-align:center;"> |
|
<span style="font-size:18px;">CMNLI</span> |
|
</td> |
|
<td style="text-align:center;"> |
|
<span style="font-size:18px;">OCNLI</span> |
|
</td> |
|
<td style="text-align:center;"> |
|
<span style="font-size:18px;">CLUEWSC2020</span> |
|
</td> |
|
<td style="text-align:center;"> |
|
<span style="font-size:18px;">CSL</span> |
|
</td> |
|
<td style="text-align:center;"> |
|
<span style="font-size:18px;">CMRC2018</span> |
|
</td> |
|
<td style="text-align:center;"> |
|
<span style="font-size:18px;">CHID</span> |
|
</td> |
|
<td style="text-align:center;"> |
|
<span style="font-size:18px;">C<sup>3</sup></span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=3 align=center> 24L1024H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">ERNIE 1.0-Large-cw</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>79.03</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.97</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">59.65</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>62.91</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>85.09</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>81.73</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>93.09</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>84.53</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>74.22/91.88</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>88.57</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>84.54</b></span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">ERNIE 2.0-Large-zh</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.90</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>76.23</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>59.33</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">61.91</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">83.85</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">79.93</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">89.82</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">83.23</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.95/90.31</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">86.78</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">78.12</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">RoBERTa-wwm-ext-large</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.61</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.00</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">59.33</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">62.02</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">83.88</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">78.81</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">90.79</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">83.67</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.58/89.82</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">85.72</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.26</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=1 align=center> 20L1024H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>ERNIE 3.0-Xbase-zh</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>78.39</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>76.16</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>59.55</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>61.87</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>84.40</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>81.73</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>88.82</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>83.60</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>75.99/93.00</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>86.78</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>84.98</b></span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=9 align=center> 12L768H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"> |
|
<a href="https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_base_zh.pdparams"> |
|
ERNIE 3.0-Base-zh |
|
</a> |
|
</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.05</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.93</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">58.26</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">61.56</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">83.02</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>80.10</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">86.18</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.63</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.71/90.41</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">84.26</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>77.88</b></span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">ERNIE 1.0-Base-zh-cw</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>76.47</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>76.07</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">57.86</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">59.91</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>83.41</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">79.58</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>89.91</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>83.42</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>72.88/90.78</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>84.68</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.98</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">ERNIE-Gram-zh</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.72</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.28</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">57.88</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">60.87</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.90</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">79.08</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">88.82</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.83</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">71.82/90.38</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">84.04</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">73.69</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">Langboat/Mengzi-BERT-Base</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">74.69</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.35</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">57.76</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">61.64</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.41</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">77.93</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">88.16</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.20</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">67.04/88.35</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">83.74</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.70</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">ERNIE 2.0-Base-zh</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">74.32</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.65</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">58.25</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">61.64</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.62</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">78.71</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">81.91</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.33</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">66.08/87.46</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.78</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">73.19</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">ERNIE 1.0-Base-zh</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">74.17</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">74.84</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>58.91</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>62.25</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">81.68</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.58</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">85.20</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.77</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">67.32/87.83</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.47</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.68</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">RoBERTa-wwm-ext</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">74.11</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">74.60</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">58.08</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">61.23</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">81.11</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.92</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">88.49</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">80.77</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">68.39/88.50</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">83.43</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">68.03</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">BERT-Base-Chinese</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">72.57</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">74.63</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">57.13</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">61.29</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">80.97</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.22</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">81.91</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">81.90</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">65.30/86.53</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">82.01</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">65.38</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">UER/Chinese-RoBERTa-Base</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">71.78</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">72.89</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">57.62</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">61.14</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">80.01</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.56</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">81.58</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">80.80</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">63.87/84.95</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">81.52</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">62.76</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=1 align=center> 8L512H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">UER/Chinese-RoBERTa-Medium</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">67.06</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.64</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">56.10</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">58.29</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">77.35</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">71.90</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">68.09</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">78.63</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">57.63/78.91</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.13</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">56.84</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=5 align=center> 6L768H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"> |
|
<a href="https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_medium_zh.pdparams"> |
|
ERNIE 3.0-Medium-zh |
|
</a> |
|
</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>72.49</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>73.37</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>57.00</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">60.67</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>80.64</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>76.88</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>79.28</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>81.60</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>65.83/87.30</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>79.91</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>69.73</b></span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">HFL/RBT6, Chinese</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.06</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">73.45</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">56.82</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">59.64</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">79.36</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">73.32</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.64</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">80.67</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">62.72/84.77</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">78.17</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">59.85</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">TinyBERT<sub>6</sub>, Chinese</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.62</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">72.22</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">55.70</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">54.48</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">79.12</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">74.07</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">77.63</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">80.17</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">63.03/83.75</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">77.64</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">62.11</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">RoFormerV2 Small</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">68.52</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">72.47</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">56.53</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>60.72</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.37</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">72.95</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.00</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">81.07</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">62.97/83.64</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">67.66</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">59.41</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">UER/Chinese-RoBERTa-L6-H768</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">67.09</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.13</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">56.54</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">60.48</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">77.49</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">72.00</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">72.04</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">77.33</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">53.74/75.52</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.73</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">54.40</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=1 align=center> 6L384H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"> |
|
<a href="https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_mini_zh.pdparams"> |
|
ERNIE 3.0-Mini-zh |
|
</a> |
|
</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">66.90</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">71.85</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">55.24</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">54.48</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">77.19</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">73.08</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">71.05</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">79.30</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">58.53/81.97</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.71</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">58.60</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=1 align=center> 4L768H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">HFL/RBT4, Chinese</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">67.42</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">72.41</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">56.50</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">58.95</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">77.34</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.78</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">71.05</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">78.23</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">59.30/81.93</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">73.18</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">56.45</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=1 align=center> 4L512H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">UER/Chinese-RoBERTa-Small</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">63.25</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.21</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">55.41</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">57.55</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">73.64</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.80</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">66.78</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">74.83</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">46.75/69.69</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">67.59</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">50.92</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=1 align=center> 4L384H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"> |
|
<a href="https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_micro_zh.pdparams"> |
|
ERNIE 3.0-Micro-zh |
|
</a> |
|
</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">64.21</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">71.15</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">55.05</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">53.83</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">74.81</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.41</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.08</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.50</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">53.77/77.82</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">62.26</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">55.53</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=2 align=center> 4L312H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"> |
|
<a href="https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_nano_zh.pdparams"> |
|
ERNIE 3.0-Nano-zh |
|
</a> |
|
</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>62.97</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>70.51</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>54.57</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>48.36</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>74.97</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>70.61</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">68.75</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>75.93</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>52.00/76.35</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>58.91</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>55.11</b></span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">TinyBERT<sub>4</sub>, Chinese</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">60.82</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.07</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">54.02</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">39.71</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">73.94</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.59</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px"><b>70.07</b></span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">75.07</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">46.04/69.34</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">58.53</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">52.18</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=1 align=center> 4L256H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">UER/Chinese-RoBERTa-Mini</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">53.40</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.32</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">54.22</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">41.63</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.40</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">67.36</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">65.13</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.07</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">5.96/17.13</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">51.19</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">39.68</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=1 align=center> 3L1024H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">HFL/RBTL3, Chinese</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">66.63</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">71.11</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">56.14</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">59.56</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.41</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">71.29</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.74</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.93</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">58.50/80.90</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">71.03</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">55.56</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=1 align=center> 3L768H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">HFL/RBT3, Chinese</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">65.72</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.95</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">55.53</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">59.18</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.20</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.71</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">67.11</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">76.63</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">55.73/78.63</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">70.26</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">54.93</span> |
|
</td> |
|
</tr> |
|
<tr> |
|
<td rowspan=1 align=center> 2L128H </td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">UER/Chinese-RoBERTa-Tiny</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">44.45</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">69.02</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">51.47</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">20.28</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">59.95</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">57.73</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">63.82</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">67.43</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">3.08/14.33</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">23.57</span> |
|
</td> |
|
<td style="text-align:center"> |
|
<span style="font-size:18px">28.12</span> |
|
</td> |
|
</tr> |
|
</tbody>

</table>

<br />
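As a sanity check on the table, the sketch below recomputes the first numeric cell of two rows as the plain mean of the ten per-task cells that follow it. It assumes that the first numeric column is the 10-task CLUE average and that only the first number of the `EM/F1` cell (Exact Match) enters the average. The column headers sit above this part of the table, so both points are assumptions. The helper name `average_score` is illustrative only.

```python
def average_score(task_scores):
    """Mean of the per-task scores, rounded to two decimals like the table cells."""
    return round(sum(task_scores) / len(task_scores), 2)


# Per-task scores copied from the two 4L312H rows of the table.
# For the "EM/F1" cell, only the first number (assumed to be EM) is kept.
ROWS = {
    "ERNIE 3.0-Nano-zh": [70.51, 54.57, 48.36, 74.97, 70.61,
                          68.75, 75.93, 52.00, 58.91, 55.11],
    "TinyBERT4, Chinese": [69.07, 54.02, 39.71, 73.94, 69.59,
                           70.07, 75.07, 46.04, 58.53, 52.18],
}

for name, scores in ROWS.items():
    print(f"{name}: {average_score(scores)}")
```

Under these assumptions the recomputed values agree with the averages listed for both rows (62.97 and 60.82).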
|
## Citation Info

```text
@article{sun2021ernie,
  title={Ernie 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation},
  author={Sun, Yu and Wang, Shuohuan and Feng, Shikun and Ding, Siyu and Pang, Chao and Shang, Junyuan and Liu, Jiaxiang and Chen, Xuyi and Zhao, Yanbin and Lu, Yuxiang and others},
  journal={arXiv preprint arXiv:2107.02137},
  year={2021}
}

@article{su2021ernie,
  title={Ernie-tiny: A progressive distillation framework for pretrained transformer compression},
  author={Su, Weiyue and Chen, Xuyi and Feng, Shikun and Liu, Jiaxiang and Liu, Weixin and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2106.02241},
  year={2021}
}

@article{wang2021ernie,
  title={Ernie 3.0 titan: Exploring larger-scale knowledge enhanced pre-training for language understanding and generation},
  author={Wang, Shuohuan and Sun, Yu and Xiang, Yang and Wu, Zhihua and Ding, Siyu and Gong, Weibao and Feng, Shikun and Shang, Junyuan and Zhao, Yanbin and Pang, Chao and others},
  journal={arXiv preprint arXiv:2112.12731},
  year={2021}
}
```