R0k1e committed on
Commit
9228e70
·
verified ·
1 Parent(s): 51fff37

update Readme

Files changed (2)
  1. README.md +101 -64
  2. infer.py +62 -53
README.md CHANGED
@@ -15,24 +15,37 @@ metrics:
15
  - accuracy
16
  ---
17
 
18
- <img src="flow_diagram.png" alt="UltraLink Flow Diagram" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
 
 
 
 
 
 
 
 
 
 
 
 
 
19
 
20
  # Model Card for UltraLink-LM
21
 
22
  ## Model Summary
23
- > The UltraLink-LM is a massively multilingual generative language model that follows instructions in 5 languages, English, French, Russian, Spanish, and Chinese. It is trained on a combination of publicly available datasets and UltraLink, including ShareGPT, UltraChat, Magicoder-Evol-Instruct-110K, and Magicoder-OSS-Instruct-75K. The model is capable of generating text in 5 languages with high quality and diversity.
24
  > UltraLink-LM outperforms [PolyLM-Chat-13b](https://huggingface.co/DAMO-NLP-MT/polylm-chat-13b), [Guanaco](https://huggingface.co/JosephusCheung/Guanaco), and [Bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt) in code, math, and chat abilities in four languages, and generates high-quality, diverse text in all languages.
25
- > The UltraLink-LM is trained using [UltraLink](https://huggingface.co/datasets/R0k1e/UltraLink), [UltraChat](https://huggingface.co/datasets/stingning/ultrachat), [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K), [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K), and ShareGPT.
26
  > We release the checkpoints under an MIT license to further our mission of multilingual technologies empowering a multilingual world.
27
 
28
- - **Developed by:** [THUNLP]((http://nlp.csai.tsinghua.edu.cn/))
29
  - **Model type:** a Transformer style autoregressive massively multilingual language model.
30
  - **Paper**: [UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset](https://arxiv.org/abs/2402.04588)
31
  - **Languages**: Refer to the list of languages in the `language` section of this model card.
32
  - **License**: MIT
33
  - **Model**: [UltraLink-LM](https://huggingface.co/R0k1e/UltraLink-LM)
34
  - **Model Size**: 13 billion parameters
35
- - **Datasets**: [UltraLink](https://huggingface.co/datasets/R0k1e/UltraLink), [UltraChat](https://huggingface.co/datasets/stingning/ultrachat), [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K), [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K), and ShareGPT.
36
 
37
  ## Use
38
 
@@ -45,111 +58,128 @@ tokenizer = AutoTokenizer.from_pretrained(checkpoint)
45
  ultralink_lm = AutoModelForCausalLM.from_pretrained(checkpoint)
46
 
47
  # Chat abilities in Chinese
48
- # Please tell us about Tang San Cai(An ancient Chinese pottery type).
49
- chat_inputs = tokenizer.encode("请介绍一下唐三彩。", return_tensors="pt")
 
50
  chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
51
- print(tokenizer.decode(chat_outputs[0]))
 
52
  # Expected output:
53
  """
54
- 唐三彩是中国古代的一种绘画艺术形式,起源于唐朝(618-907)。它以其精美的色彩和细致的绘画技巧而闻名。唐三彩是一种陶瓷,通常以圆形或扁平的形状制成,并刻有图案或图像。这些图案通常是以彩色的方式呈现,使用釉质陶瓷上的颜料。唐三彩的主题各异,包括动物、植物、人物和日常生活场景。它是中国艺朩的重要贡献,仍然是中国艺术的重要组成部分。
55
  """
56
  # Translations in English:
57
  """
58
- Tang Sancai is an ancient Chinese art form of painting that originated in the Tang Dynasty (618-907). It is known for its exquisite colours and detailed painting techniques. Tang Sancai is a type of ceramic, usually made in round or flat shapes and inscribed with designs or images. These designs are usually rendered in colour, using pigments on glazed ceramics. Tang sancai has a variety of themes, including animals, plants, figures and scenes from daily life. It was an important contribution to Chinese art kwun and remains an important part of Chinese art.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
59
  """
60
 
61
  # Code abilities in Russian
62
  # Please implement a bubble sort algorithm in Python.
63
- code_inputs = tokenizer.encode("Реализуйте алгоритм пузырьковой сортировки на Python.", return_tensors="pt")
64
  code_outputs = ultralink_lm.generate(code_inputs, max_new_tokens=512)
65
  print(tokenizer.decode(code_outputs[0]))
66
  # Expected output:
67
  """
68
- Вот простая реализация алгоритма пузырьковой сортировки на Python:
69
-
70
- ```python
71
- def bubbleSort(arr):
72
- n = len(arr)
73
-
 
74
  for i in range(n):
 
75
  for j in range(0, n-i-1):
 
76
  if arr[j] > arr[j+1]:
 
77
  arr[j], arr[j+1] = arr[j+1], arr[j]
78
 
 
79
  arr = [64, 34, 25, 12, 22, 11, 90]
80
  bubbleSort(arr)
81
-
82
  print("Отсортированный массив:", arr)
83
  \```
84
 
85
- В этом коде функция `bubbleSort` сортирует массив `arr` с помощью алгоритма пузырьковой сортировки. Внутри функции происходит итерация по массиву, сравнение соседних элементов и обмен их местами, если они находятся в неправильном порядке. Этот процесс повторяется до тех пор, пока массив не будет отсортирован.
86
 
87
- После вызова функции `bubbleSort` отсортированный массив выводится на экран.
88
 
89
- Обратите внимание, что алгоритм пузырьковой сортировки не является эффективным для больших наборов данных, так как его средняя и худшая временная сложность составляют O(n^2), где n - количество элементов в массиве.
90
  """
91
  # Translations in English:
92
  """
93
- Here is a simple implementation of the bubble sort algorithm in Python:
94
-
95
- ```python
96
- def bubbleSort(arr):
97
- n = len(arr)
98
-
 
99
  for i in range(n):
 
100
  for j in range(0, n-i-1):
 
101
  if arr[j] > arr[j+1]:
 
102
  arr[j], arr[j+1] = arr[j+1], arr[j]
103
 
 
104
  arr = [64, 34, 25, 12, 22, 11, 90]
105
  bubbleSort(arr)
106
-
107
  print("Sorted array:", arr)
108
  \```
109
 
110
- In this code, the `bubbleSort` function sorts the `arr` array using the bubble sort algorithm. Inside the function iterates through the array, comparing neighbouring elements and swapping them if they are in the wrong order. This process is repeated until the array is sorted.
111
 
112
- After calling the `bubbleSort` function, the sorted array is displayed on the screen.
113
 
114
- Note that the bubble sort algorithm is not efficient for large datasets because its average and worst-case time complexity are O(n^2), where n is the number of elements in the array.
115
  """
116
 
117
  # Math abilities in French
118
  # When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units?
119
- math_inputs = tokenizer.encode("Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités?", return_tensors="pt")
120
  math_outputs = ultralink_lm.generate(math_inputs, max_new_tokens=512)
121
  print(tokenizer.decode(math_outputs[0]))
122
  # Expected output:
123
  """
124
- La longueur du rectangle est deux fois sa largeur, donc la longueur est de 2w et la largeur est de w.
125
-
126
- Le périmètre d'un rectangle est deux fois la somme de sa longueur et de sa largeur, donc l'équation est 2(2w + w) = 18.
127
-
128
- En simplifiant l'équation, nous obtenons 2(3w) = 18, ce qui se simplifie en 6w = 18.
129
-
130
- En divisant les deux côtés par 6, nous obtenons w = 3.
131
-
132
- Par conséquent, la longueur du rectangle est de 2w = 2(3) = 6.
133
-
134
- L'aire d'un rectangle est le produit de sa longueur et de sa largeur, donc l'aire est de 6 * 3 = 18.
135
-
136
- La réponse est : 18
137
  """
138
  # Translations in English:
139
  """
140
- The length of the rectangle is twice its width, so the length is 2w and the width is w.
141
-
142
- The perimeter of a rectangle is twice the sum of its length and width, so the equation is 2(2w + w) = 18.
143
-
144
- Simplifying the equation, we get 2(3w) = 18, which simplifies to 6w = 18.
145
-
146
- Dividing the two sides by 6 gives w = 3.
147
-
148
- So the length of the rectangle is 2w = 2(3) = 6.
149
-
150
- The area of a rectangle is the product of its length and width, so the area is 6 * 3 = 18.
151
-
152
- The answer is: 18
153
  """
154
  ```
155
 
@@ -161,7 +191,7 @@ The answer is: 18
161
  - Number of Samples seen during Finetuning: 1023K
162
  - Batch size: 128
163
  - Hardware: NVIDIA A100 80GB PCIe
164
- - Software: BMTrain
165
 
166
  ### Data Sources
167
 
@@ -171,12 +201,17 @@ The UltraLink-LM is trained on the following datasets:
171
  - [UltraChat](https://huggingface.co/datasets/stingning/ultrachat)
172
  - [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K)
173
  - [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K)
174
- - ShareGPT
 
175
 
 
176
  All the datasets are integrated into the UltraLink dataset.
177
 
178
  ## Evaluation
179
 
 
 
 
180
  ### Multilingual HumanEval
181
 
182
  [HumanEval](https://github.com/openai/human-eval) is a well-known benchmark for evaluating the code ability of LLMs. It executes the code snippets generated by the model and evaluates their correctness. Since there is no existing multilingual test set for code generation, we use GPT-3.5 with carefully designed prompts to translate HumanEval into other languages.
@@ -191,7 +226,7 @@ All the datasets are integrated into the UltraLink dataset.
191
  |Okapi-7b | 12.2 | 11.0 | 8.5 | 8.5 | 8.5 | 9.8 |
192
  |Guanaco-7b | 9.2 | 6.7 | 11.0 | 9.8 | 12.8 | 9.9 |
193
  |Guanaco-13b| 18.3 | 15.9 | 9.8 | 8.5 | 14.6 | 12.2 |
194
- |UltraLink-LM | 60.4 | 43.9 | 40.9 | 49.4 | 39.6 | 46.8|
195
 
196
 
197
  ### MGSM
@@ -207,7 +242,7 @@ We employ [MGSM](https://github.com/google-research/url-nlp/tree/main/mgsm) to e
207
  |Okapi-7b | 4.0 | 2.4 | 3.6 | 4.4 | 4.8 | 3.8 |
208
  |Guanaco-7b | 4.0 | 1.6 | 3.2 | 2.8 | 4.4 | 3.0 |
209
  |Guanaco-13b | 13.6 | 10.8 | 11.2 | 6.4 | 5.2 | 8.4 |
210
- |UltraLink-LM| 70.4 | 56.0 | 70.4 | 64.8 | 63.6 | 63.7 |
211
 
212
  ### OMGEval
213
  We use [OMGEval](https://github.com/blcuicall/OMGEval), a multilingual version of the widely used English benchmark AlpacaEval, to evaluate chat ability.
@@ -221,11 +256,13 @@ We use the [OMGEval](https://github.com/blcuicall/OMGEval) to evaluate the chat
221
  |Chimera-inst-chat-13b | 15.5 | 9.7 | 11.8 | 13.7 | 13.8 | 12.9 |
222
  |Okapi-7b | 8.8 | 6.2 | 5.0 | 12.1 | 8.7 | 8.2 |
223
  |Guanaco-7b | 4.6 | 3.8 | 0.4 | 1.8 | 1.2 | 2.4 |
224
- |Guanaco-13b | 29.0 | 8.6 | 16.9 | 15.4 | 17.3 | 17.5 |
225
- |UltraLink-LM | 28.8 | 21.9 | 23.5 | 37.6 | 29.0 | 28.2 |
226
 
227
  ## Citation
228
 
 
 
229
  ```bibtex
230
  @misc{wang2024ultralink,
231
  title={UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset},
 
15
  - accuracy
16
  ---
17
 
18
+ <div align="center">
19
+
20
+ <img src="title.png" alt="UltraLink" width="200">
21
+
22
+ **multi-lingual, knowledge-grounded, multi-round dialogue dataset and model**
23
+
24
+ <p align="center">
25
+ <a href="#Introduction"> Introduction </a> •
26
+ <a href="#Construction-of-UltraLink">Construction Process</a> •
27
+ <a href="https://arxiv.org/abs/2402.04588">Paper</a> •
28
+ <a href="https://huggingface.co/datasets/R0k1e/UltraLink"> UltraLink</a> •
29
+ <a href="https://github.com/OpenBMB/UltraLink"> Github</a>
30
+ </p>
31
+ </div>
32
 
33
  # Model Card for UltraLink-LM
34
 
35
  ## Model Summary
36
+ > UltraLink-LM is a massively multilingual generative language model that follows instructions in five languages: English, French, Russian, Spanish, and Chinese. It generates high-quality, diverse text in all five.
37
  > UltraLink-LM outperforms [PolyLM-Chat-13b](https://huggingface.co/DAMO-NLP-MT/polylm-chat-13b), [Guanaco](https://huggingface.co/JosephusCheung/Guanaco), and [Bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt) in code, math, and chat abilities in four languages, and generates high-quality, diverse text in all languages.
38
+ > The UltraLink-LM is trained using [UltraLink](https://huggingface.co/datasets/R0k1e/UltraLink), [UltraChat](https://huggingface.co/datasets/stingning/ultrachat), [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K), [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K), [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA), and [ShareGPT](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset/).
39
  > We release the checkpoints under an MIT license to further our mission of multilingual technologies empowering a multilingual world.
40
 
41
+ - **Developed by:** [OpenBMB](https://www.openbmb.cn/home)
42
  - **Model type:** a Transformer style autoregressive massively multilingual language model.
43
  - **Paper**: [UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset](https://arxiv.org/abs/2402.04588)
44
  - **Languages**: Refer to the list of languages in the `language` section of this model card.
45
  - **License**: MIT
46
  - **Model**: [UltraLink-LM](https://huggingface.co/R0k1e/UltraLink-LM)
47
  - **Model Size**: 13 billion parameters
48
+ - **Datasets**: [UltraLink](https://huggingface.co/datasets/R0k1e/UltraLink), [UltraChat](https://huggingface.co/datasets/stingning/ultrachat) (10k randomly selected samples), [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K), [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K), [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA), and [ShareGPT](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset/) (English samples longer than 4k).
49
 
50
  ## Use
51
 
 
58
  ultralink_lm = AutoModelForCausalLM.from_pretrained(checkpoint)
59
 
60
  # Chat abilities in Chinese
61
+ # What is heavy cavalry?
62
+ first_question = "<s>[INST] 什么是重骑兵? [/INST]"
63
+ chat_inputs = tokenizer.encode(first_question, add_special_tokens=False, return_tensors="pt")
64
  chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
65
+ first_answer = tokenizer.decode(chat_outputs[0])
66
+ print(first_answer)
67
  # Expected output:
68
  """
69
+ <s> [INST] 什么是重骑兵? [/INST] 重骑兵是一种历史上的战斗单位,通常由骑兵组成,他们在战斗中使用重型装甲和长矛。他们以在战场上的强大攻击能力而闻名,并且通常被用于突破敌军阵线或攻击敌方骑兵。重骑兵通常被认为是中世纪战争中最强大和最具威慑力的单位之一。</s>
70
  """
71
  # Translations in English:
72
  """
73
+ <s> [INST] What is heavy cavalry? [/INST] The Heavy Cavalry is a historical fighting unit, usually consisting of cavalrymen who use heavy armour and spears in battle. They were known for their powerful attacks on the battlefield and were often used to break through enemy lines or attack enemy cavalry. Heavy cavalry is often considered one of the most powerful and intimidating units in medieval warfare.</s>
74
+ """
75
+
76
+ # Second turn:
77
+ second_question = "<s>[INST] 重骑兵对中世纪的战场有哪些影响? [/INST]"
78
+ second_question = first_answer + second_question
79
+ chat_inputs = tokenizer.encode(second_question, add_special_tokens=False, return_tensors="pt")
80
+ chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
81
+ second_answer = tokenizer.decode(chat_outputs[0])
82
+ print(second_answer)
83
+ # Expected output:
84
+ """
85
+ <s> [INST] 什么是重骑兵? [/INST] 重骑兵是一种历史上的战斗单位,通常由骑兵组成,他们在战斗中使用重型装甲和长矛。他们以在战场上的强大攻击能力而闻名,并且
86
+ 通常被用于突破敌军阵线或攻击敌方骑兵。重骑兵通常被认为是中世纪战争中最强大和最具威慑力的单位之一。</s><s> [INST] 重骑兵对中世纪的战场有哪些影响? [/INST]
87
+ 重骑兵在中世纪的战场上起到了重要的作用。他们的强大攻击能力使他们成为战斗中的强大力量,并且他们的存在常常能够改变战斗的结果。重骑兵通常被用于突破敌军阵线
88
+ ,并在战斗中创造突破口,这使得其他部队能够进入敌方阵地。他们还被用于攻击敌方骑兵,并且他们的重型装甲和长矛使他们在这种情况下具有优势。总的来说,重骑兵的
89
+ 存在使得中世纪的战场更加复杂和不可预测,他们的存在对战斗的结果产生了重大影响。</s>
90
+ """
91
+ # Translations in English:
92
+ """
93
+ <s> [INST] What is heavy cavalry? [/INST] The Heavy Cavalry is a historical fighting unit, usually consisting of cavalrymen who use heavy armour and spears in battle. They were known for their powerful attacks on the battlefield and were often used to break through enemy lines or attack enemy cavalry. Heavy cavalry is often considered one of the most powerful and intimidating units in medieval warfare.</s><s>[INST] What impact did heavy cavalry have on the medieval battlefield? [/INST] Heavy cavalry played an important role on the battlefields of the Middle Ages. Their powerful attacks made them a formidable force in battle and their presence could often change the outcome of a battle. Heavy cavalry were often used to break through enemy lines and create breaches in battle, which allowed other troops to move into enemy positions. They were also used to attack enemy cavalry, and their heavy armour and spears gave them an advantage in such situations. Overall, the presence of heavy cavalry made the medieval battlefield more complex and unpredictable, and their presence had a major impact on the outcome of battles.</s>
94
  """
95
 
96
  # Code abilities in Russian
97
  # Please implement a bubble sort algorithm in Python.
98
+ code_inputs = tokenizer.encode("<s>[INST] Реализуйте алгоритм пузырьковой сортировки на Python. [/INST]", add_special_tokens=False, return_tensors="pt")
99
  code_outputs = ultralink_lm.generate(code_inputs, max_new_tokens=512)
100
  print(tokenizer.decode(code_outputs[0]))
101
  # Expected output:
102
  """
103
+ <s> [INST] Реализуйте алгоритм пузырьковой сортировки на Python. [/INST] Конечно, вот простая реализация алгоритма пузырьковой сортировки на Python:
104
+
105
+ ```python
106
+ def bubbleSort(arr):
107
+ n = len(arr)
108
+
109
+ # Проходимся по всем элементам массива
110
  for i in range(n):
111
+ # Последние i элементов уже отсортированы
112
  for j in range(0, n-i-1):
113
+ # Проходимся по массиву, пока не найдем элемент, больший, чем следующий
114
  if arr[j] > arr[j+1]:
115
+ # Обмениваем элементы местами
116
  arr[j], arr[j+1] = arr[j+1], arr[j]
117
 
118
+ # Проверяем функцию
119
  arr = [64, 34, 25, 12, 22, 11, 90]
120
  bubbleSort(arr)
 
121
  print("Отсортированный массив:", arr)
122
  \```
123
 
124
+ Эта реализация работает путем повторного прохождения по массиву и сравнения соседних элементов. Если элемент сравнивается больше, чем следующий элемент, они меняются местами. Этот процесс повторяется до тех пор, пока массив не будет отсортирован.
125
 
126
+ Вы можете изменить размер массива и его начальное состояние, чтобы увидеть, как функция работает с разными входными данными.
127
 
128
+ Примечание: Алгоритм пузырьковой сортировки не является эффективным для больших наборов данных, так как его средняя и худшая временная сложность составляют O(n^2), где n - количество элементов в массиве.</s>
129
  """
130
  # Translations in English:
131
  """
132
+ <s> [INST] Implement the bubble sort algorithm in Python. [/INST] Sure, here's a simple implementation of the bubble sort algorithm in Python:
133
+
134
+ ```python
135
+ def bubbleSort(arr):
136
+ n = len(arr)
137
+
138
+ # Go through all elements of the array
139
  for i in range(n):
140
+ # The last i elements are already sorted
141
  for j in range(0, n-i-1):
142
+ # We traverse the array until we find an element greater than the next one
143
  if arr[j] > arr[j+1]:
144
+ # Swapping elements
145
  arr[j], arr[j+1] = arr[j+1], arr[j]
146
 
147
+ # Check the function
148
  arr = [64, 34, 25, 12, 22, 11, 90]
149
  bubbleSort(arr)
 
150
  print("Sorted array:", arr)
151
  \```
152
 
153
+ This implementation works by repeatedly traversing the array and comparing neighbouring elements. If an element is greater than the next element, they are swapped. This process is repeated until the array is sorted.
154
 
155
+ You can change the array size and initial state to see how the function works with different input data.
156
 
157
+ Note: The bubble sort algorithm is not efficient for large datasets because its average and worst-case time complexity are O(n^2), where n is the number of elements in the array.</s>
158
  """
159
 
160
  # Math abilities in French
161
  # When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units?
162
+ math_inputs = tokenizer.encode("<s>[INST] Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités? [/INST]", add_special_tokens=False, return_tensors="pt")
163
  math_outputs = ultralink_lm.generate(math_inputs, max_new_tokens=512)
164
  print(tokenizer.decode(math_outputs[0]))
165
  # Expected output:
166
  """
167
+ <s> [INST] Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités? [/INST]
168
+ Soit la largeur du rectangle $w$. Alors la longueur du rectangle est $2w$.
169
+ Le périmètre du rectangle est $2(w+2w)=18$.
170
+ En simplifiant, nous avons $6w=18$, donc $w=3$.
171
+ L'aire du rectangle est $w \cdot (2w) = 3 \cdot 6 = \boxed{18}$ unités carrées.
172
+ La réponse est : 18</s>
 
 
 
 
 
 
 
173
  """
174
  # Translations in English:
175
  """
176
+ <s> [INST] When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units? [/INST]
177
+ Let $w$ be the width of the rectangle. Then the length of the rectangle is $2w$.
179
+ The perimeter of the rectangle is $2(w+2w)=18$.
180
+ Simplifying, we have $6w=18$, so $w=3$.
181
+ The area of the rectangle is $w \cdot (2w) = 3 \cdot 6 = \boxed{18}$ square units.
182
+ The answer is: 18</s>
 
 
 
 
 
 
183
  """
184
  ```
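The example above wraps each question in a `<s>[INST] ... [/INST]` template and, for the second turn, prepends the decoded text of the first turn. A minimal sketch of that prompt construction follows. The helper names `format_turn` and `build_prompt` are hypothetical and not part of this repository, and a placeholder string stands in for the real model output so the snippet runs without loading the checkpoint.

```python
def format_turn(question: str) -> str:
    """Wrap one user question in the `<s>[INST] ... [/INST]` template."""
    return f"<s>[INST] {question} [/INST]"


def build_prompt(history: str, question: str) -> str:
    """Append a new turn to the decoded text of all previous turns."""
    return history + format_turn(question)


# First turn: no history yet.
first_prompt = build_prompt("", "什么是重骑兵?")

# Placeholder for tokenizer.decode(chat_outputs[0]); the real decoded text
# contains the prompt followed by the model's reply and a closing </s>.
first_answer = first_prompt + " (model reply)</s>"

# Second turn: the whole first exchange becomes the history.
second_prompt = build_prompt(first_answer, "重骑兵对中世纪的战场有哪些影响?")
print(second_prompt)
```

The resulting strings would be passed to `tokenizer.encode(..., add_special_tokens=False, return_tensors="pt")` as in the example above, since the template already contains the `<s>` token.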
185
 
 
191
  - Number of Samples seen during Finetuning: 1023K
192
  - Batch size: 128
193
  - Hardware: NVIDIA A100 80GB PCIe
194
+ - Software: [BMTrain](https://github.com/OpenBMB/BMTrain)
195
 
196
  ### Data Sources
197
 
 
201
  - [UltraChat](https://huggingface.co/datasets/stingning/ultrachat)
202
  - [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K)
203
  - [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K)
204
+ - [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA)
205
+ - [ShareGPT](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset/)
206
 
207
+ We randomly select 10k samples from the UltraChat dataset for training. ShareGPT is filtered to keep only English samples whose length is greater than 4k. The other datasets are used as auxiliary training data.
208
  All the datasets are integrated into the UltraLink dataset.
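The UltraChat subsampling and the ShareGPT filter described here can be sketched on toy data as follows. This is an illustration under stated assumptions, not the authors' preprocessing script: the `lang` and `text` field names are hypothetical, and length is counted in characters because the card does not say whether "4k" refers to characters or tokens.

```python
import random


def sample_ultrachat(samples, k=10_000, seed=0):
    """Randomly pick k samples (or all of them if fewer than k exist)."""
    rng = random.Random(seed)
    return rng.sample(samples, min(k, len(samples)))


def filter_sharegpt(samples, min_length=4_000):
    """Keep English samples whose text is longer than min_length.

    Assumption: each sample is a dict with hypothetical "lang" and "text"
    keys, and length is measured in characters.
    """
    return [s for s in samples if s["lang"] == "en" and len(s["text"]) > min_length]


# Toy stand-ins for the real datasets.
toy_ultrachat = [{"text": f"dialogue {i}"} for i in range(50)]
toy_sharegpt = [
    {"lang": "en", "text": "a" * 5_000},   # kept: English and long enough
    {"lang": "en", "text": "short"},       # dropped: too short
    {"lang": "zh", "text": "长" * 5_000},  # dropped: not English
]

picked = sample_ultrachat(toy_ultrachat, k=10)
kept = filter_sharegpt(toy_sharegpt)
print(len(picked), len(kept))
```

Fixing the random seed makes the subsample reproducible; whether the original selection used a fixed seed is not stated.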
209
 
210
  ## Evaluation
211
 
212
+ We report three evaluations in this section: multilingual HumanEval, MGSM, and OMGEval.
213
+ Evaluations of modern LLMs may be biased and affected by many factors, so we are also actively working on more comprehensive evaluation methods.
214
+
215
  ### Multilingual HumanEval
216
 
217
  [HumanEval](https://github.com/openai/human-eval) is a well-known benchmark for evaluating the code ability of LLMs. It executes the code snippets generated by the model and evaluates their correctness. Since there is no existing multilingual test set for code generation, we use GPT-3.5 with carefully designed prompts to translate HumanEval into other languages.
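To illustrate what execution-based scoring means, here is a toy sketch that runs a generated function against unit tests and counts it as correct only if no assertion fails. It is not the official HumanEval harness: the `passes_tests` helper and the sample problems are hypothetical, and a real evaluation should isolate untrusted code and enforce timeouts.

```python
def passes_tests(candidate_source: str, test_source: str) -> bool:
    """Run generated code plus its unit tests; True if nothing raises.

    Warning: exec() runs arbitrary code. This toy version has no sandbox
    and no timeout, so use it only on code you trust.
    """
    namespace = {}
    try:
        exec(candidate_source, namespace)  # define the generated function
        exec(test_source, namespace)       # run assertions against it
    except Exception:
        return False
    return True


# Two hypothetical model completions for the same problem.
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"

results = [passes_tests(good, tests), passes_tests(bad, tests)]
pass_rate = 100 * sum(results) / len(results)
print(results, pass_rate)
```

The percentage of problems whose completion passes all tests is the kind of score reported in the table below.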
 
226
  |Okapi-7b | 12.2 | 11.0 | 8.5 | 8.5 | 8.5 | 9.8 |
227
  |Guanaco-7b | 9.2 | 6.7 | 11.0 | 9.8 | 12.8 | 9.9 |
228
  |Guanaco-13b| 18.3 | 15.9 | 9.8 | 8.5 | 14.6 | 12.2 |
229
+ |UltraLink-LM | __60.4__ | __43.9__ | __40.9__ | __49.4__ | __39.6__ | __46.8__|
230
 
231
 
232
  ### MGSM
 
242
  |Okapi-7b | 4.0 | 2.4 | 3.6 | 4.4 | 4.8 | 3.8 |
243
  |Guanaco-7b | 4.0 | 1.6 | 3.2 | 2.8 | 4.4 | 3.0 |
244
  |Guanaco-13b | 13.6 | 10.8 | 11.2 | 6.4 | 5.2 | 8.4 |
245
+ |UltraLink-LM| __70.4__ | __56.0__ | __70.4__ | __64.8__ | __63.6__ | __63.7__ |
246
 
247
  ### OMGEval
248
  We use [OMGEval](https://github.com/blcuicall/OMGEval), a multilingual version of the widely used English benchmark AlpacaEval, to evaluate chat ability.
 
256
  |Chimera-inst-chat-13b | 15.5 | 9.7 | 11.8 | 13.7 | 13.8 | 12.9 |
257
  |Okapi-7b | 8.8 | 6.2 | 5.0 | 12.1 | 8.7 | 8.2 |
258
  |Guanaco-7b | 4.6 | 3.8 | 0.4 | 1.8 | 1.2 | 2.4 |
259
+ |Guanaco-13b | __29.0__ | 8.6 | 16.9 | 15.4 | 17.3 | 17.5 |
260
+ |UltraLink-LM | 28.8 | __21.9__ | __23.5__ | __37.6__ | __29.0__ | __28.2__ |
261
 
262
  ## Citation
263
 
264
+ Feel free to cite the repo if you find UltraLink useful.
265
+
266
  ```bibtex
267
  @misc{wang2024ultralink,
268
  title={UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset},
infer.py CHANGED
@@ -6,109 +6,118 @@ tokenizer = AutoTokenizer.from_pretrained(checkpoint)
6
  ultralink_lm = AutoModelForCausalLM.from_pretrained(checkpoint)
7
 
8
  # Chat abilities in Chinese
9
- # Please tell us about Tang San Cai(An ancient Chinese pottery type).
10
- chat_inputs = tokenizer.encode("请介绍一下唐三彩。", return_tensors="pt")
 
11
  chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
12
- print(tokenizer.decode(chat_outputs[0]))
 
13
  # Expected output:
14
  """
15
- 唐三彩是中国古代的一种绘画艺术形式,起源于唐朝(618-907)。它以其精美的色彩和细致的绘画技巧而闻名。唐三彩是一种陶瓷,通常以圆形或扁平的形状制成,并刻有图案或图像。这些图案通常是以彩色的方式呈现,使用釉质陶瓷上的颜料。唐三彩的主题各异,包括动物、植物、人物和日常生活场景。它是中国艺朩的重要贡献,仍然是中国艺术的重要组成部分。
16
  """
17
  # Translations in English:
18
  """
19
- Tang Sancai is an ancient Chinese art form of painting that originated in the Tang Dynasty (618-907). It is known for its exquisite colours and detailed painting techniques. Tang Sancai is a type of ceramic, usually made in round or flat shapes and inscribed with designs or images. These designs are usually rendered in colour, using pigments on glazed ceramics. Tang sancai has a variety of themes, including animals, plants, figures and scenes from daily life. It was an important contribution to Chinese art kwun and remains an important part of Chinese art.
 
 
 
 
 
 
 
 
 
 
 
 
20
  """
21
 
22
  # Code abilities in Russian
23
  # Please implement a bubble sort algorithm in Python.
24
- code_inputs = tokenizer.encode("Реализуйте алгоритм пузырьковой сортировки на Python.", return_tensors="pt")
25
  code_outputs = ultralink_lm.generate(code_inputs, max_new_tokens=512)
26
  print(tokenizer.decode(code_outputs[0]))
27
  # Expected output:
28
  """
29
- Вот простая реализация алгоритма пузырьковой сортировки на Python:
30
-
31
- ```python
32
- def bubbleSort(arr):
33
- n = len(arr)
34
-
 
35
  for i in range(n):
 
36
  for j in range(0, n-i-1):
 
37
  if arr[j] > arr[j+1]:
 
38
  arr[j], arr[j+1] = arr[j+1], arr[j]
39
 
 
40
  arr = [64, 34, 25, 12, 22, 11, 90]
41
  bubbleSort(arr)
42
-
43
  print("Отсортированный массив:", arr)
44
  ```
45
 
46
- В этом коде функция `bubbleSort` сортирует массив `arr` с помощью алгоритма пузырьковой сортировки. Внутри функции происходит итерация по массиву, сравнение соседних элементов и обмен их местами, если они находятся в неправильном порядке. Этот процесс повторяется до тех пор, пока массив не будет отсортирован.
47
 
48
- После вызова функции `bubbleSort` отсортированный массив выводится на экран.
49
 
50
- Обратите внимание, что алгоритм пузырьковой сортировки не является эффективным для больших наборов данных, так как его средняя и худшая временная сложность составляют O(n^2), где n - количество элементов в массиве.
51
  """
52
  # Translations in English:
53
  """
54
- Here is a simple implementation of the bubble sort algorithm in Python:
55
-
56
- ```python
57
- def bubbleSort(arr):
58
- n = len(arr)
59
-
 
60
  for i in range(n):
 
61
  for j in range(0, n-i-1):
 
62
  if arr[j] > arr[j+1]:
 
63
  arr[j], arr[j+1] = arr[j+1], arr[j]
64
 
 
65
  arr = [64, 34, 25, 12, 22, 11, 90]
66
  bubbleSort(arr)
67
-
68
  print("Sorted array:", arr)
69
  ```
70
 
71
- In this code, the `bubbleSort` function sorts the `arr` array using the bubble sort algorithm. Inside the function iterates through the array, comparing neighbouring elements and swapping them if they are in the wrong order. This process is repeated until the array is sorted.
72
 
73
- After calling the `bubbleSort` function, the sorted array is displayed on the screen.
74
 
75
- Note that the bubble sort algorithm is not efficient for large datasets because its average and worst-case time complexity are O(n^2), where n is the number of elements in the array.
76
  """
77
 
78
  # Math abilities in French
79
  # When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units?
80
- math_inputs = tokenizer.encode("Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités?", return_tensors="pt")
81
  math_outputs = ultralink_lm.generate(math_inputs, max_new_tokens=512)
82
  print(tokenizer.decode(math_outputs[0]))
83
  # Expected output:
84
  """
85
- La longueur du rectangle est deux fois sa largeur, donc la longueur est de 2w et la largeur est de w.
86
-
87
- Le périmètre d'un rectangle est deux fois la somme de sa longueur et de sa largeur, donc l'équation est 2(2w + w) = 18.
88
-
89
- En simplifiant l'équation, nous obtenons 2(3w) = 18, ce qui se simplifie en 6w = 18.
90
-
91
- En divisant les deux côtés par 6, nous obtenons w = 3.
92
-
93
- Par conséquent, la longueur du rectangle est de 2w = 2(3) = 6.
94
-
95
- L'aire d'un rectangle est le produit de sa longueur et de sa largeur, donc l'aire est de 6 * 3 = 18.
96
-
97
- La réponse est : 18
98
  """
99
  # Translations in English:
100
  """
101
- The length of the rectangle is twice its width, so the length is 2w and the width is w.
102
-
103
- The perimeter of a rectangle is twice the sum of its length and width, so the equation is 2(2w + w) = 18.
104
-
105
- Simplifying the equation, we get 2(3w) = 18, which simplifies to 6w = 18.
106
-
107
- Dividing the two sides by 6 gives w = 3.
108
-
109
- So the length of the rectangle is 2w = 2(3) = 6.
110
-
111
- The area of a rectangle is the product of its length and width, so the area is 6 * 3 = 18.
112
-
113
- The answer is: 18
114
  """
 
6
  ultralink_lm = AutoModelForCausalLM.from_pretrained(checkpoint)
7
 
8
  # Chat abilities in Chinese
9
+ # What is heavy cavalry?
10
+ first_question = "<s>[INST] 什么是重骑兵? [/INST]"
11
+ chat_inputs = tokenizer.encode(first_question, add_special_tokens=False, return_tensors="pt")
12
  chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
13
+ first_answer = tokenizer.decode(chat_outputs[0])
14
+ print(first_answer)
15
  # Expected output:
16
  """
17
+ <s> [INST] 什么是重骑兵? [/INST] 重骑兵是一种历史上的战斗单位,通常由骑兵组成,他们在战斗中使用重型装甲和长矛。他们以在战场上的强大攻击能力而闻名,并且通常被用于突破敌军阵线或攻击敌方骑兵。重骑兵通常被认为是中世纪战争中最强大和最具威慑力的单位之一。</s>
18
  """
19
  # Translations in English:
20
  """
21
+ <s> [INST] What is heavy cavalry? [/INST] The Heavy Cavalry is a historical fighting unit, usually consisting of cavalrymen who use heavy armour and spears in battle. They were known for their powerful attacks on the battlefield and were often used to break through enemy lines or attack enemy cavalry. Heavy cavalry is often considered one of the most powerful and intimidating units in medieval warfare.</s>
22
+ """
23
+
24
+ # Second turn:
25
+ second_question = "<s>[INST] What impact did heavy cavalry have on the medieval battlefield? [/INST]"
26
+ second_question = first_answer + second_question
27
+ chat_inputs = tokenizer.encode(second_question, add_special_tokens=False, return_tensors="pt")
28
+ chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
29
+ second_answer = tokenizer.decode(chat_outputs[0])
30
+ print(second_answer)
31
+ # Expected output:
32
+ """
33
+ <s> [INST] What is heavy cavalry? [/INST] The Heavy Cavalry is a historical fighting unit, usually consisting of cavalrymen who use heavy armour and spears in battle. They were known for their powerful attacks on the battlefield and were often used to break through enemy lines or attack enemy cavalry. Heavy cavalry is often considered one of the most powerful and intimidating units in medieval warfare.</s><s>[INST] What impact did heavy cavalry have on the medieval battlefield? [/INST] Heavy cavalry played an important role on the battlefields of the Middle Ages. Their powerful attacks made them a formidable force in battle and their presence could often change the outcome of a battle. Heavy cavalry were often used to break through enemy lines and create breaches in battle, which allowed other troops to move into enemy positions. They were also used to attack enemy cavalry, and their heavy armour and spears gave them an advantage in such situations. Overall, the presence of heavy cavalry made the medieval battlefield more complex and unpredictable, and their presence had a major impact on the outcome of battles.</s>
  """

  # Code abilities in Russian
  # Prompt in English: Implement the bubble sort algorithm in Python.
+ code_inputs = tokenizer.encode("<s>[INST] Реализуйте алгоритм пузырьковой сортировки на Python. [/INST]", add_special_tokens=False, return_tensors="pt")
  code_outputs = ultralink_lm.generate(code_inputs, max_new_tokens=512)
  print(tokenizer.decode(code_outputs[0]))
  # Expected output:
  """
+ <s> [INST] Реализуйте алгоритм пузырьковой сортировки на Python. [/INST] Конечно, вот простая реализация алгоритма пузырьковой сортировки на Python:
+
+ ```python
+ def bubbleSort(arr):
+     n = len(arr)
+
+     # Проходимся по всем элементам массива
      for i in range(n):
+         # Последние i элементов уже отсортированы
          for j in range(0, n-i-1):
+             # Проходимся по массиву, пока не найдем элемент, больший, чем следующий
              if arr[j] > arr[j+1]:
+                 # Обмениваем элементы местами
                  arr[j], arr[j+1] = arr[j+1], arr[j]

+ # Проверяем функцию
  arr = [64, 34, 25, 12, 22, 11, 90]
  bubbleSort(arr)
  print("Отсортированный массив:", arr)
  ```

+ Эта реализация работает путем повторного прохождения по массиву и сравнения соседних элементов. Если элемент сравнивается больше, чем следующий элемент, они меняются местами. Этот процесс повторяется до тех пор, пока массив не будет отсортирован.
+ Вы можете изменить размер массива и его начальное состояние, чтобы увидеть, как функция работает с разными входными данными.
+ Примечание: Алгоритм пузырьковой сортировки не является эффективным для больших наборов данных, так как его средняя и худшая временная сложность составляют O(n^2), где n - количество элементов в массиве.</s>
  """
  # English translation:
  """
+ <s> [INST] Implement the bubble sort algorithm in Python. [/INST] Sure, here's a simple implementation of the bubble sort algorithm in Python:
+
+ ```python
+ def bubbleSort(arr):
+     n = len(arr)
+
+     # Go through all elements of the array
      for i in range(n):
+         # The last i elements are already sorted
          for j in range(0, n-i-1):
+             # Traverse the array until we find an element greater than the next one
              if arr[j] > arr[j+1]:
+                 # Swap the elements
                  arr[j], arr[j+1] = arr[j+1], arr[j]

+ # Check the function
  arr = [64, 34, 25, 12, 22, 11, 90]
  bubbleSort(arr)
  print("Sorted array:", arr)
  ```

+ This implementation works by repeatedly traversing the array and comparing neighbouring elements. If an element is greater than the next element, they are swapped. This process is repeated until the array is sorted.

+ You can change the array size and initial state to see how the function works with different input data.
+ Note: The bubble sort algorithm is not efficient for large datasets because its average and worst-case time complexity are O(n^2), where n is the number of elements in the array.</s>
  """

  # Math abilities in French
  # When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units?
+ math_inputs = tokenizer.encode("<s>[INST] Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités? [/INST]", add_special_tokens=False, return_tensors="pt")
  math_outputs = ultralink_lm.generate(math_inputs, max_new_tokens=512)
  print(tokenizer.decode(math_outputs[0]))
  # Expected output:
  """
+ <s> [INST] Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités? [/INST]
+ Soit la largeur du rectangle $w$. Alors la longueur du rectangle est $2w$.
+ Le périmètre du rectangle est $2(w+2w)=18$.
+ En simplifiant, nous avons $6w=18$, donc $w=3$.
+ L'aire du rectangle est $w \cdot (2w) = 3 \cdot 6 = \boxed{18}$ unités carrées.
+ La réponse est : 18</s>
  """
  # English translation:
  """
+ <s> [INST] When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units? [/INST]
+ Let $w$ be the width of the rectangle. Then the length of the rectangle is $2w$.
+ The perimeter of the rectangle is $2(w+2w)=18$.
+ Simplifying, we have $6w=18$, so $w=3$.
+ The area of the rectangle is $w \cdot (2w) = 3 \cdot 6 = \boxed{18}$ square units.
+ The answer is: 18</s>
  """
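
The multi-turn example above builds its second prompt by concatenating the decoded first turn with a new `<s>[INST] ... [/INST]` block. A minimal sketch of that pattern as a reusable helper follows. The function name `build_chat_prompt` and the `history` structure are hypothetical and not part of `infer.py`, and the exact spacing around the special tokens is an assumption based on the strings shown in the examples.

```python
from typing import List, Tuple


def build_chat_prompt(history: List[Tuple[str, str]], question: str) -> str:
    """Join finished (question, answer) turns and a new question.

    Hypothetical helper: mirrors the "<s>[INST] ... [/INST] answer</s>"
    layout seen in the expected outputs; it is not an official API.
    """
    prompt = ""
    for user_msg, model_reply in history:
        # Each completed turn is wrapped in <s> ... </s>.
        prompt += f"<s>[INST] {user_msg} [/INST] {model_reply}</s>"
    # The new question is left open so the model generates the reply.
    prompt += f"<s>[INST] {question} [/INST]"
    return prompt


history = [("What is heavy cavalry?", "Heavy cavalry is a heavily armoured mounted unit.")]
prompt = build_chat_prompt(
    history, "What impact did heavy cavalry have on the medieval battlefield?"
)
print(prompt)
```

The resulting string could then be passed to `tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")` as in the examples, since the special tokens are already written into the text.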