kartikmosaicml committed
Commit bad7973
Parent(s): 5dbf3b1
Adding data mix table to the readme

README.md CHANGED
@@ -172,6 +172,22 @@ The model has been modified from a standard transformer in the following ways:
 | vocab size | 50432 |
 | sequence length | 8192 |
 
+## Data Mix
+
+The model was trained on the following data mix:
+
+| Data Source | Number of Tokens in Source | Proportion |
+|-------------|----------------------------|------------|
+| competition_math | 1.6 M | 3.01% |
+| cot_gsm8k | 3.36 M | 6.32% |
+| dialogsum | 0.1 M | 0.19% |
+| dolly_hhrlhf | 5.89 M | 11.07% |
+| duorc | 8.2 M | 15.51% |
+| qasper | 10.97 M | 20.63% |
+| quality | 11.31 M | 21.28% |
+| scrolls/summ_screen_fd | 11.56 M | 21.82% |
+| spider | 0.089 M | 0.16% |
+
 ## PreTraining Data
 
 For more details on the pretraining process, see [MPT-30B](https://huggingface.co/mosaicml/mpt-30b).
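The listed proportions track each source's share of the total token count. A quick sanity check of the table's arithmetic (a hypothetical script, not part of the model card — the commit itself only adds the markdown table):

```python
# Hypothetical sanity check: confirm each listed proportion is approximately
# tokens_in_source / total_tokens, and that the proportions sum to ~100%.
# Token counts are in millions, as in the table.
data_mix = {
    "competition_math": (1.6, 3.01),
    "cot_gsm8k": (3.36, 6.32),
    "dialogsum": (0.1, 0.19),
    "dolly_hhrlhf": (5.89, 11.07),
    "duorc": (8.2, 15.51),
    "qasper": (10.97, 20.63),
    "quality": (11.31, 21.28),
    "scrolls/summ_screen_fd": (11.56, 21.82),
    "spider": (0.089, 0.16),
}

total_tokens = sum(tokens for tokens, _ in data_mix.values())

for source, (tokens, listed_pct) in data_mix.items():
    computed_pct = 100 * tokens / total_tokens
    # Listed values match the computed token shares to within ~0.1 percentage
    # points (small differences likely come from rounding upstream).
    assert abs(computed_pct - listed_pct) < 0.1, source

# The listed proportions themselves sum to essentially 100%.
assert abs(sum(pct for _, pct in data_mix.values()) - 100) < 0.05
```

Running this confirms the mix totals roughly 53 M tokens and that the Proportion column is consistent with the token counts.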