Update README.md
README.md CHANGED
@@ -55,8 +55,8 @@ output = model(**encoded_input)
 
 ## Dataset
 
-The model and tokenizer were trained with this [October 2022 cleaned Common Crawl dataset](https://huggingface.co/datasets/olm/olm-CC-MAIN-2022-40-sampling-ratio-0.15894621295) plus this [October 2022 cleaned Wikipedia dataset](https://huggingface.co/datasets/olm/olm-wikipedia-20221001)
-The tokenized version of these concatenated datasets is [here](https://huggingface.co/datasets/olm/olm-october-2022-tokenized-1024)
+The model and tokenizer were trained with this [October 2022 cleaned Common Crawl dataset](https://huggingface.co/datasets/olm/olm-CC-MAIN-2022-40-sampling-ratio-0.15894621295) plus this [October 2022 cleaned Wikipedia dataset](https://huggingface.co/datasets/olm/olm-wikipedia-20221001).\
+The tokenized version of these concatenated datasets is [here](https://huggingface.co/datasets/olm/olm-october-2022-tokenized-1024).\
 The datasets were created with this [repo](https://github.com/huggingface/olm-datasets).
 
 ## Training
|
@@ -90,6 +90,6 @@ The model achieves the following results without any fine-tuning (zero-shot):
 |arc_easy     |acc/acc_norm|0.4381/0.3948 |**0.4651**/**0.4247** |**0.0082**/**0.0029** |
 |arc_challenge|acc/acc_norm|0.1903/0.2270 |0.1997/0.2329 |0.4132/0.6256 |
 
-To get these results, we used the Eleuther AI evaluation harness [here](https://github.com/EleutherAI/lm-evaluation-harness)
-The harness can produce results a little different than those reported in the GPT2 paper
+To get these results, we used the Eleuther AI evaluation harness [here](https://github.com/EleutherAI/lm-evaluation-harness).\
+The harness can produce results a little different than those reported in the GPT2 paper.\
 The p-values come from the stderr from the evaluation harness, plus a normal distribution assumption.
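The "stderr plus a normal distribution assumption" computation behind the table's p-values can be sketched as follows. The README does not spell out the exact formula, so this is an assumption: the two harness-reported stderrs are combined in quadrature and a two-sided z-test is applied. The function name `p_value_from_stderr` is illustrative, not taken from the repo.

```python
import math

def p_value_from_stderr(acc_a, acc_b, stderr_a, stderr_b):
    """Two-sided p-value for the difference between two accuracies,
    treating each as normally distributed with its harness-reported stderr
    (errors combined in quadrature -- an assumption, not the repo's code)."""
    se = math.sqrt(stderr_a ** 2 + stderr_b ** 2)
    z = (acc_b - acc_a) / se
    # Standard normal CDF via the error function: Phi(x) = (1 + erf(x / sqrt(2))) / 2
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)
```

A zero difference gives p = 1.0, and a gap of 1.96 combined standard errors gives p ≈ 0.05, as expected for a two-sided normal test.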