remove files
- architectures/opt.txt +0 -5
- datasets/opt.txt +0 -2
architectures/opt.txt
DELETED
@@ -1,5 +0,0 @@
-[OPT](https://huggingface.co/facebook/opt-30b) is a decoder-only model like GPT-3. It was trained on 5 datasets, one of which contains a small portion of code. In this demo we use the 30B-parameter model; the largest model has 175B parameters.
-
-| Model | # parameters |
-| - | - |
-| Decoder | 30B |
datasets/opt.txt
DELETED
@@ -1,2 +0,0 @@
-[OPT](https://huggingface.co/facebook/opt-30b) was trained on 5 filtered datasets of textual documents. One of them, [The Pile](https://arxiv.org/pdf/2101.00027v1.pdf), includes code; from it the subsets *Pile-CC, OpenWebText2, USPTO, Project Gutenberg, OpenSubtitles, Wikipedia, DM Mathematics and HackerNews* were used.
-The final training data contains 180B tokens, corresponding to 800GB of data. For more details please refer to this [paper](https://arxiv.org/abs/2205.01068).