remove files
- architectures/opt.txt +0 -5
- datasets/opt.txt +0 -2
architectures/opt.txt
DELETED
@@ -1,5 +0,0 @@
-[OPT](https://huggingface.co/facebook/opt-30b) is a decoder-only model like GPT-3. It was trained on 5 datasets, one of which contains a small portion of code. In this demo we use the 30B-parameter model; the largest model has 175B parameters.
-
-| Model | # parameters |
-| - | - |
-| Decoder | 30B |
datasets/opt.txt
DELETED
@@ -1,2 +0,0 @@
-[OPT](https://huggingface.co/facebook/opt-30b) was trained on 5 filtered datasets of textual documents. One of them, [The Pile](https://arxiv.org/pdf/2101.00027v1.pdf), includes code; from it the subsets *Pile-CC, OpenWebText2, USPTO, Project Gutenberg, OpenSubtitles, Wikipedia, DM Mathematics and HackerNews* were used.
-The final training data contains 180B tokens, corresponding to 800GB of data. For more details please refer to this [paper](https://arxiv.org/abs/2205.01068).