stefan-it committed
Commit 1c1a107 · verified · 1 parent: 20624af

readme: more precise pretraining details

Files changed (1):
  1. README.md +3 -2
README.md CHANGED
@@ -26,8 +26,9 @@ model using the very efficient [TEAMS](https://aclanthology.org/2021.findings-ac
 
 As pretraining corpus, 147GB of plain text was extracted from the [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX) corpus.
 
-GERTuraX-1 uses a 64k vocab corpus (cased) and was trained for 1M steps on a v3-32 TPU Pod. The pretraining took 2.6 days.
-The TensorBoard can be found [here](../../tensorboard).
+GERTuraX-1 uses a 64k vocab corpus (cased) and was trained for 1M steps with a batch size of 256 and a sequence length of 512 on a v3-32 TPU Pod.
+
+The pretraining took 2.6 days and the TensorBoard can be found [here](../../tensorboard).
 
 # Evaluation
 
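
As a rough sanity check on the hyperparameters this commit adds, the total token budget and step throughput follow directly from the stated figures (a quick back-of-the-envelope sketch; the step count, batch size, sequence length, and wall-clock time are taken from the updated README, everything else is plain arithmetic):

steps = 1_000_000
batch_size = 256
seq_len = 512

# Tokens processed over the full pretraining run.
total_tokens = steps * batch_size * seq_len
print(f"Tokens seen during pretraining: ~{total_tokens / 1e9:.0f}B")  # ~131B

# Average step throughput implied by the 2.6-day wall-clock time.
wall_clock_s = 2.6 * 24 * 3600
print(f"Throughput: ~{steps / wall_clock_s:.1f} steps/s")  # ~4.5 steps/s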