readme: update pretraining details
README.md CHANGED
@@ -26,8 +26,9 @@ model using the very efficient [TEAMS](https://aclanthology.org/2021.findings-ac
 
 As pretraining corpus, 486GB of plain text was extracted from the [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX) corpus.
 
-GERTuraX-2 uses a 64k vocab corpus (cased) and was trained for 1M steps on a v3-32 TPU Pod.
-
+GERTuraX-2 uses a 64k vocab corpus (cased) and was trained for 1M steps with a batch size of 1024 and a sequence length of 512 on a v3-32 TPU Pod.
+
+The pretraining took 5.4 days and the TensorBoard can be found [here](../../tensorboard).
 
 # Evaluation
 
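The corpus extraction mentioned above can be approximated with the `datasets` library. Below is a minimal, illustrative sketch, not the actual GERTuraX-2 preprocessing script: it streams the German CulturaX configuration and writes plain text to a file. The `"de"` config name and the output path are assumptions.

```python
# Minimal sketch: stream the German portion of CulturaX and dump plain text.
# Illustrative only; this is not the script used to build the 486GB
# GERTuraX-2 pretraining corpus. The "de" config name and output path are assumed.
from datasets import load_dataset

# CulturaX is a gated dataset, so authenticating first (e.g. via
# `huggingface-cli login`) may be required.
ds = load_dataset("uonlp/CulturaX", "de", split="train", streaming=True)

with open("culturax_de.txt", "w", encoding="utf-8") as f:
    for example in ds:
        text = example["text"].strip()
        if text:
            f.write(text + "\n\n")
```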