readme: update pretraining details
README.md CHANGED
@@ -26,8 +26,9 @@ model using the very efficient [TEAMS](https://aclanthology.org/2021.findings-ac
 
 As pretraining corpus, 486GB of plain text was extracted from the [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX) corpus.
 
-GERTuraX-2 uses a 64k vocab corpus (cased) and was trained for 1M steps on a v3-32 TPU Pod.
-
+GERTuraX-2 uses a 64k vocab corpus (cased) and was trained for 1M steps with a batch size of 1024 and a sequence length of 512 on a v3-32 TPU Pod.
+
+The pretraining took 5.4 days and the TensorBoard can be found [here](../../tensorboard).
 
 # Evaluation
 
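The corpus extraction mentioned above can be approximated with the `datasets` library. Below is a minimal, illustrative sketch, not the actual GERTuraX-2 preprocessing script: it streams the German CulturaX configuration and writes plain text to a file. The `"de"` config name and the output path are assumptions.

```python
# Minimal sketch: stream the German portion of CulturaX and dump plain text.
# Illustrative only; this is not the script used to build the 486GB
# GERTuraX-2 pretraining corpus. The "de" config name and output path are assumed.
from datasets import load_dataset

# CulturaX is a gated dataset, so authenticating first (e.g. via
# `huggingface-cli login`) may be required.
ds = load_dataset("uonlp/CulturaX", "de", split="train", streaming=True)

with open("culturax_de.txt", "w", encoding="utf-8") as f:
    for example in ds:
        text = example["text"].strip()
        if text:
            f.write(text + "\n\n")
```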