stefan-it committed
Commit 1c1a107 · verified · 1 parent: 20624af

readme: more precise pretraining details

Files changed (1):
  1. README.md +3 -2
README.md CHANGED
@@ -26,8 +26,9 @@ model using the very efficient [TEAMS](https://aclanthology.org/2021.findings-ac
 
 As pretraining corpus, 147GB of plain text was extracted from the [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX) corpus.
 
-GERTuraX-1 uses a 64k vocab corpus (cased) and was trained for 1M steps on a v3-32 TPU Pod. The pretraining took 2.6 days.
-The TensorBoard can be found [here](../../tensorboard).
+GERTuraX-1 uses a 64k vocab corpus (cased) and was trained for 1M steps with a batch size of 256 and a sequence length of 512 on a v3-32 TPU Pod.
+
+The pretraining took 2.6 days and the TensorBoard can be found [here](../../tensorboard).
 
 # Evaluation
 
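
As a rough sanity check on the hyperparameters this commit adds, the total token budget and step throughput follow directly from the stated figures (a quick back-of-the-envelope sketch; the step count, batch size, sequence length, and wall-clock time are taken from the updated README, everything else is plain arithmetic):

steps = 1_000_000
batch_size = 256
seq_len = 512

# Tokens processed over the full pretraining run.
total_tokens = steps * batch_size * seq_len
print(f"Tokens seen during pretraining: ~{total_tokens / 1e9:.0f}B")  # ~131B

# Average step throughput implied by the 2.6-day wall-clock time.
wall_clock_s = 2.6 * 24 * 3600
print(f"Throughput: ~{steps / wall_clock_s:.1f} steps/s")  # ~4.5 steps/s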