metadata
language: es
tags:
- T5
- Seq2Seq
- EncoderDecoder
- Spanish
datasets:
- large_spanish_corpus
widgets:
- text: Érase una vez un
license: mit
Spanish T5 (small) trained on large_spanish_corpus.
This is a Spanish T5 model (small architecture) trained from scratch with Flax on large_spanish_corpus, also known as BETO's corpus.
It was built as part of the Flax/JAX Community Week, organised by Hugging Face, with TPU usage sponsored by Google.
Dataset
The dataset is about 20 GB. 95% of the data was used for training and the remaining 5% for validation.
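A 95/5 split like the one described above can be sketched in plain Python. This card does not say whether the split was random or contiguous, so the shuffling, the seed, and the helper name `split_train_validation` are assumptions made for illustration, not the procedure the team used.

```python
import random


def split_train_validation(examples, validation_fraction=0.05, seed=0):
    """Shuffle examples and split them into (train, validation) lists.

    Hypothetical helper: the random shuffle and the seed are assumptions,
    since the model card only states the 95%/5% proportions.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    shuffled = list(examples)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    # Number of validation examples, rounded to the nearest integer.
    n_validation = round(len(shuffled) * validation_fraction)
    validation = shuffled[:n_validation]
    train = shuffled[n_validation:]
    return train, validation


# Toy stand-in for the corpus: 1000 short text lines.
lines = [f"frase {i}" for i in range(1000)]
train, validation = split_train_validation(lines)
print(len(train), len(validation))
```

For a corpus of this size, a streaming or library-based split would be more practical than holding every line in memory; the sketch only illustrates the proportions.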
Metrics (on evaluation dataset)
- Accuracy: 0.675
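The card does not say how this accuracy was computed. One common reading for a model of this kind is token-level accuracy: the fraction of target positions where the predicted token id equals the label. The sketch below assumes that definition. The function name `token_accuracy` and the ignore value `-100` are illustrative assumptions, not details taken from the training run.

```python
def token_accuracy(predictions, labels, ignore_id=-100):
    """Fraction of non-ignored label positions where prediction == label.

    Assumption: accuracy is measured per token. Positions whose label equals
    ignore_id (for example padding) are left out of the count.
    """
    correct = 0
    total = 0
    for pred_seq, label_seq in zip(predictions, labels):
        for pred, label in zip(pred_seq, label_seq):
            if label == ignore_id:
                continue  # skip ignored positions
            total += 1
            correct += int(pred == label)
    # Return 0.0 when there is nothing to score, to avoid dividing by zero.
    return correct / total if total else 0.0


# Toy token ids: 5 of the 6 scored positions match.
predictions = [[5, 8, 2, 0], [7, 7, 1, 0]]
labels = [[5, 9, 2, -100], [7, 7, 1, -100]]
print(token_accuracy(predictions, labels))
```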
Team members
- Manuel Romero (mrm8488)
- María Grandury (mariagrandury)
Citation
To cite this model, you can use the following BibTeX entry:
@misc{mromero2021spanish-t5-small,
title={Spanish T5 (small) by Manuel Romero},
author={Romero, Manuel},
publisher={Hugging Face},
journal={Hugging Face Hub},
howpublished={\url{https://huggingface.co/flax-community/spanish-t5-small}},
year={2021}
}