๐Ÿค— Language model initialized from mT5 and trained for an additional 100K steps on the Prefix LM objective using mC4 data.

Paper: Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation

Authors: Tu Vu, Aditya Barua, Brian Lester, Daniel Cer, Mohit Iyyer, Noah Constant

PyTorch port of the original Flax checkpoint at Google/T5X repository.

Downloads last month
419
Safetensors
Model size
3.74B params
Tensor type
F32
ยท
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.

Spaces using DKYoon/mt5-xl-lm-adapt 2