Malaysian MaskLM
Collection
Trained on 17B tokens (81GB of cleaned text). The models can understand standard Malay, local Malay, local Mandarin, Manglish, and local Tamil.
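As a rough sanity check on those two figures, the sketch below works out the average number of bytes of text per token. It assumes that "81GB" means 81 × 10^9 bytes and that both figures describe the same cleaned corpus; neither assumption is confirmed by the collection page, and the names `CORPUS_BYTES`, `CORPUS_TOKENS`, and `bytes_per_token` are illustrative only.

```python
# Rough bytes-per-token estimate from the corpus figures quoted above.
# Assumptions (not stated on the page): "81GB" = 81 * 10**9 bytes, and the
# 17B-token count refers to the same cleaned corpus.
CORPUS_BYTES = 81 * 10**9
CORPUS_TOKENS = 17 * 10**9


def bytes_per_token(total_bytes: int, total_tokens: int) -> float:
    """Return the average number of bytes of raw text covered by one token."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_bytes / total_tokens


ratio = bytes_per_token(CORPUS_BYTES, CORPUS_TOKENS)
print(f"{ratio:.2f} bytes per token")
```

Under these assumptions the ratio is a little under 5 bytes per token. Treat it as an order-of-magnitude figure only, since the tokenizer and the exact size convention are not specified here.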
7 items
Special thanks to https://github.com/aisyahrzk for pretraining DebertaV2 Base.
Weights & Biases (W&B) project: https://wandb.ai/aisyahrazak/deberta-base?nw=nwuseraisyahrazak