LangSAMP: Language-Script Aware Multilingual Pretraining Paper • 2409.18199 • Published Sep 26, 2024
OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining Paper • 2311.08849 • Published Nov 15, 2023
TransMI: A Framework to Create Strong Baselines from Multilingual Pretrained Language Models for Transliterated Data Paper • 2405.09913 • Published May 16, 2024
Breaking the Script Barrier in Multilingual Pre-Trained Language Models with Transliteration-Based Post-Training Alignment Paper • 2406.19759 • Published Jun 28, 2024
TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models Paper • 2401.06620 • Published Jan 12, 2024