# XLM-Align

**XLM-Align** (ACL 2021, [paper](https://aclanthology.org/2021.acl-long.265/), [repo](https://github.com/CZWin32768/XLM-Align), [model](https://huggingface.co/microsoft/xlm-align-base)): Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment

XLM-Align is a pretrained cross-lingual language model that supports 94 languages. See our [paper](https://aclanthology.org/2021.acl-long.265/) for details.

## Example

```python
from transformers import AutoModel

# Load the pretrained XLM-Align base checkpoint from the Hugging Face Hub.
model = AutoModel.from_pretrained("microsoft/xlm-align-base")
```

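## Evaluation Results

Results on the XTREME cross-lingual understanding tasks. The question-answering columns (XQuAD, MLQA, TyDiQA) list two scores per cell, F1 / EM in the XTREME convention. Bold marks the better score in each column.
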
| Model | POS | NER | XQuAD | MLQA | TyDiQA | XNLI | PAWS-X | Avg |
|:----:|:----:|:----:|:----:|:-----:|:----:|:-----:|:----:|:----:|
| XLM-R_base | 75.6 | 61.8 | 71.9 / 56.4 | 65.1 / 47.2 | 55.4 / 38.3 | 75.0 | 84.9 | 66.4 |
| XLM-Align | **76.0** | **63.7** | **74.7 / 59.0** | **68.1 / 49.8** | **62.1 / 44.8** | **76.2** | **86.8** | **68.9** |
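
This README does not say how the Avg column is computed. The sketch below shows one reading that reproduces the reported numbers, under two assumptions: each question-answering cell is an F1 / EM pair that is first averaged into a single task score, and Avg is the unweighted mean over the seven tasks. The `SCORES` layout and the `xtreme_average` helper are illustrative names, not part of the XLM-Align release.

```python
from statistics import mean

# Scores copied from the table above. QA tasks hold a pair that is
# assumed (not confirmed by this README) to be (F1, EM).
SCORES = {
    "XLM-R_base": {
        "POS": 75.6, "NER": 61.8,
        "XQuAD": (71.9, 56.4), "MLQA": (65.1, 47.2), "TyDiQA": (55.4, 38.3),
        "XNLI": 75.0, "PAWS-X": 84.9,
    },
    "XLM-Align": {
        "POS": 76.0, "NER": 63.7,
        "XQuAD": (74.7, 59.0), "MLQA": (68.1, 49.8), "TyDiQA": (62.1, 44.8),
        "XNLI": 76.2, "PAWS-X": 86.8,
    },
}


def xtreme_average(task_scores):
    """Average paired scores within a task, then average across tasks."""
    per_task = [
        mean(score) if isinstance(score, tuple) else score
        for score in task_scores.values()
    ]
    return mean(per_task)


for model_name, task_scores in SCORES.items():
    print(f"{model_name}: {xtreme_average(task_scores):.2f}")
```

Under that assumption the sketch gives roughly 66.35 for XLM-R_base and 68.85 for XLM-Align, which is consistent with the reported 66.4 and 68.9 after rounding to one decimal.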

## MD5

```
b9d214025837250ede2f69c9385f812c config.json
6005db708eb4bab5b85fa3976b9db85b pytorch_model.bin
bf25eb5120ad92ef5c7d8596b5dc4046 sentencepiece.bpe.model
eedbd60a7268b9fc45981b849664f747 tokenizer.json
```
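
The checksums above can be compared against a local copy of the files. The following is a minimal sketch using only the Python standard library. It assumes the four files sit together in one local directory; `md5_of_file`, `verify_directory`, and any directory path passed in are hypothetical names chosen for illustration, not tooling shipped with the model.

```python
import hashlib
from pathlib import Path

# Checksums copied from the MD5 list above.
EXPECTED_MD5 = {
    "config.json": "b9d214025837250ede2f69c9385f812c",
    "pytorch_model.bin": "6005db708eb4bab5b85fa3976b9db85b",
    "sentencepiece.bpe.model": "bf25eb5120ad92ef5c7d8596b5dc4046",
    "tokenizer.json": "eedbd60a7268b9fc45981b849664f747",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large weight files are not read into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_directory(model_dir, expected=EXPECTED_MD5):
    """Return {filename: status}, where status is 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, want in expected.items():
        path = Path(model_dir) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == want:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Calling `verify_directory` on the download directory returns one status per file. A `mismatch` may also mean the hosted file has been updated since these checksums were published, so it is a prompt to investigate rather than proof of corruption.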

## About

Contact: chizewen\@outlook.com

BibTeX:

```
@inproceedings{xlmalign,
    title = "Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment",
    author = {Zewen Chi and Li Dong and Bo Zheng and Shaohan Huang and Xian-Ling Mao and Heyan Huang and Furu Wei},
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-long.265",
    doi = "10.18653/v1/2021.acl-long.265",
    pages = "3418--3430",
}
```