---
language: multilingual
widget:
- text: 🤗🤗🤗<mask>
- text: 🔥The goal of life is <mask> . 🔥
- text: Il segreto della vita è l’<mask> . ❤️
- text: Hasta <mask> 👋!
license: mit
---


# Twitter-XLM-Roberta-large
This is an XLM-T large language model specialised for Twitter.
The base model is a multilingual XLM-R that was further trained on over 1 billion tweets in different languages, covering the period up to December 2022.

To evaluate this and other LMs on Twitter-specific data, please refer to the [XLM-T main repository](https://github.com/cardiffnlp/xlm-t). 
A base-size XLM-T model and sample code are available [here](https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base).
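
As a rough illustration of masked-token prediction with this model, the sketch below uses the `fill-mask` pipeline from the `transformers` library. Several things here are assumptions rather than facts from this card: the `MODEL_ID` value is a placeholder Hub identifier that should be replaced with this repository's actual name, the `preprocess` helper (which replaces user mentions and links with generic tokens) is illustrative and may not match the exact preprocessing used in training, and `top_predictions` is a hypothetical helper name.

```python
MASK = "<mask>"  # mask token as it appears in this card's widget examples

# Assumption: placeholder Hub ID; replace with this repository's actual name.
MODEL_ID = "cardiffnlp/twitter-xlm-roberta-large-2022"


def preprocess(text: str) -> str:
    """Replace user mentions with '@user' and links with 'http'.

    Assumption: illustrative normalisation only; see the XLM-T repository
    for the preprocessing actually used with these models.
    """
    tokens = []
    for tok in text.split(" "):
        if tok.startswith("@") and len(tok) > 1:
            tok = "@user"
        elif tok.startswith("http"):
            tok = "http"
        tokens.append(tok)
    return " ".join(tokens)


def top_predictions(text: str, model_id: str = MODEL_ID, k: int = 5):
    """Return (token, score) pairs for the single <mask> in `text`."""
    if text.count(MASK) != 1:
        raise ValueError("expected exactly one <mask> token")
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    results = fill_mask(preprocess(text), top_k=k)
    return [(r["token_str"], r["score"]) for r in results]
```

Calling `top_predictions("Hasta <mask> 👋!")` should return the `k` most likely fillers with their scores. The first call downloads the model weights, which are large for a large-size model.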

Finally, this model is fully compatible with the [TweetNLP library](https://github.com/cardiffnlp/tweetnlp).


### BibTeX entry and citation info

More information is available in the reference papers on [multilingual language models on Twitter](https://aclanthology.org/2022.lrec-1.27/) and [time-specific models](https://aclanthology.org/2022.acl-demo.25/).
Please cite the relevant reference papers if you use this model.

```bibtex
@inproceedings{barbieri-etal-2022-xlm,
    title = "{XLM}-{T}: Multilingual Language Models in {T}witter for Sentiment Analysis and Beyond",
    author = "Barbieri, Francesco  and
      Espinosa Anke, Luis  and
      Camacho-Collados, Jose",
    booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.lrec-1.27",
    pages = "258--266"
}

@inproceedings{loureiro-etal-2022-timelms,
    title = "{T}ime{LM}s: Diachronic Language Models from {T}witter",
    author = "Loureiro, Daniel  and
      Barbieri, Francesco  and
      Neves, Leonardo  and
      Espinosa Anke, Luis  and
      Camacho-Collados, Jose",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-demo.25",
    doi = "10.18653/v1/2022.acl-demo.25",
    pages = "251--260"
}
```