sgugger Marissa committed on
Commit 509c94b
1 Parent(s): 081c89f

Add model card (#1)


- Add model card (b97387cd544e002d21bb43d04862d9006ce96d1c)


Co-authored-by: Marissa Gerchick <[email protected]>

Files changed (1)
  1. README.md +111 -0
README.md CHANGED
@@ -1,10 +1,121 @@
---
language: en
tags:
- exbert

license: cc-by-nc-4.0
---

# xlm-mlm-en-2048

# Table of Contents

1. [Model Details](#model-details)
2. [Uses](#uses)
3. [Bias, Risks, and Limitations](#bias-risks-and-limitations)
4. [Training](#training)
5. [Evaluation](#evaluation)
6. [Environmental Impact](#environmental-impact)
7. [Citation](#citation)
8. [Model Card Authors](#model-card-authors)
9. [How to Get Started with the Model](#how-to-get-started-with-the-model)


# Model Details

The XLM model was proposed in [Cross-lingual Language Model Pretraining](https://arxiv.org/abs/1901.07291) by Guillaume Lample and Alexis Conneau. It’s a transformer pretrained with either a causal language modeling (CLM) objective (next-token prediction), a masked language modeling (MLM) objective (BERT-like), or a Translation Language Modeling (TLM) objective (an extension of BERT’s MLM to multiple-language inputs). This model is trained with a masked language modeling objective on English text.
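
The checkpoint name reflects this setup: `xlm` for the architecture, `mlm` for the masked language modeling objective, `en` for English, and `2048` for what appears to be the embedding width. As a minimal sketch (not part of the original card), the configuration can be inspected without downloading the weights; the expected values noted in the comments are assumptions:

```python
from transformers import AutoConfig

# Fetch only the checkpoint's config.json (no model weights).
config = AutoConfig.from_pretrained("xlm-mlm-en-2048")

print(config.emb_dim)   # embedding/hidden width; assumed to be 2048 for this checkpoint
print(config.n_layers)  # number of transformer layers
print(config.n_heads)   # number of attention heads
```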

## Model Description

- **Developed by:** Researchers affiliated with Facebook AI; see the [associated paper](https://arxiv.org/abs/1901.07291) and [GitHub Repo](https://github.com/facebookresearch/XLM)
- **Model type:** Language model
- **Language(s) (NLP):** English
- **License:** CC-BY-NC-4.0
- **Related Models:** Other [XLM models](https://huggingface.co/models?sort=downloads&search=xlm)
- **Resources for more information:**
  - [Cross-lingual Language Model Pretraining](https://arxiv.org/abs/1901.07291) by Guillaume Lample and Alexis Conneau (2019)
  - [Unsupervised Cross-lingual Representation Learning at Scale](https://arxiv.org/pdf/1911.02116.pdf) by Conneau et al. (2020)
  - [GitHub Repo](https://github.com/facebookresearch/XLM)
  - [Hugging Face XLM docs](https://huggingface.co/docs/transformers/model_doc/xlm)

# Uses

## Direct Use

The model is a masked language model and can be used directly for masked language modeling, i.e. predicting masked tokens in a sequence, as in the sketch below.

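As a quick illustration, here is a minimal sketch (not from the original card) using the `fill-mask` pipeline. XLM’s mask token differs from BERT’s `[MASK]`, so the sketch reads it from the tokenizer rather than hard-coding it; the example sentence is arbitrary.

```python
from transformers import pipeline

# Build a fill-mask pipeline around this checkpoint.
unmasker = pipeline("fill-mask", model="xlm-mlm-en-2048")

# XLM uses its own mask token, so take it from the tokenizer.
mask = unmasker.tokenizer.mask_token

# Print the top-scoring completions for the masked position.
for prediction in unmasker(f"Hello, my dog is {mask}."):
    print(prediction["token_str"], round(prediction["score"], 4))
```
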
## Downstream Use

To learn more about this task and potential downstream uses, see the Hugging Face [fill mask docs](https://huggingface.co/tasks/fill-mask) and the [Hugging Face Multilingual Models for Inference](https://huggingface.co/docs/transformers/v4.20.1/en/multilingual#xlm-with-language-embeddings) docs. Also see the [associated paper](https://arxiv.org/abs/1901.07291).

## Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people.

# Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)).

## Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.

# Training

More information needed. See the [associated GitHub Repo](https://github.com/facebookresearch/XLM).

# Evaluation

More information needed. See the [associated GitHub Repo](https://github.com/facebookresearch/XLM).

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** More information needed
- **Hours used:** More information needed
- **Cloud Provider:** More information needed
- **Compute Region:** More information needed
- **Carbon Emitted:** More information needed

# Citation

**BibTeX:**

```bibtex
@article{lample2019cross,
  title={Cross-lingual language model pretraining},
  author={Lample, Guillaume and Conneau, Alexis},
  journal={arXiv preprint arXiv:1901.07291},
  year={2019}
}
```

**APA:**
- Lample, G., & Conneau, A. (2019). Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291.

# Model Card Authors

This model card was written by the team at Hugging Face.

# How to Get Started with the Model

Use the code below to get started with the model. See the [Hugging Face XLM docs](https://huggingface.co/docs/transformers/model_doc/xlm) for more examples.

```python
from transformers import XLMTokenizer, XLMModel
import torch

# Load the tokenizer and the base (headless) model weights.
tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-en-2048")
model = XLMModel.from_pretrained("xlm-mlm-en-2048")

# Tokenize a sentence and return PyTorch tensors.
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)

# Final-layer hidden states, shape (batch_size, sequence_length, hidden_size).
last_hidden_states = outputs.last_hidden_state
```

<a href="https://huggingface.co/exbert/?model=xlm-mlm-en-2048">
	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
</a>