aamixsh committed
Commit 5b7ae6a · 1 Parent(s): adf495e

Auto commit

README.md DELETED
@@ -1,199 +0,0 @@
- ---
- library_name: transformers
- tags: []
- ---
-
- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-
-
-
- ## Model Details
-
- ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
-
- ### Model Sources [optional]
-
- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
-
- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
- ### Out-of-Scope Use
-
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
-
- ## Training Details
-
- ### Training Data
-
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
-
-
- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
-
- ## Evaluation
-
- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]
-
- ### Results
-
- [More Information Needed]
-
- #### Summary
-
-
-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- [More Information Needed]
-
- ### Compute Infrastructure
-
- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
- #### Software
-
- [More Information Needed]
-
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]
-
- ## Model Card Contact
-
- [More Information Needed]
special_tokens_map.json CHANGED
@@ -1,7 +1,7 @@
- {
-   "cls_token": "[CLS]",
-   "mask_token": "[MASK]",
-   "pad_token": "[PAD]",
-   "sep_token": "[SEP]",
-   "unk_token": "[UNK]"
- }
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
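Note: the old and new contents of special_tokens_map.json render identically above, so the change is presumably invisible formatting (whitespace or line endings) rather than a different token mapping.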
tokenizer.json CHANGED
@@ -5,7 +5,7 @@
    "added_tokens": [
      {
        "id": 0,
-       "content": "[CLS]",
+       "content": "~",
        "single_word": false,
        "lstrip": false,
        "rstrip": false,
@@ -14,7 +14,7 @@
      },
      {
        "id": 1,
-       "content": "[SEP]",
+       "content": ">",
        "single_word": false,
        "lstrip": false,
        "rstrip": false,
@@ -23,7 +23,7 @@
      },
      {
        "id": 2,
-       "content": "[PAD]",
+       "content": "[CLS]",
        "single_word": false,
        "lstrip": false,
        "rstrip": false,
@@ -32,7 +32,7 @@
      },
      {
        "id": 3,
-       "content": "[UNK]",
+       "content": "[SEP]",
        "single_word": false,
        "lstrip": false,
        "rstrip": false,
@@ -41,7 +41,7 @@
      },
      {
        "id": 4,
-       "content": "[MASK]",
+       "content": "[PAD]",
        "single_word": false,
        "lstrip": false,
        "rstrip": false,
@@ -50,7 +50,7 @@
      },
      {
        "id": 5,
-       "content": "[ILLEGAL]",
+       "content": "[UNK]",
        "single_word": false,
        "lstrip": false,
        "rstrip": false,
@@ -59,7 +59,16 @@
      },
      {
        "id": 6,
-       "content": "~",
+       "content": "[MASK]",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 7,
+       "content": "[ILLEGAL]",
        "single_word": false,
        "lstrip": false,
        "rstrip": false,
@@ -90,11 +99,11 @@
      "type": "RobertaProcessing",
      "sep": [
        "[SEP]",
-       1
+       3
      ],
      "cls": [
        "[CLS]",
-       0
+       2
      ],
      "trim_offsets": false,
      "add_prefix_space": false
@@ -110,53 +119,47 @@
      "continuing_subword_prefix": "##",
      "max_input_chars_per_word": 100,
      "vocab": {
-       "[CLS]": 0,
-       "[SEP]": 1,
-       "[PAD]": 2,
-       "[UNK]": 3,
-       "[MASK]": 4,
-       "[ILLEGAL]": 5,
-       "~": 6,
-       ">": 7,
+       "~": 0,
+       ">": 1,
+       "[CLS]": 2,
+       "[SEP]": 3,
+       "[PAD]": 4,
+       "[UNK]": 5,
+       "[MASK]": 6,
+       "[ILLEGAL]": 7,
        " ": 8,
-       "#": 9,
-       "+": 10,
-       "-": 11,
-       "/": 12,
-       "0": 13,
-       "1": 14,
-       "2": 15,
-       "3": 16,
-       "4": 17,
-       "5": 18,
-       "6": 19,
-       "7": 20,
-       "8": 21,
-       "9": 22,
-       "=": 23,
-       "B": 24,
-       "K": 25,
-       "N": 26,
-       "O": 27,
-       "P": 28,
-       "Q": 29,
-       "R": 30,
-       "a": 31,
-       "b": 32,
-       "c": 33,
-       "d": 34,
-       "e": 35,
-       "f": 36,
-       "g": 37,
-       "h": 38,
-       "k": 39,
-       "n": 40,
-       "p": 41,
-       "q": 42,
-       "r": 43,
-       "w": 44,
-       "x": 45,
-       "_": 46
+       "-": 9,
+       "/": 10,
+       "0": 11,
+       "1": 12,
+       "2": 13,
+       "3": 14,
+       "4": 15,
+       "5": 16,
+       "6": 17,
+       "7": 18,
+       "8": 19,
+       "9": 20,
+       "B": 21,
+       "K": 22,
+       "N": 23,
+       "P": 24,
+       "Q": 25,
+       "R": 26,
+       "a": 27,
+       "b": 28,
+       "c": 29,
+       "d": 30,
+       "e": 31,
+       "f": 32,
+       "g": 33,
+       "h": 34,
+       "k": 35,
+       "n": 36,
+       "p": 37,
+       "q": 38,
+       "r": 39,
+       "w": 40
      }
    }
  }
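The net effect of the tokenizer.json change is an id remap: `~` and `>` move to ids 0 and 1, the special tokens shift to ids 2-7, and the `RobertaProcessing` post-processor now points at the new `[CLS]`/`[SEP]` ids. A minimal sketch of a local consistency check with the `tokenizers` library (the file path is an assumption, pointing at a local checkout of this commit):

```python
# Hedged sketch: assumes tokenizer.json from this commit is in the working directory.
from tokenizers import Tokenizer

tok = Tokenizer.from_file("tokenizer.json")

# "~" and ">" now occupy ids 0 and 1, shifting every special token up by two.
assert tok.token_to_id("~") == 0
assert tok.token_to_id(">") == 1
assert tok.token_to_id("[CLS]") == 2
assert tok.token_to_id("[SEP]") == 3
assert tok.token_to_id("[ILLEGAL]") == 7

# RobertaProcessing should frame every encoded sequence with the new cls/sep ids.
ids = tok.encode("e4").ids
assert ids[0] == 2 and ids[-1] == 3
```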
tokenizer_config.json CHANGED
@@ -1,7 +1,7 @@
  {
    "added_tokens_decoder": {
      "0": {
-       "content": "[CLS]",
+       "content": "~",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
@@ -9,7 +9,7 @@
        "special": true
      },
      "1": {
-       "content": "[SEP]",
+       "content": ">",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
@@ -17,7 +17,7 @@
        "special": true
      },
      "2": {
-       "content": "[PAD]",
+       "content": "[CLS]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
@@ -25,7 +25,7 @@
        "special": true
      },
      "3": {
-       "content": "[UNK]",
+       "content": "[SEP]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
@@ -33,7 +33,7 @@
        "special": true
      },
      "4": {
-       "content": "[MASK]",
+       "content": "[PAD]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
@@ -41,7 +41,7 @@
        "special": true
      },
      "5": {
-       "content": "[ILLEGAL]",
+       "content": "[UNK]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
@@ -49,7 +49,15 @@
        "special": true
      },
      "6": {
-       "content": "~",
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "7": {
+       "content": "[ILLEGAL]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
unwrapped_tokenizer.json ADDED
@@ -0,0 +1,165 @@
+ {
+   "version": "1.0",
+   "truncation": null,
+   "padding": null,
+   "added_tokens": [
+     {
+       "id": 0,
+       "content": "~",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 1,
+       "content": ">",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 2,
+       "content": "[CLS]",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 3,
+       "content": "[SEP]",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 4,
+       "content": "[PAD]",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 5,
+       "content": "[UNK]",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 6,
+       "content": "[MASK]",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 7,
+       "content": "[ILLEGAL]",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     }
+   ],
+   "normalizer": {
+     "type": "Sequence",
+     "normalizers": [
+       {
+         "type": "NFD"
+       },
+       {
+         "type": "StripAccents"
+       }
+     ]
+   },
+   "pre_tokenizer": {
+     "type": "Split",
+     "pattern": {
+       "String": ""
+     },
+     "behavior": "Isolated",
+     "invert": false
+   },
+   "post_processor": {
+     "type": "RobertaProcessing",
+     "sep": [
+       "[SEP]",
+       3
+     ],
+     "cls": [
+       "[CLS]",
+       2
+     ],
+     "trim_offsets": false,
+     "add_prefix_space": false
+   },
+   "decoder": {
+     "type": "WordPiece",
+     "prefix": "##",
+     "cleanup": true
+   },
+   "model": {
+     "type": "WordPiece",
+     "unk_token": "[UNK]",
+     "continuing_subword_prefix": "##",
+     "max_input_chars_per_word": 100,
+     "vocab": {
+       "~": 0,
+       ">": 1,
+       "[CLS]": 2,
+       "[SEP]": 3,
+       "[PAD]": 4,
+       "[UNK]": 5,
+       "[MASK]": 6,
+       "[ILLEGAL]": 7,
+       " ": 8,
+       "-": 9,
+       "/": 10,
+       "0": 11,
+       "1": 12,
+       "2": 13,
+       "3": 14,
+       "4": 15,
+       "5": 16,
+       "6": 17,
+       "7": 18,
+       "8": 19,
+       "9": 20,
+       "B": 21,
+       "K": 22,
+       "N": 23,
+       "P": 24,
+       "Q": 25,
+       "R": 26,
+       "a": 27,
+       "b": 28,
+       "c": 29,
+       "d": 30,
+       "e": 31,
+       "f": 32,
+       "g": 33,
+       "h": 34,
+       "k": 35,
+       "n": 36,
+       "p": 37,
+       "q": 38,
+       "r": 39,
+       "w": 40
+     }
+   }
+ }
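The newly added unwrapped_tokenizer.json combines a `Split` pre-tokenizer (empty-string pattern with `Isolated` behavior, which emits one split per character) with a character-level WordPiece vocabulary, so every input character becomes its own token. A minimal sketch, assuming the file has been downloaded locally:

```python
# Hedged sketch: assumes unwrapped_tokenizer.json from this commit is on disk.
from tokenizers import Tokenizer

tok = Tokenizer.from_file("unwrapped_tokenizer.json")

enc = tok.encode("Nf3")
print(enc.tokens)
# Expected: ['[CLS]', 'N', 'f', '3', '[SEP]'], one token per character,
# framed by the RobertaProcessing cls/sep pair.
```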