jtatman committed (verified)
Commit 555213d · 1 Parent(s): 6799734

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,362 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:2402
+ - loss:TripletLoss
+ base_model: sentence-transformers/paraphrase-MiniLM-L6-v2
+ widget:
+ - source_sentence: ' Among the following, which is not a predictor of good response
+     to ECT in patients with schizophrenia?'
+   sentences:
+   - (
+   - A. Recent onset B. Shorter duration of illnessn C. Mood incongruent delusionsn D.
+     Presence of affective symptoms
+   - A
+ - source_sentence: ' Who first described autism?'
+   sentences:
+   - A. Kanner B. Asperger C. Chess D. Benhamn E. None of the above
+   - (
+   - A
+ - source_sentence: ' Disorientation to place is seen in'
+   sentences:
+   - A
+   - A
+   - A. Severe anxiety B. Wernickes encephalopathyn C. Korsakoffs psychosis D. Acute
+     manic episoden E. Depression
+ - source_sentence: ' Which of the following most accurately describes the pathologic
+     process in multiple sclerosis?'
+   sentences:
+   - A
+   - A. Inflammatory B. Infectious C. Degenerativen D. Demyelinating E. Metabolic
+   - A
+ - source_sentence: What term would a behaviorist use for an external event or object
+     that elicits a behavior in an organism?
+   sentences:
+   - (
+   - A
+   - (A)Punishment (B)Reward (C)Instinct (D)Responsen (E)Reinforcement
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # SentenceTransformer based on sentence-transformers/paraphrase-MiniLM-L6-v2
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/paraphrase-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/paraphrase-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/paraphrase-MiniLM-L6-v2) <!-- at revision 9a27583f9c2cc7c03a95c08c5f087318109e2613 -->
+ - **Maximum Sequence Length:** 128 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
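+
+ The Pooling module uses mean pooling: token embeddings are averaged, with padding positions zeroed out via the attention mask. A minimal sketch of the equivalent computation (illustrative only, not the module's actual source):
+
+ ```python
+ import torch
+
+ def mean_pooling(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
+     """Average token embeddings over the sequence axis, ignoring padding."""
+     # Broadcast the [batch, seq] mask to [batch, seq, dim] so padded tokens contribute zero
+     mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+     summed = (token_embeddings * mask).sum(dim=1)
+     counts = mask.sum(dim=1).clamp(min=1e-9)  # guard against division by zero
+     return summed / counts
+ ```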
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("jtatman/paraphrase-minilm-l6-psychology")
+ # Run inference
+ sentences = [
+     'What term would a behaviorist use for an external event or object that elicits a behavior in an organism?',
+     '(',
+     '(A)Punishment (B)Reward (C)Instinct (D)Responsen (E)Reinforcement',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 384]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
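+
+ The card lists semantic search among the intended uses. As an illustrative follow-up (continuing from the snippet above, with candidate strings borrowed from the widget examples), candidates can be ranked against a query by similarity:
+
+ ```python
+ # Rank candidate strings against a query (illustrative sketch, not part of the original card)
+ query_embedding = model.encode("Who first described autism?")
+ candidate_embeddings = model.encode([
+     "A. Kanner B. Asperger C. Chess D. Benhamn E. None of the above",
+     "A. Inflammatory B. Infectious C. Degenerativen D. Demyelinating E. Metabolic",
+ ])
+ scores = model.similarity(query_embedding, candidate_embeddings)  # cosine, per the model config
+ print(scores)                  # shape [1, 2]
+ print(scores.argmax().item())  # index of the most similar candidate
+ ```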
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 2,402 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0 | sentence_1 | sentence_2 |
+   |:--------|:-----------|:-----------|:-----------|
+   | type    | string     | string     | string     |
+   | details | <ul><li>min: 5 tokens</li><li>mean: 33.04 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 3.0 tokens</li><li>max: 3 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 32.74 tokens</li><li>max: 128 tokens</li></ul> |
+ * Samples:
+   | sentence_0 | sentence_1 | sentence_2 |
+   |:-----------|:-----------|:-----------|
+   | <code> Which of the following disorders is associated with increased risk of mood disorders and suicide?</code> | <code>A</code> | <code>A. Multiple sclerosis B. Huntingtons disease C. Epilepsyn D. Brain injury E. All of the above</code> |
+   | <code> A 68-year-old man is admitted to an acute psychiatric unit for severe suicidal ideation. He is very much preoccupied with death and refuses to agree to a contract for safety. The diagnostician determines the patient to be severely depressed because of noncompliance with medication and severe social stressors. The patient refuses to take any medication because, he says, Nothing will change, anyway. He also stops eating and drinking and becomes increasingly dehydrated. A reasonable choice of treatment in this patient would be to</code> | <code>A</code> | <code>A. Persuade the patient to take antidepressants B. Wait and watch for the patient to change his mind C. Restrain the patient and administer intravenous fluids D. Prescribe electroconvulsive therapy E. Prescribe intensive psychotherapy 41. A 48-year-old man with treatment-resistant schizophrenia has been relatively stable for the past 6 months on clozapine. On a routine follow-up visitn the patient is observed to be depressed and reports lack of appetite and insomnian among other features of depression. The attending psychiatrist decides to treat the patient with antidepressants. Which of the following antidepressants would mandate particular caution in this patient?n A. Mirtazapine B. Fluoxetine C. Sertralinen D. Citalopram E. Trazodone</code> |
+   | <code>Which of the following is the best definition of a response?</code> | <code>(</code> | <code>(A)A cognitive interpretation or a memory of an eventn (B)An external event or object that elicits a behavior in an organismn (C)A long-term change in behavior caused by past experiencesn (D)External energy or chemicals that are changed into neural impulsesn (E)A physical reaction or behavior elicited by an external event or object</code> |
+ * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters (a training sketch follows the JSON below):
+   ```json
+   {
+       "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
+       "triplet_margin": 5
+   }
+   ```
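+
+ A minimal sketch of how a comparable setup could be reconstructed with the Sentence Transformers trainer API (the inline one-row dataset is a hypothetical stand-in; the column names and loss parameters follow the tables above):
+
+ ```python
+ from datasets import Dataset
+ from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
+ from sentence_transformers.losses import TripletLoss, TripletDistanceMetric
+
+ # Start from the same base model this card was fine-tuned from
+ model = SentenceTransformer("sentence-transformers/paraphrase-MiniLM-L6-v2")
+
+ # Hypothetical stand-in for the 2,402 (anchor, positive, negative) training triplets
+ train_dataset = Dataset.from_dict({
+     "sentence_0": ["Who first described autism?"],
+     "sentence_1": ["A"],
+     "sentence_2": ["A. Kanner B. Asperger C. Chess D. Benhamn E. None of the above"],
+ })
+
+ # Euclidean triplet loss with margin 5, matching the parameters above
+ loss = TripletLoss(model, distance_metric=TripletDistanceMetric.EUCLIDEAN, triplet_margin=5)
+ trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
+ trainer.train()
+ ```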
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 32
+ - `num_train_epochs`: 15
+ - `fp16`: True
+ - `multi_dataset_batch_sampler`: round_robin
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 32
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 15
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+
+ </details>
+
+ ### Training Logs
+ | Epoch   | Step | Training Loss |
+ |:-------:|:----:|:-------------:|
+ | 6.5789  | 500  | 5.4568        |
+ | 13.1579 | 1000 | 0.4965        |
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.4.1
+ - Transformers: 4.49.0
+ - PyTorch: 2.6.0+cu124
+ - Accelerate: 1.4.0
+ - Datasets: 3.3.1
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### TripletLoss
+ ```bibtex
+ @misc{hermans2017defense,
+     title={In Defense of the Triplet Loss for Person Re-Identification},
+     author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
+     year={2017},
+     eprint={1703.07737},
+     archivePrefix={arXiv},
+     primaryClass={cs.CV}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "_name_or_path": "sentence-transformers/paraphrase-MiniLM-L6-v2",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.49.0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.4.1",
+     "transformers": "4.49.0",
+     "pytorch": "2.6.0+cu124"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f4866d4531fc54fd40d113422d4508dde4e26eb3d109ee357c9d96fb10fe2a88
+ size 90864192
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 128,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_max_length": 128,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff