iqrakiran committed
Commit 219072f
1 Parent(s): 2bc3018

Add new SentenceTransformer model.
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
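The flags above configure mean pooling (`pooling_mode_mean_tokens: true`): token embeddings are averaged over the non-padding positions to produce a single 768-dimensional sentence vector. A minimal NumPy sketch of masked mean pooling, using toy shapes rather than the library's actual implementation:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over real (non-padding) positions.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    return summed / counts

# Toy example: batch of 2, seq_len 3, dim 4 (the real model produces dim 768)
emb = np.ones((2, 3, 4))
mask = np.array([[1, 1, 0], [1, 1, 1]])
pooled = mean_pool(emb, mask)
print(pooled.shape)  # (2, 4)
```

With `include_prompt: true`, prompt tokens (when a prompt is used) are included in this average as well.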
README.md ADDED
@@ -0,0 +1,440 @@
+ ---
+ base_model: distilbert/distilroberta-base
+ datasets: []
+ language: []
+ library_name: sentence-transformers
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:4093
+ - loss:TripletLoss
+ - loss:ContrastiveLoss
+ - loss:CosineSimilarityLoss
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: I am angry about an incident at the supermarket, and it has been
+     bothering me for the last three or four days. The person involved was flirting
+     with me, and I think they should apologize. I didn't ask for an apology or indicate
+     that I was upset during the conversation because I felt it would be inappropriate.
+     I believe they should call me and set aside an appropriate time to apologize.
+   sentences:
+   - If you are a counsellor, please answer the questions based on the description
+     of the patient.
+   - It seems like this incident really impacted you and you believe that an apology
+     is necessary for closure. It's understandable that you didn't address it on the
+     spot if you felt it was inappropriate. However, people sometimes don't realize
+     they've upset someone unless it's brought to their attention. Since you feel so
+     strongly about this, have you considered reaching out to the person to express
+     your feelings and provide them an opportunity to apologize?
+   - It sounds like you have built a strong support system in San Diego with meaningful
+     relationships and emotional investments. It's natural to feel apprehensive about
+     moving away from that. Maybe you could consider the potential opportunities that
+     LA has to offer and weigh them against what you would be leaving behind in San
+     Diego. Additionally, think about the ways you can stay connected with your loved
+     ones after moving. Just remember that it's a difficult decision, and it's important
+     to give yourself time to make the best choice for you.
+ - source_sentence: Lately, I've been feeling a bit overwhelmed with everything going
+     on in my life. I've been hearing things that aren't really there, like whispers
+     and someone calling my name, but it's not happening all the time, just occasionally.
+     I also find myself getting a bit disorganized, but I wouldn't say I'm completely
+     disorganized. Sometimes, I just feel disconnected with what is happening around
+     me. I've been like this for a couple of weeks now. My parents said that I've been
+     acting differently, but I don't know, maybe I'm just overreacting? Can you help
+     me with this?
+   sentences:
+   - If you are a counsellor, please answer the questions based on the description
+     of the patient.
+   - Thank you for sharing your experience with me. I can understand how hearing things
+     and feeling disorganized might be confusing and concerning for you. It's important
+     to discuss these symptoms and feelings in order to provide proper support. Can
+     you tell me if you've experienced any changes in your mood, daily activities,
+     or sleep habits during this time? Also, have you noticed any patterns or triggers
+     for when you hear the whispers or feel disconnected from what's happening around
+     you?
+   - I appreciate you sharing your feelings and experiences with me. It sounds like
+     you're going through a challenging period in your life, dealing with stress and
+     worry. It's important to understand the factors contributing to your stress and
+     how they might be influencing your thoughts and emotions. Could you provide some
+     more specific examples of the situations that have caused you to feel stressed
+     and worried? Additionally, have you noticed any patterns or triggers that seem
+     to make your worries more intense?
+ - source_sentence: I've been experiencing anxiety due to the difficult transition
+     from being at home, taking care of my son, to moving back into the work world.
+     I've been trying to manage this anxiety by paying attention to nutrition, exercise,
+     and practicing gratitude. Although I feel like I'm managing better now than when
+     I first started, I'm still struggling.
+   sentences:
+   - It's great to hear that you've made some progress in managing your anxiety through
+     healthy practices like focusing on nutrition, exercise, and gratitude. It's common
+     for individuals to face challenges while transitioning between roles, and it might
+     take more time for you to adjust fully. If you still find yourself struggling,
+     consider seeking additional support through therapy or support groups. Is there
+     a specific aspect of this transition that continues to cause you worry?
+   - If you are a counsellor, please answer the questions based on the description
+     of the patient.
+   - It's great to hear that you have a strong drive for self-improvement and setting
+     ambitious goals for yourself. Acknowledging this inner motivation is an essential
+     step towards achieving what you desire. Remember, it's important to embrace the
+     process of learning and growing, and not to put too much pressure on yourself
+     to reach an end goal quickly. Take it one step at a time, and don't forget to
+     celebrate your milestones along the way.
+ - source_sentence: Recently, I have been experiencing difficulties while learning.
+     Sometimes it's quite challenging, and I feel upset with myself to the point of
+     crying. However, at other times, I find it easy and feel a sense of accomplishment
+     when I finally get things right.
+   sentences:
+   - If you are a counsellor, please answer the questions based on the description
+     of the patient.
+   - Thank you for sharing that with me. I'm glad you're feeling better now that you're
+     here. Sometimes waiting can cause anxiety or stress, so it's important to find
+     ways to manage these feelings. If you'd like, we can discuss some coping strategies
+     for dealing with stress while waiting or in similar situations.
+   - It seems like you've been having ups and downs with your learning experience.
+     It's great that you sometimes feel accomplished when you finally understand something.
+     When you're feeling upset or frustrated, it might be helpful to take a break and
+     come back to it later, as you've mentioned. This can give you a fresh perspective
+     and may help reduce stress. It's also important to remember that learning is a
+     process, and it's okay to experience difficulties along the way.
+ - source_sentence: I remember generating alternative predictions. We said that I might
+     get nervous but if I stay focused on what I want to do, I can probably do it.
+     It's just hard to believe when I'm in this situation.
+   sentences:
+   - It seems like you've been dealing with unexpected expenses and some impulsive
+     spending habits. By planning for a stricter budget and understanding the difference
+     between essential and non-essential expenses, you can start to develop healthy
+     financial habits. Have you considered creating a budget plan and setting aside
+     a specific amount for unexpected expenses to help manage your spending?
+   - It's normal to feel nervous in difficult situations. As a counsellor, I suggest
+     continuing to test these alternative predictions directly in real-life experiences.
+     As you encounter situations where you manage to handle things even while nervous,
+     your belief in your abilities and the predictions will strengthen. Remember to
+     stay focused and trust yourself.
+   - If you are a counsellor, please answer the questions based on the description
+     of the patient.
+ ---
+ 
+ # SentenceTransformer based on distilbert/distilroberta-base
+ 
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [distilbert/distilroberta-base](https://huggingface.co/distilbert/distilroberta-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ 
+ ## Model Details
+ 
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [distilbert/distilroberta-base](https://huggingface.co/distilbert/distilroberta-base) <!-- at revision fb53ab8802853c8e4fbdbcd0529f21fc6f459b2b -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+ 
+ ### Model Sources
+ 
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+ 
+ ### Full Model Architecture
+ 
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+ 
+ ## Usage
+ 
+ ### Direct Usage (Sentence Transformers)
+ 
+ First install the Sentence Transformers library:
+ 
+ ```bash
+ pip install -U sentence-transformers
+ ```
+ 
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+ 
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("iqrakiran/1my-sentence-transformer-model")
+ # Run inference
+ sentences = [
+     "I remember generating alternative predictions. We said that I might get nervous but if I stay focused on what I want to do, I can probably do it. It's just hard to believe when I'm in this situation.",
+     "It's normal to feel nervous in difficult situations. As a counsellor, I suggest continuing to test these alternative predictions directly in real-life experiences. As you encounter situations where you manage to handle things even while nervous, your belief in your abilities and the predictions will strengthen. Remember to stay focused and trust yourself.",
+     'If you are a counsellor, please answer the questions based on the description of the patient.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 768)
+ 
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
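As noted in the model details, `model.similarity` defaults to cosine similarity. On toy vectors (not this model's real embeddings), the underlying computation amounts to normalizing each row and taking dot products:

```python
import numpy as np

def cos_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T

# Three toy 2-d "embeddings" (the real model emits 768-d vectors)
vecs = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
sims = cos_sim(vecs, vecs)
print(sims.shape)                   # (3, 3)
print(round(float(sims[0, 2]), 4))  # 0.7071
```

The diagonal of the resulting matrix is always 1.0 (each vector compared with itself), which is why self-similarity matrices like the `[3, 3]` one above are symmetric with unit diagonal.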
+ 
+ <!--
+ ### Direct Usage (Transformers)
+ 
+ <details><summary>Click to see the direct usage in Transformers</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+ 
+ You can finetune this model on your own dataset.
+ 
+ <details><summary>Click to expand</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Out-of-Scope Use
+ 
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+ 
+ <!--
+ ## Bias, Risks and Limitations
+ 
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+ 
+ <!--
+ ### Recommendations
+ 
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+ 
+ ## Training Details
+ 
+ ### Training Dataset
+ 
+ #### Unnamed Dataset
+ 
+ * Size: 4,093 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0 | sentence_1 | sentence_2 |
+   |:--------|:-----------|:-----------|:-----------|
+   | type    | string     | string     | string     |
+   | details | <ul><li>min: 12 tokens</li><li>mean: 56.8 tokens</li><li>max: 178 tokens</li></ul> | <ul><li>min: 27 tokens</li><li>mean: 76.32 tokens</li><li>max: 170 tokens</li></ul> | <ul><li>min: 22 tokens</li><li>mean: 22.0 tokens</li><li>max: 22 tokens</li></ul> |
+ * Samples:
+   | sentence_0 | sentence_1 | sentence_2 |
+   |:-----------|:-----------|:-----------|
+   | <code>I have been practicing muscle relaxation techniques to calm myself, but I still experience a lot of anxiety. When I'm relaxed and thinking about something calm, it feels nice. However, when I start to think about a negative situation, like people laughing at me, I quickly lose that feeling of relaxation.</code> | <code>You're on the right track with practicing muscle relaxation, but managing anxiety goes beyond the physical aspects. In social situations, it's important to shift our focus away from anxiety-provoking thoughts and towards the actions we can take in those situations. Remember, relaxation techniques are just one part of managing anxiety, so try to actively redirect your attention towards positive or neutral thoughts when negative scenarios come up.</code> | <code>If you are a counsellor, please answer the questions based on the description of the patient.</code> |
+   | <code>As the patient, I feel disappointed and I'm really pissed about my current situation.</code> | <code>As your counsellor, I hear that you are experiencing strong emotions at the moment. It's important for us to identify the specific circumstances that have led to these feelings, so we can better address and challenge the irrational beliefs that might be contributing to your distress. Can you tell me more about the situation that has been bothering you?</code> | <code>If you are a counsellor, please answer the questions based on the description of the patient.</code> |
+   | <code>I've been struggling with my relationship with my mom, and it has affected the way I interact with my kids. I vowed not to treat them the way she treated me, but I didn't realize how much it has affected the way I treat myself. I didn't deliberately want to be this way, but I ended up with something like an alter-ego.</code> | <code>It's important to recognize that your relationship with yourself is what we're focusing on right now, even though it was influenced by your relationship with your mom. It's not uncommon for people to unintentionally develop certain traits or behaviors due to past experiences. Acknowledge what you've discovered about yourself and try to gain a better understanding of the factors that contributed to it. Also, reflect on how to improve your relationship with yourself going forward. Have you considered any specific strategies or steps to improve your self-treatment?</code> | <code>If you are a counsellor, please answer the questions based on the description of the patient.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
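With these parameters, MultipleNegativesRankingLoss scores each anchor against every positive in the batch by cosine similarity, multiplies by `scale: 20.0`, and applies cross-entropy with the matching pair as the target. A rough NumPy sketch with made-up toy data (the `mnr_loss` helper below is illustrative, not part of the library):

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """In-batch MultipleNegativesRankingLoss sketch: row i of `positives` is the
    positive for row i of `anchors`; every other row in the batch is a negative."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)  # (batch, batch) scaled cosine similarities
    # Cross-entropy with the correct pair on the diagonal
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))
loss_matched = mnr_loss(anchors, anchors)                   # perfectly aligned pairs
loss_shuffled = mnr_loss(anchors, rng.normal(size=(4, 8)))  # unrelated pairs
print(loss_matched, loss_shuffled)
```

Aligned pairs put the highest scaled similarity on the diagonal, so their loss is far lower than for unrelated pairs; the large `scale` sharpens the softmax and rewards ranking the true positive above the in-batch negatives.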
+ 
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+ 
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `num_train_epochs`: 10
+ - `multi_dataset_batch_sampler`: round_robin
+ 
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+ 
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 10
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `eval_use_gather_object`: False
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+ 
+ </details>
+ 
+ ### Training Logs
+ | Epoch  | Step | Training Loss |
+ |:------:|:----:|:-------------:|
+ | 1.9531 | 500  | 0.2321        |
+ | 3.9062 | 1000 | 0.0           |
+ | 5.8594 | 1500 | 0.0001        |
+ | 7.8125 | 2000 | 0.0           |
+ | 9.7656 | 2500 | 0.0           |
+ | 1.9531 | 500  | 0.4353        |
+ | 3.9062 | 1000 | 0.0119        |
+ | 5.8594 | 1500 | 0.0022        |
+ | 7.8125 | 2000 | 0.0009        |
+ | 9.7656 | 2500 | 0.0007        |
+ 
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.44.2
+ - PyTorch: 2.4.0+cu121
+ - Accelerate: 0.33.0
+ - Datasets: 2.21.0
+ - Tokenizers: 0.19.1
+ 
+ ## Citation
+ 
+ ### BibTeX
+ 
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+ 
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,27 @@
+ {
+   "_name_or_path": "distilroberta-base",
+   "architectures": [
+     "RobertaModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "roberta",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.44.2",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 50265
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.44.2",
+     "pytorch": "2.4.0+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bf472ad31c5a34e5fc835d368cb9051a1d07f6d548a4846d327f5d46c6631f2f
+ size 328485128
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "bos_token": "<s>",
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "unk_token": "<unk>"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50264": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "errors": "replace",
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "tokenizer_class": "RobertaTokenizer",
+   "trim_offsets": true,
+   "unk_token": "<unk>"
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff