Commit 0bf6d6f by Lynxpda (verified) · 1 Parent(s): b959de9

Upload folder using huggingface_hub

2_Dense/pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:343f53d79815d1296ed1b563314badea919b8e9eaac97f1cda7df6d563da61fa
- size 2364028
+ oid sha256:a3a940c1d18f7dc625d5117543c14678721061a2f85350ac6973917ee134f5cc
+ size 2363964
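
The pointer above records only the SHA-256 digest and byte size of the real file. A minimal sketch of checking a downloaded copy against such a pointer follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer_text: str) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected_digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    # Cheap check first: the recorded size in bytes.
    if path.stat().st_size != int(fields["size"]):
        return False
    # Hash the file in 1 MiB chunks so large weights are never loaded whole.
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest
```

For this commit, the pointer text would be the three `+`/context lines of the hunk (`version`, `oid`, `size`), and `path` a local copy of `2_Dense/pytorch_model.bin`.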
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 
  ---
 
- # LaBSE-veps
+ # {MODEL_NAME}
 
  This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
@@ -28,7 +28,7 @@ Then you can use the model like this:
  from sentence_transformers import SentenceTransformer
  sentences = ["This is an example sentence", "Each sentence is converted"]
 
- model = SentenceTransformer('Lynxpda/LaBSE-veps')
+ model = SentenceTransformer('{MODEL_NAME}')
  embeddings = model.encode(sentences)
  print(embeddings)
  ```
@@ -47,7 +47,7 @@ The model was trained with the parameters:
 
  **DataLoader**:
 
- `torch.utils.data.dataloader.DataLoader` of length 334 with parameters:
+ `torch.utils.data.dataloader.DataLoader` of length 9636 with parameters:
  ```
  {'batch_size': 8, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
  ```
@@ -62,17 +62,17 @@ The model was trained with the parameters:
  Parameters of the fit()-Method:
  ```
  {
- "epochs": 5,
+ "epochs": 2,
  "evaluation_steps": 100,
  "evaluator": "__main__.ChainScoreEvaluator",
  "max_grad_norm": 1,
  "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
  "optimizer_params": {
- "lr": 1e-05
+ "lr": 5e-06
  },
  "scheduler": "warmupcosine",
  "steps_per_epoch": null,
- "warmup_steps": 500,
+ "warmup_steps": 1000,
  "weight_decay": 0.01
  }
  ```
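
To make the new training settings concrete, here is a rough sketch of the learning-rate curve they imply. It rests on two assumptions not stated in the README: that `"warmupcosine"` corresponds to a linear warmup followed by a half-cosine decay (as in `transformers.get_cosine_schedule_with_warmup` with its default cycle count), and that with `"steps_per_epoch": null` the total step count is the DataLoader length times the number of epochs. The function name `lr_at` is hypothetical.

```python
import math

# Values taken from the updated README in this commit.
DATALOADER_LENGTH = 9636  # batches per epoch
EPOCHS = 2
BASE_LR = 5e-06
WARMUP_STEPS = 1000

# Assumption: steps_per_epoch=null means one pass over the full DataLoader.
TOTAL_STEPS = DATALOADER_LENGTH * EPOCHS


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (assumed warmup-cosine shape)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR over the warmup steps.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Half-cosine decay from BASE_LR down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, 500, 1000, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, lr_at(step))
```

Under the same assumptions, the previous settings (DataLoader length 334, 5 epochs, 500 warmup steps) amount to 1670 total steps, so warmup covered a much larger share of training than it does now.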
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "Lynxpda/LaBSE-veps",
+ "_name_or_path": "sentence-transformers/LaBSE",
  "architectures": [
  "BertModel"
  ],
@@ -25,7 +25,7 @@
  "pooler_type": "first_token_transform",
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
- "transformers_version": "4.38.2",
+ "transformers_version": "4.39.0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 501153
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:abba0b3a86565dc1f2ac9763e3fb4da6762ae12dbc675a81dd810ea0df47866c
+ oid sha256:ad6213cd22a7905cb0bbe03e5f88adb8c0009d1ed34f1cada839a13d43186544
  size 1883730160
tokenizer_config.json CHANGED
@@ -47,19 +47,12 @@
  "do_lower_case": false,
  "full_tokenizer_file": null,
  "mask_token": "[MASK]",
- "max_length": 256,
  "model_max_length": 512,
  "never_split": null,
- "pad_to_multiple_of": null,
  "pad_token": "[PAD]",
- "pad_token_type_id": 0,
- "padding_side": "right",
  "sep_token": "[SEP]",
- "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
- "truncation_side": "right",
- "truncation_strategy": "longest_first",
  "unk_token": "[UNK]"
  }
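
The tokenizer hunk only removes keys; none of the surviving values change. As a small sketch, the removed keys can be listed mechanically from the two sides of the hunk. The dictionaries below contain only the entries visible in this hunk, not the full `tokenizer_config.json`, and the variable names are hypothetical.

```python
# Entries visible on the old side of the hunk (lines 47-65 of the old file).
old_hunk = {
    "do_lower_case": False,
    "full_tokenizer_file": None,
    "mask_token": "[MASK]",
    "max_length": 256,
    "model_max_length": 512,
    "never_split": None,
    "pad_to_multiple_of": None,
    "pad_token": "[PAD]",
    "pad_token_type_id": 0,
    "padding_side": "right",
    "sep_token": "[SEP]",
    "stride": 0,
    "strip_accents": None,
    "tokenize_chinese_chars": True,
    "tokenizer_class": "BertTokenizer",
    "truncation_side": "right",
    "truncation_strategy": "longest_first",
    "unk_token": "[UNK]",
}

# The new side keeps every entry except the seven lines marked "-".
dropped = {
    "max_length", "pad_to_multiple_of", "pad_token_type_id", "padding_side",
    "stride", "truncation_side", "truncation_strategy",
}
new_hunk = {key: value for key, value in old_hunk.items() if key not in dropped}

# Keys present before the commit but absent after it.
removed_keys = sorted(old_hunk.keys() - new_hunk.keys())
# Keys present on both sides whose values differ (expected to be empty here).
changed_keys = sorted(
    key for key in old_hunk.keys() & new_hunk.keys()
    if old_hunk[key] != new_hunk[key]
)

if __name__ == "__main__":
    print("removed:", removed_keys)
    print("changed:", changed_keys)
```

With real files, the two dictionaries would instead come from `json.load` on the old and new versions of the config.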