Add new SentenceTransformer model.

Files changed: README.md (+142, -38); model.safetensors (+1, -1)

README.md CHANGED
@@ -6,40 +6,43 @@ tags:
 - sentence-similarity
 - feature-extraction
 - generated_from_trainer
-- dataset_size:
+- dataset_size:77376
 - loss:CosineSimilarityLoss
 base_model: sentence-transformers/all-MiniLM-L6-v2
 datasets: []
 widget:
-- source_sentence:
+- source_sentence: He has published several books on nutrition, trace metals but not
+    biochemistry imbalances.
   sentences:
-  -
-  -
-  - He
-  -
-  -
-- source_sentence: Spinal locks and cervical locks are allowed and mandatory in IBJJF
-    Brazilian jiu-jitsu competitions.
+  - This in turn can help in effective communication between healthcare providers
+    and their patients.
+  - He has written several books on nutrition, trace metals, and biochemistry imbalances.
+  - One of the most boring movies I have ever seen.
+- source_sentence: She was denied the 2011 NSK Neustadt Prize for Children's Literature.
   sentences:
-  -
-  -
-  -
-  -
+  - She was the recipient of the 2011 NSK Neustadt Prize for Children's Literature.
+  - The ancient woodland at Dickshills is also located here.
+  - An element (such as a tree) that contributes to evapotranspiration can be called
+    an evapotranspirator.
+- source_sentence: Viking, after the resemblance the pitchers bear to the prow of
+    a Viking ship.
   sentences:
-  -
-  -
-  -
-  -
+  - Viking, after the striking difference the pitchers bear to the prow of a Viking
+    ship.
+  - Honshu is formed from the island arcs.
+  - For instance, even alcohol consumption by a pregnant woman is unable to lead to
+    fetal alcohol syndrome.
+- source_sentence: Logging has not been undertake near the headwaters of the creek.
   sentences:
-  -
-  -
-  -
-  -
-- source_sentence: The wings are diffuse with scales.
+  - Then I had to continue pairing it periodically since it somehow kept dropping.
+  - That's fair, Nance.
+  - Logging has been done near the headwaters of the creek.
+- source_sentence: He published a history of Cornwall, New York in 1873.
   sentences:
-  -
-  -
-  -
+  - He failed to publish a history of Cornwall, New York in 1873.
+  - Salafis assert that reliance on taqlid has led to Islam 's decline.
+  - 'Lot of holes in the plot: there''s nothing about how he became the emperor; nothing
+    about where he spend 20 years between his childhood and mature age.'
 pipeline_tag: sentence-similarity
 ---
 
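Reviewer note: the front matter above is machine-readable card metadata, so it can be sanity-checked after the commit lands. A minimal sketch, assuming the hub repo id from this card and network access (`ModelCard.load` is the standard huggingface_hub entry point):

```python
from huggingface_hub import ModelCard

# Load the card for this repo and inspect the YAML front matter edited above.
card = ModelCard.load("LeoChiuu/all-MiniLM-L6-v2-negations")
print(card.data.tags)          # should include 'dataset_size:77376'
print(card.data.pipeline_tag)  # 'sentence-similarity'
```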
@@ -93,9 +96,9 @@ from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("LeoChiuu/all-MiniLM-L6-v2-negations")
 # Run inference
 sentences = [
-    '
-    '
-
+    'He published a history of Cornwall, New York in 1873.',
+    'He failed to publish a history of Cornwall, New York in 1873.',
+    "Salafis assert that reliance on taqlid has led to Islam 's decline.",
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
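Reviewer note: the three inference sentences are a negation pair plus an unrelated distractor, so a natural follow-up is scoring them against each other. A minimal sketch, assuming sentence-transformers >= 3.0 (whose `SentenceTransformer.similarity` defaults to cosine similarity):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("LeoChiuu/all-MiniLM-L6-v2-negations")
sentences = [
    "He published a history of Cornwall, New York in 1873.",
    "He failed to publish a history of Cornwall, New York in 1873.",
    "Salafis assert that reliance on taqlid has led to Islam 's decline.",
]
embeddings = model.encode(sentences)               # shape (3, 384) for MiniLM-L6
scores = model.similarity(embeddings, embeddings)  # 3x3 cosine-similarity matrix
print(scores)  # a negation-aware model should score the pair at [0][1] low
```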
@@ -150,19 +153,19 @@ You can finetune this model on your own dataset.
 #### Unnamed Dataset
 
 
-* Size:
+* Size: 77,376 training samples
 * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
-  |         | sentence_0
-  |
-  | type    | string
-  | details | <ul><li>min:
+  |         | sentence_0 | sentence_1 | label |
+  |:--------|:-----------|:-----------|:------|
+  | type    | string | string | int |
+  | details | <ul><li>min: 6 tokens</li><li>mean: 16.2 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 16.32 tokens</li><li>max: 56 tokens</li></ul> | <ul><li>0: ~53.20%</li><li>1: ~46.80%</li></ul> |
 * Samples:
-  | sentence_0
-  |
-  | <code>
-  | <code>
-  | <code>
+  | sentence_0 | sentence_1 | label |
+  |:-----------|:-----------|:------|
+  | <code>The situation in Yemen was already much better than it was in Bahrain.</code> | <code>The situation in Yemen was not much better than Bahrain.</code> | <code>0</code> |
+  | <code>She was a member of the Gamma Theta Upsilon honour society of geography.</code> | <code>She was denied membership of the Gamma Theta Upsilon honour society of mathematics.</code> | <code>0</code> |
+  | <code>Which aren't small and not worth the price.</code> | <code>Which are small and not worth the price.</code> | <code>0</code> |
 * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
   ```json
   {
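Reviewer note: the card only names the loss and the column layout, so here is a hedged sketch of how a dataset with these columns plugs into CosineSimilarityLoss under Sentence Transformers 3.0.x. The toy row is taken from the samples table above; everything else (base model choice, absence of training args) is illustrative, not the settings used for this commit:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Toy stand-in for the 77,376-pair dataset: two text columns plus a label.
train_dataset = Dataset.from_dict({
    "sentence_0": ["Which aren't small and not worth the price."],
    "sentence_1": ["Which are small and not worth the price."],
    "label": [0.0],  # the loss regresses the cosine similarity onto this score
})

# By default CosineSimilarityLoss computes MSE between the predicted cosine
# similarity of the pair and the label.
loss = CosineSimilarityLoss(model)
trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```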
@@ -289,6 +292,107 @@ You can finetune this model on your own dataset.
 
 </details>
 
+### Training Logs
+| Epoch  | Step  | Training Loss |
+|:------:|:-----:|:-------------:|
+| 0.1034 | 500   | 0.3382        |
+| 0.2068 | 1000  | 0.2112        |
+| 0.3102 | 1500  | 0.1649        |
+| 0.4136 | 2000  | 0.1454        |
+| 0.5170 | 2500  | 0.1244        |
+| 0.6203 | 3000  | 0.1081        |
+| 0.7237 | 3500  | 0.0962        |
+| 0.8271 | 4000  | 0.0924        |
+| 0.9305 | 4500  | 0.0852        |
+| 1.0339 | 5000  | 0.0812        |
+| 1.1373 | 5500  | 0.0833        |
+| 1.2407 | 6000  | 0.0736        |
+| 1.3441 | 6500  | 0.0756        |
+| 1.4475 | 7000  | 0.0665        |
+| 1.5509 | 7500  | 0.0661        |
+| 1.6543 | 8000  | 0.0625        |
+| 1.7577 | 8500  | 0.0621        |
+| 1.8610 | 9000  | 0.0593        |
+| 1.9644 | 9500  | 0.054         |
+| 2.0678 | 10000 | 0.0569        |
+| 2.1712 | 10500 | 0.0566        |
+| 2.2746 | 11000 | 0.0502        |
+| 2.3780 | 11500 | 0.0516        |
+| 2.4814 | 12000 | 0.0455        |
+| 2.5848 | 12500 | 0.0454        |
+| 2.6882 | 13000 | 0.0424        |
+| 2.7916 | 13500 | 0.044         |
+| 2.8950 | 14000 | 0.0376        |
+| 2.9983 | 14500 | 0.0386        |
+| 3.1017 | 15000 | 0.0392        |
+| 3.2051 | 15500 | 0.0344        |
+| 3.3085 | 16000 | 0.0348        |
+| 3.4119 | 16500 | 0.0343        |
+| 3.5153 | 17000 | 0.0322        |
+| 3.6187 | 17500 | 0.0324        |
+| 3.7221 | 18000 | 0.0278        |
+| 3.8255 | 18500 | 0.0294        |
+| 3.9289 | 19000 | 0.0292        |
+| 4.0323 | 19500 | 0.0276        |
+| 4.1356 | 20000 | 0.0285        |
+| 4.2390 | 20500 | 0.026         |
+| 4.3424 | 21000 | 0.0271        |
+| 4.4458 | 21500 | 0.0248        |
+| 4.5492 | 22000 | 0.0245        |
+| 4.6526 | 22500 | 0.0253        |
+| 4.7560 | 23000 | 0.022         |
+| 4.8594 | 23500 | 0.0219        |
+| 4.9628 | 24000 | 0.0207        |
+| 5.0662 | 24500 | 0.0212        |
+| 5.1696 | 25000 | 0.0218        |
+| 5.2730 | 25500 | 0.0192        |
+| 5.3763 | 26000 | 0.0198        |
+| 5.4797 | 26500 | 0.0183        |
+| 5.5831 | 27000 | 0.02          |
+| 5.6865 | 27500 | 0.0176        |
+| 5.7899 | 28000 | 0.0184        |
+| 5.8933 | 28500 | 0.0157        |
+| 5.9967 | 29000 | 0.0175        |
+| 6.1001 | 29500 | 0.0175        |
+| 6.2035 | 30000 | 0.0163        |
+| 6.3069 | 30500 | 0.0173        |
+| 6.4103 | 31000 | 0.0165        |
+| 6.5136 | 31500 | 0.0152        |
+| 6.6170 | 32000 | 0.0155        |
+| 6.7204 | 32500 | 0.0132        |
+| 6.8238 | 33000 | 0.0147        |
+| 6.9272 | 33500 | 0.0145        |
+| 7.0306 | 34000 | 0.014         |
+| 7.1340 | 34500 | 0.0147        |
+| 7.2374 | 35000 | 0.0126        |
+| 7.3408 | 35500 | 0.0141        |
+| 7.4442 | 36000 | 0.0127        |
+| 7.5476 | 36500 | 0.0132        |
+| 7.6510 | 37000 | 0.0125        |
+| 7.7543 | 37500 | 0.0111        |
+| 7.8577 | 38000 | 0.011         |
+| 7.9611 | 38500 | 0.0125        |
+| 8.0645 | 39000 | 0.0128        |
+| 8.1679 | 39500 | 0.013         |
+| 8.2713 | 40000 | 0.0115        |
+| 8.3747 | 40500 | 0.0111        |
+| 8.4781 | 41000 | 0.0108        |
+| 8.5815 | 41500 | 0.012         |
+| 8.6849 | 42000 | 0.0108        |
+| 8.7883 | 42500 | 0.0105        |
+| 8.8916 | 43000 | 0.0092        |
+| 8.9950 | 43500 | 0.0115        |
+| 9.0984 | 44000 | 0.0112        |
+| 9.2018 | 44500 | 0.0096        |
+| 9.3052 | 45000 | 0.0106        |
+| 9.4086 | 45500 | 0.011         |
+| 9.5120 | 46000 | 0.01          |
+| 9.6154 | 46500 | 0.011         |
+| 9.7188 | 47000 | 0.0097        |
+| 9.8222 | 47500 | 0.0096        |
+| 9.9256 | 48000 | 0.0102        |
+
+
 ### Framework Versions
 - Python: 3.11.9
 - Sentence Transformers: 3.0.1
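Reviewer note, a consistency check on these logs: 500 steps correspond to epoch 0.1034, i.e. about 500 / 0.1034 ≈ 4,836 optimizer steps per epoch, and 77,376 samples / 4,836 steps = 16, which suggests a per-device batch size of 16 (an inference from the table, not stated in this excerpt). The final row at step 48,000 sits near epoch 9.93, consistent with a 10-epoch run.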
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:2a07b72204ce2d0731a19bf791dcf150885596166477cc4f82599c32fbf07f63
 size 90864192
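Reviewer note: only the LFS pointer changes here (same 90,864,192-byte size, new content hash). A minimal sketch for verifying a local download against this pointer; the local file path is a placeholder:

```python
import hashlib
from pathlib import Path

# Placeholder path to a locally downloaded copy of the weights.
path = Path("model.safetensors")

sha256 = hashlib.sha256(path.read_bytes()).hexdigest()
print(sha256 == "2a07b72204ce2d0731a19bf791dcf150885596166477cc4f82599c32fbf07f63")
print(path.stat().st_size == 90864192)  # size recorded in the LFS pointer above
```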