knguyennguyen committed · Commit 0eb7b54 · verified · 1 Parent(s): 818651c

Add new SentenceTransformer model.
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
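Review note: the only pooling mode enabled in this config is `pooling_mode_mean_tokens`, so the sentence vector is the average of the token vectors. The sketch below illustrates that idea in plain Python. The function name `mean_pool` and the toy inputs are hypothetical, and the masking of padding positions is an assumption about how mean pooling is usually applied, not a copy of the library's code.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for k in range(dim):
                totals[k] += vector[k]
    kept = max(kept, 1)  # avoid division by zero for an all-padding input
    return [value / kept for value in totals]


# Toy example: three token vectors, the last one is padding.
sentence_vector = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
```

In the real model each token vector has 768 entries (`word_embedding_dimension`), so the pooled sentence vector has 768 entries as well.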
README.md ADDED
@@ -0,0 +1,494 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:11397
+ - loss:MultipleNegativesRankingLoss
+ base_model: sentence-transformers/all-mpnet-base-v2
+ widget:
+ - source_sentence: men's sleeveless vest with a polished exterior and a tailored fit..
+     men's sleeveless vest with a polished exterior and a tailored fit.
+   sentences:
+   - 'Title: Arnodefrance Lity Of Gog Denim Jacket Graphic Print Washed Jacket Hip
+     Pop Button Down Trucker Jacket Descripion: [''Arnodefrance provides more trendy
+     clothing choices for trendy brand lovers and fashion icons. It has always been
+     aimed at creating an international first-line trendy brand. It has unique cutting
+     treatment, personalized color matching and comfortable soft fabrics. It expresses
+     modern youth through clothing design. In a happy world, people play a self-style,
+     create topics, and always maintain a trendy attitude to question common sense
+     and pursue their own answers.'']'
+   - 'Title: Columbia Girls'' Big Benton Fleece Jacket, Spring Blue/Blue Chill, Medium
+     Descripion: ["There''s nothing more necessary than a fleece layer in a litter
+     adventurer''s outdoor winter wardrobe—that''s why the Benton Springs Full Zip
+     Fleece Jacket exists. Columbia''s soft, winter-ready jacket is the ultimate warmth
+     provider and the everyday style piece. Crafted of our super-soft 100% polyester
+     MTR filament fleece, this Benton Springs Full Zip Fleece Jacket is the perfect
+     layering piece and first line of defense to combat the cold. It contains a modern
+     classic fit that allows for comfortable movement and zippered side pockets to
+     keep your small items (including your hands) secure. An added bonus is the warm
+     collar that''s flexible so you can choose whether you want to wear it up or down,
+     depending on your desired level of toastiness. Our Benton Springs Full Zip Fleece
+     Jacket is available in many accommodating sizes and colors as well. To ensure
+     the size you choose is right, utilize our sizing chart and the following measurement
+     instructions: For the sleeves, start at the center back of your neck and measure
+     across the shoulder and down to the sleeve. If you come up with a partial number,
+     round up to the next even number. For the chest, measure at the fullest part of
+     the chest, under the armpits and over the shoulder blades, keeping the tape measure
+     firm and level."]'
+   - 'Title: Men''s Slim Vest Sleeveless Jacket Casual PU Leather Vests Button Open
+     V-Neck Simple Joker Slim Fit Vest Winter Descripion: [''SPECIFICATIONGender:MENFabric
+     Type:BroadclothStyle:Smart CasualMaterial:NylonMaterial:ViscoseItem Type:Vests'']'
+ - source_sentence: women's blazer with a tailored design, long sleeves, and a single-button
+     closure.
+   sentences:
+   - "Title: Blazer Jackets for Women Lapel Long Sleeve Single Breasted Office Outerwear\
+     \ Solid Casual Long Coats Work Cardigans Descripion: ['☆☆☆☆☆☆▅▅▅▅▅▅▅▅▅▅' '☆☆☆☆☆☆▅▅▅▅▅▅▅▅▅▅'\
+     \ '☆☆☆☆☆☆▅▅▅▅▅▅▅▅▅▅'\n '▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅' '▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅' '▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅'\
+     \ 'Q&A'\n 'Q1:Are these Anjikang store clothes true to size?A1:Yes, just order\
+     \ your size,we are standard US size.'\n \"Q2: Will it wrinkle or shrink after\
+     \ washed? or does it smell bad or itchy?A2: Not at all. We made of good material,so\
+     \ it won't had bad smelling,shrink or wrinkle,itchy all the problem you are worried.\"\
+     \n 'Q3: Does it look exactly like the picture?A3: Yes the color is the same as\
+     \ in the picture.'\n 'Q4: Washing instructions?A4: Hand wash recommended; Machine\
+     \ wash cold.'\n 'Q5: Does this material fade fast?A5: Not at all.'\n 'Q6: Is this\
+     \ soft or a rougher material?A6: Very soft and comfortable.'\n \"zip up jacket\
+     \ women womens dress coat long coats for women velvet coat winter sweaters shacket\
+     \ jacket women black varsity jacket bomber jackets flannel jackets rain jacket\
+     \ women varsity jacket women jean jacket with fringe jean jacket women hooded\
+     \ jean jacket velvet jackets water resistant jacket women cropped zip up jacket\
+     \ fleece tights for women winter womens plaid jacket denim long denim jackets\
+     \ plus size faux leather jacket women coats and jackets trench coats for women\
+     \ yellow plaid jacket women's windbreaker jackets camouflage jacket for women\
+     \ woman puffer jacket scrub jacket for women for nurses white suit jacket womens\
+     \ white cropped slim athletic yoga workout track sports zip up jacket flannel\
+     \ jacket white bomber jacket womens aztec denim jacket black parade jacket black\
+     \ winter coat women womens down jacket fall jacket women lightweight puffer jacket\
+     \ women black sequin jacket cape coat long plaid jacket women purple suit jacket\
+     \ white jackets for women\"]"
+   - 'Title: Mother of The Bride Dresses with Jacket Lace Wedding Guest Dresses for
+     Women Maxi Long Formal Evening Dress Descripion: [''Mother of the bride dresess
+     chiffon evening dress formal evening party dresses a line mother of the bride
+     dress'']'
+   - "Title: 5665 Teen Girls Cape Coat for Women Long Plus Size Winter Warm Coat Button\
+     \ Thick Wool Peacoat Black Fleece Jacket S-5XL Descripion: ['♔Welcome to Our Store♔'\n\
+     \ '(◕ˇ∀ˇ◕) Have a nice shopping time, thank you so much! (›´ω`‹ )'\n '-----Size\
+     \ Note-----'\n 'Runs Small. We suggest buy one or two size larger. Thank you.'\n\
+     \ 'Please check the Size Chart before order. If you are not sure the size, please\
+     \ send message to us.Have a nice day!'\n \"Size.: Small US: 4 UK: 8 EU: 34 Bust:\
+     \ 101cm/39.76'' Shoulder: 67.5cm/26.57'' Sleeve: 41.5cm/16.34'' Length: 75cm/29.53''\"\
+     \n \"Size.: Medium US: 6 UK: 10 EU: 36 Bust: 106cm/41.73'' Shoulder: 70cm/27.56''\
+     \ Sleeve: 42cm/16.54'' Length: 76cm/29.92''\"\n \"Size.: Large US: 8 UK: 12 EU:\
+     \ 38 Bust: 111cm/43.70'' Shoulder: 72.5cm/28.54'' Sleeve: 42.5cm/16.73'' Length:\
+     \ 77cm/30.31''\"\n \"Size.: X-Large US: 10 UK: 14 EU: 40 Bust: 116cm/45.67'' Shoulder:\
+     \ 75cm/29.53'' Sleeve: 43cm/16.93'' Length: 78cm/30.71''\"\n \"Size.: XX-Large\
+     \ US: 12 UK: 16 EU: 42 Bust: 121cm/47.64'' Shoulder: 77.5cm/30.51'' Sleeve: 43.5cm/17.13''\
+     \ Length: 79cm/31.10''\"\n \"Size.: XXX-Large US: 14 UK: 18 EU: 44 Bust: 126cm/49.61''\
+     \ Shoulder: 80cm/31.50'' Sleeve: 44cm/17.32'' Length: 80cm/31.50''\"\n \"Size.:\
+     \ XXXX-Large US: 16 UK: 20 EU: 46 Bust: 131cm/51.57'' Shoulder: 82.5cm/32.48''\
+     \ Sleeve: 44.5cm/17.52'' Length: 81cm/31.89''\"\n \"Size.: XXXXX-Large US: 18\
+     \ UK: 21 EU: 48 Bust: 136cm/53.54'' Shoulder: 85cm/33.46'' Sleeve: 45cm/17.72''\
+     \ Length: 82cm/32.28''\"]"
+ - source_sentence: men's tracksuit set featuring a hood, zip closure, and a comfortable
+     fit with breathable fabric.
+   sentences:
+   - 'Title: INTL d.e.t.a.i.l.s Women''s Plus Size Packable Anorak Jacket Descripion:
+     [''This plus size packable anorak jacket from Details is the perfect addition
+     to your outerwear wardrobe. This is great for transitional seasons or collar spring/summer
+     days or nights.'']'
+   - "Title: Men's Linen Suits 2 Pieces Slim Fit Prom Suit Summer Beach Wedding Groomsman\
+     \ Jacket Pants Set Descripion: [\"Men's 2 Pieces Linen Suit Slim Fit Casual Summer\
+     \ Beach Suits for Men Formal Wedding Prom Business Tuxedo\"\n '● This suit contain\
+     \ 1 blazer, 1 pants● Selected High-quality Fabrics: Cotton, Polyester, Viscose.\
+     \ Selected Comfortable, Soft, Breathable Fabrics● Style: Classic Design, Slim\
+     \ Fit● Multi-Colors Optional: Provide Customized Colors'\n 'Slim Fit 3D Cut Blazer\
+     \ with Full Shoulder Design:'\n '2 Buttons Closure, Notch Lapel, 4 Pockets on\
+     \ front, 1 Vent'\n 'Strong and Durable Pants with Adjustable Waist:'\n 'Flat Front,\
+     \ Adjustable Waist Band' 'IMPORTANT TIPS About Size'\n '● Our Size: XS≈34R, S≈36R,\
+     \ M≈38R, L≈40R, XL≈42R, XXL≈44R, 3XL≈46R(The size analogy is for reference only,\
+     \ please check our size chart for the actual size)● PLEASE NOT look at the Amazon\
+     \ size chart. Please select \" customized color and size\" option if you need\
+     \ customized, then send us all measurements that listed below (The measurement\
+     \ guide pictures are in the left picture of the customization option)● Customized\
+     \ Size (units CM orinches): 1.Neckline 2.Shoulder to shoulder 3.Arm Length 4.Bicep\
+     \ 5.Cuff 6.Chest 7.Belly 8.Waist 9.Hips 10.Blazer Length 11.Pants Length 12.Thigh\
+     \ 13.Height 14.Weight=_kg or pounds● Please make sure all body measurements are\
+     \ correct, please feel free to contact us if you need help'\n 'Easy to Match with\
+     \ and Suitable for a lot of Occasions:'\n '● You can match with shirt, tie. You\
+     \ can also match with a solid color T-shirt, simple and comfortable● Suitable\
+     \ for Wedding, Business, Party, Many other occasions, also a great gift for someone\
+     \ important'\n 'With the cut somewhat narrow at the waist and legs, looks trendy,\
+     \ don’t be worry about the fit of the suit, you can enjoy the freedom of movement\
+     \ at the same time. Make yourself a modern and trimmed-down silhouette with this\
+     \ suit set, it will bring you tons of compliments!']"
+   - 'Title: JG JENNY GHOO Men''s Casual Tracksuits Long Sleeve Jogging Suits Sweatsuit
+     Sets Track Jackets and Pants 2 Piece Outfit Descripion: ["men''s tracksuits track
+     suits for men hip hop sweatsuits jogging suits sets 2 piece Warm and breathable
+     material. Great for everyday wear and for sport. This tracksuit has a soft and
+     breathable material and it is suitable for any occasion. It has a hood and zippers
+     and it is available in different colors and patterns."]'
+ - source_sentence: a lightweight jacket for casual wear
+   sentences:
+   - 'Title: Umbro Brentford FC Mens 22/23 Presentation Jacket (L) (Black/Carbon) Descripion:
+     [''Fabric: French Terry, Stretch, Woven. Design: Crest, Logo. Angular Panels,
+     Branded Zip Pull, Inner Zip Guard, Side Panels. Fabric Technology: Lightweight.
+     Sleeve-Type: Long-Sleeved. Neckline: Standing Collar. Pockets: 2 Side Pockets,
+     Concealed Zip. Fastening: Full Zip. Hem: Clean Cut. 100% Officially Licensed.'']'
+   - 'Title: AKNHD Baby Boys Girls Hooded Thick Snowsuit Romper Warm Snowsuit Coat
+     Outwear Jacket Snowsuit with Gloves Descripion: ["Product Description:Fashion
+     design,100% Brand New,high quality!Material: PolyesterPattern Type: SolidSleeve
+     length: Long SleeveMain Color: As The Picture ShowStyle: FashionStylish and fashion
+     design make your baby more attractiveGreat for casual, Daily, party or photoshoot,
+     also a great idea for a baby show giftsIt is made of high quality materials,Soft
+     hand feeling, no any harm to your baby''s skinPlease allow slight 1-3cm difference
+     due to manual measurement and a little color variation for different display setting,thanks
+     for your understanding!1 inch = 2.54 cmThank you and nice day!Package include:1PC
+     Romper+1Pair Gloves/1PC Romper"]'
+   - 'Title: Obermeyer Girls'' Katelyn Jacket Without FA Descripion: ["Our newly styled
+     Katelyn is a luxurious jacket; Technical, sophisticated, and dependable for any
+     endeavor. Children have no filters, They say what''s on their minds.They simply
+     go and explore how things are, how they work, what they do. We love and encourage
+     them to navigate their surroundings. Winter brings an excitement that opens her
+     curiosity; The uniqueness of snowflakes entices us all, and for her, discovery
+     and adventure."]'
+ - source_sentence: enamel pin with a compact size, durable material, and a secure
+     backing.. enamel pin with a compact size, durable material, and a secure backing.
+   sentences:
+   - "Title: Bleaches Kurosak Ichig Cosplay Hoodie Unisex Sweatshirt Jacket Pullover\
+     \ Urahar Kisuke Sweater Coat Streetwear HoodySweatshirt Hoody (X-Large, F-yellow\
+     \ 1) Descripion: ['Design:'\n 'Bleaches cosplay hoodie unisex Kurosak Ichig jacket\
+     \ pullover fashion hoody Urahar Kisuke long sleeve sweater coat adult Bleaches\
+     \ Kurosak Ichig sweatshirt hoodie tracksuit outerwear oversize girls boys.'\n\
+     \ \"Fabric: Made of high-quality polyester cotton, soft and comfortable fabric,\
+     \ suitable for men's daily wear. Material: Polyester. Hooded: With hat. Sleeve\
+     \ Length: Full sleeve, Long sleeve. Thickness: Standard. Season: Autumn Winter\
+     \ Spring. Style: Fashion, Creative, Funny, Casual, Hip Hop. Item Type: 2D printed\
+     \ hoodies sweatshirts adult. Pattern Type: Vivid 2D Print, Fashion Pattern 2D\
+     \ Printing. Package Includes: 1 X Anime Hoodie.\"]"
+   - 'Title: Cute Cat Enamel Pin I LOVE ALL THE CATS Brooch Cartoon Animal Lapel Badge
+     for Backpacks Jackets Clothes Bag Party Decoration Jewelry Gift for Friends Descripion:
+     [''Cute Cat Enamel Pin I LOVE ALL THE CATS Brooch Cartoon Animal Lapel Badge for
+     Backpacks Jackets Clothes Bag Party Decoration Jewelry Gift for Friends Cute Cat
+     Enamel Pin I LOVE ALL THE CATS Brooch Cartoon Animal Lapel Badge for Backpacks
+     Jackets Clothes Bag Party Decoration Jewelry Gift for Friends Cute Cat Enamel
+     Pin I LOVE ALL THE CATS Brooch Cartoon Animal Lapel Badge for Backpacks Jackets
+     Clothes Bag Party Decoration Jewelry Gift for Friends Cute Cat Enamel Pin I LOVE
+     ALL THE CATS Brooch Cartoon Animal Lapel Badge for Backpacks Jackets Clothes Bag
+     Party Decoration Jewelry Gift for Friends'']'
+   - 'Title: Funny Chill Demon Enamel Pin Novelty Brooch Buttons Jewelry for Jackets
+     Jeans Backpack Cloth Lapel Bag Hat Gift for Luci Fans Disenchantment Lovers Men
+     Women Boy Girl Descripion: [''-Size - About 1.2" -Hard enamel -Black shiny metal
+     -One rubber clutch'']'
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) <!-- at revision 9a3225965996d404b775526de6dbfe85d3368642 -->
+ - **Maximum Sequence Length:** 128 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: MPNetModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("knguyennguyen/mpnet_jacket4k_enhanced")
+ # Run inference
+ sentences = [
+     'enamel pin with a compact size, durable material, and a secure backing.. enamel pin with a compact size, durable material, and a secure backing.',
+     'Title: Funny Chill Demon Enamel Pin Novelty Brooch Buttons Jewelry for Jackets Jeans Backpack Cloth Lapel Bag Hat Gift for Luci Fans Disenchantment Lovers Men Women Boy Girl Descripion: [\'-Size - About 1.2" -Hard enamel -Black shiny metal -One rubber clutch\']',
+     "Title: Cute Cat Enamel Pin I LOVE ALL THE CATS Brooch Cartoon Animal Lapel Badge for Backpacks Jackets Clothes Bag Party Decoration Jewelry Gift for Friends Descripion: ['Cute Cat Enamel Pin I LOVE ALL THE CATS Brooch Cartoon Animal Lapel Badge for Backpacks Jackets Clothes Bag Party Decoration Jewelry Gift for Friends Cute Cat Enamel Pin I LOVE ALL THE CATS Brooch Cartoon Animal Lapel Badge for Backpacks Jackets Clothes Bag Party Decoration Jewelry Gift for Friends Cute Cat Enamel Pin I LOVE ALL THE CATS Brooch Cartoon Animal Lapel Badge for Backpacks Jackets Clothes Bag Party Decoration Jewelry Gift for Friends Cute Cat Enamel Pin I LOVE ALL THE CATS Brooch Cartoon Animal Lapel Badge for Backpacks Jackets Clothes Bag Party Decoration Jewelry Gift for Friends']",
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
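Review note: the README snippet ends with `model.similarity(embeddings, embeddings)`, which yields a 3×3 score matrix, and the card lists cosine similarity as the similarity function. The sketch below shows what a pairwise cosine matrix computes, using plain Python and hypothetical 2-dimensional toy vectors in place of the 768-dimensional embeddings. `cosine_matrix` is an illustrative helper, not part of the library.

```python
import math


def cosine_matrix(rows_a, rows_b):
    """Return the matrix of cosine similarities between every pair (a, b)."""

    def norm(vector):
        return math.sqrt(sum(x * x for x in vector))

    matrix = []
    for a in rows_a:
        row = []
        for b in rows_b:
            dot = sum(x * y for x, y in zip(a, b))
            row.append(dot / (norm(a) * norm(b)))
        matrix.append(row)
    return matrix


# Toy stand-ins for three sentence embeddings.
toy_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
scores = cosine_matrix(toy_embeddings, toy_embeddings)
```

Row `i` of `scores` can then be sorted to rank the other texts against text `i`, which is how a query such as the enamel pin description would be matched against product listings.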
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
285
+
286
+ ## Training Details
287
+
288
+ ### Training Dataset
289
+
290
+ #### Unnamed Dataset
291
+
292
+
293
+ * Size: 11,397 training samples
294
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
295
+ * Approximate statistics based on the first 1000 samples:
296
+ | | sentence_0 | sentence_1 |
297
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
298
+ | type | string | string |
299
+ | details | <ul><li>min: 4 tokens</li><li>mean: 28.21 tokens</li><li>max: 93 tokens</li></ul> | <ul><li>min: 30 tokens</li><li>mean: 103.65 tokens</li><li>max: 128 tokens</li></ul> |
300
+ * Samples:
301
+ | sentence_0 | sentence_1 |
302
+ |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
303
+ | <code>cosplay jacket designed for men, made from synthetic material, featuring a closure mechanism and suitable for various festive occasions.</code> | <code>Title: Poetic Walk Kill la Kill Cosplay Matoi Ryuko Costume Jacket Baseball Coat Uniform Sports Coat Descripion: ["Anime Kill la Kill Cosplay Matoi Ryuko Costume Jacket Baseball Coat Uniform Sports Coat Package:One good quality jacket. Fabric:Polyester. Size:Mens size,please choose size from size table,if you couldn't ensure the size,please email us your measurements:female/male,height,bust,waist and hip,then we could check which size fit for you . Occasion: Halloween,Birthday, Masquerade, Christmas, Carnival,theme parties,clothing parties, costume ball, family gatherings, Halloween Party .Cosplay and all kinds of seasonal holidays and parties ."]</code> |
304
+ | <code>a collarless leather jacket for stylish outerwear</code> | <code>Title: Cole Haan Women's Leather Collarless Jacket Descripion: ['Collarless smooth lamb leather jacket with exposed snap detail at necline.']</code> |
305
+ | <code>jacket featuring a flexible closure, adjustable head covering, and secure storage options.. jacket featuring a flexible closure, adjustable head covering, and secure storage options.</code> | <code>Title: PUMA Puma X Helly Hansen Jacket Descripion: ['Equip Your Wardrobe With The Latest Styles And Technology From This Duo Of Sportswear Titans, Puma And Helly Hansen. Known For Their Excellence With Outerwear, Puma Has Teamed Up With The Experts Over At Helly Hansen To Produce High-Performance, High Style Options For This Line Of Winterwear.']</code> |
306
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
307
+ ```json
308
+ {
309
+ "scale": 20.0,
310
+ "similarity_fct": "cos_sim"
311
+ }
312
+ ```
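Review note: the card names `MultipleNegativesRankingLoss` with `scale` 20.0 and `cos_sim`. A minimal sketch of the idea, under the assumption that each `sentence_0` is scored against every `sentence_1` in the batch and the matching pair is treated as the correct class in a cross-entropy: the helper names and the toy vectors are hypothetical, and this is not the library's implementation.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two pairs: aligned pairs give a loss near 0, swapped pairs a large one.
aligned = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Under this reading, a batch size of 128 gives each query 127 in-batch negatives, which is one reason larger batches tend to help with this kind of loss.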
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `per_device_train_batch_size`: 128
+ - `per_device_eval_batch_size`: 128
+ - `num_train_epochs`: 5
+ - `multi_dataset_batch_sampler`: round_robin
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 128
+ - `per_device_eval_batch_size`: 128
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 5
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+
+ </details>
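Review note: the listed hyperparameters imply a fairly short run. The arithmetic below is a sketch that assumes a single training device (the card does not state the device count), no gradient accumulation (`gradient_accumulation_steps`: 1), and a last partial batch that is kept (`dataloader_drop_last`: False). The linear decay to zero with no warmup follows from `lr_scheduler_type`: linear and `warmup_steps`: 0; the variable and function names are illustrative.

```python
import math

NUM_SAMPLES = 11_397   # dataset size from the card
BATCH_SIZE = 128       # per_device_train_batch_size, one device assumed
NUM_EPOCHS = 5         # num_train_epochs
BASE_LR = 5e-5         # learning_rate

# The last, smaller batch is kept, so round up.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step):
    """Learning rate after `step` optimizer steps with linear decay and no warmup."""
    return BASE_LR * max(0.0, (total_steps - step) / total_steps)


print(steps_per_epoch, total_steps, linear_lr(total_steps // 2))
```

With more than one device the per-step batch would be larger and the step count correspondingly smaller, so these figures are an upper bound on optimizer steps rather than a record of the actual run.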
+
+ ### Framework Versions
+ - Python: 3.11.11
+ - Sentence Transformers: 3.1.1
+ - Transformers: 4.45.2
+ - PyTorch: 2.5.1+cu121
+ - Accelerate: 1.2.1
+ - Datasets: 3.2.0
+ - Tokenizers: 0.20.3
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/all-mpnet-base-v2",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.45.2",
+   "vocab_size": 30527
+ }
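Review note: the sizes in this config can be turned into a rough parameter count and compared with the 437,967,672-byte `model.safetensors` added in the same commit. The sketch below assumes a BERT-style layout (word and position embeddings, per-layer attention and feed-forward blocks with biases and two layer norms, one shared relative-position bias table, and a pooler). That layout is an assumption for illustration; the function name is hypothetical and the result is an estimate, not a value read from the checkpoint.

```python
CONFIG = {
    "hidden_size": 768,
    "intermediate_size": 3072,
    "num_hidden_layers": 12,
    "num_attention_heads": 12,
    "max_position_embeddings": 514,
    "vocab_size": 30527,
    "relative_attention_num_buckets": 32,
}


def estimate_parameters(cfg):
    """Rough parameter count under an assumed BERT-style encoder layout."""
    h, ff = cfg["hidden_size"], cfg["intermediate_size"]
    layer_norm = 2 * h  # weight and bias
    embeddings = cfg["vocab_size"] * h + cfg["max_position_embeddings"] * h + layer_norm
    attention = 4 * (h * h + h)                    # query, key, value, output projections
    feed_forward = (h * ff + ff) + (ff * h + h)    # up and down projections
    per_layer = attention + feed_forward + 2 * layer_norm
    relative_bias = cfg["relative_attention_num_buckets"] * cfg["num_attention_heads"]
    pooler = h * h + h
    return embeddings + cfg["num_hidden_layers"] * per_layer + relative_bias + pooler


estimated = estimate_parameters(CONFIG)
estimated_bytes = estimated * 4  # float32, per "torch_dtype"
print(estimated, estimated_bytes)
```

If the assumed layout is right, the estimate lands just under the file size, with the small remainder plausibly taken up by the safetensors header.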
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.1.1",
+     "transformers": "4.45.2",
+     "pytorch": "2.5.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:427eb630a50c97c291097126f32f55c752bdb53a61f403af6f8dff0bec6df474
+ size 437967672
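Review note: what the diff shows for `model.safetensors` is a Git LFS pointer, not the weights themselves. The pointer is three `key value` lines, so it can be parsed with a few lines of Python. The helper name `parse_lfs_pointer` is hypothetical, and dividing the size by 4 only gives an upper bound on the float32 parameter count because the file also carries a header.

```python
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:427eb630a50c97c291097126f32f55c752bdb53a61f403af6f8dff0bec6df474
size 437967672
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
max_float32_params = info["size"] // 4  # 4 bytes per float32 value
print(info["hash_algo"], info["size"], max_float32_params)
```

The `digest` can be checked against a SHA-256 of the downloaded file to confirm that the weights match this commit.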
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
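Review note: `modules.json` describes the pipeline as an ordered list, with the Transformer module stored at the repository root (`"path": ""`) and the Pooling module under `1_Pooling`. The sketch below only reads that structure with the standard `json` module; `module_order` is a hypothetical helper, and how the library actually instantiates each module is not shown here.

```python
import json

MODULES_JSON = """
[
  {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"}
]
"""


def module_order(text):
    """Return (class name, folder) pairs in pipeline order."""
    modules = sorted(json.loads(text), key=lambda module: module["idx"])
    # An empty path means the module's files sit at the repository root.
    return [(m["type"].rsplit(".", 1)[-1], m["path"] or ".") for m in modules]


pipeline = module_order(MODULES_JSON)
print(pipeline)
```

This matches the architecture printed in the README: module `(0)` is the MPNet encoder and module `(1)` is the pooling layer.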
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 128,
+   "do_lower_case": false
+ }
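Review note: with `max_seq_length` set to 128, longer inputs are cut off before encoding. The README statistics list a maximum of 128 tokens for `sentence_1`, which suggests that at least some product descriptions are truncated. The sketch below illustrates the effect on a list of token ids, assuming the two special tokens (`<s>` with id 0 and `</s>` with id 2, per `config.json`) count toward the limit. The real tokenizer handles this internally; the function name and the fake ids are hypothetical.

```python
BOS_ID = 0             # "<s>", bos_token_id in config.json
EOS_ID = 2             # "</s>", eos_token_id in config.json
MAX_SEQ_LENGTH = 128   # from sentence_bert_config.json


def truncate_with_specials(token_ids, max_len=MAX_SEQ_LENGTH):
    """Keep the first tokens so that <s> + tokens + </s> fits in max_len."""
    body = token_ids[: max_len - 2]
    return [BOS_ID] + body + [EOS_ID]


# A hypothetical 300-token product description loses everything past token 126.
long_input = list(range(1000, 1300))
encoded = truncate_with_specials(long_input)
print(len(encoded), encoded[:3], encoded[-3:])
```

In practice this means only the title and the first part of a long description influence the embedding, so key attributes are best placed early in the text.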
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,72 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "<s>",
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "max_length": 128,
+   "model_max_length": 128,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
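Review note: `added_tokens_decoder` maps token ids to special-token strings, while the top-level keys (`bos_token`, `unk_token`, and so on) name which string plays which role. Combining the two gives the id used for each role. The sketch below does that lookup on a condensed copy of the values above; the dictionaries and the helper `role_ids` are illustrative, not part of the tokenizer API.

```python
# id -> token string, condensed from "added_tokens_decoder"
ADDED_TOKENS = {0: "<s>", 1: "<pad>", 2: "</s>", 3: "<unk>", 104: "[UNK]", 30526: "<mask>"}

# role -> token string, from the top-level keys of tokenizer_config.json
ROLES = {
    "bos_token": "<s>",
    "cls_token": "<s>",
    "eos_token": "</s>",
    "sep_token": "</s>",
    "pad_token": "<pad>",
    "mask_token": "<mask>",
    "unk_token": "[UNK]",
}


def role_ids(added_tokens, roles):
    """Resolve each role to the id of the token string it names."""
    token_to_id = {content: idx for idx, content in added_tokens.items()}
    return {role: token_to_id[content] for role, content in roles.items()}


ids = role_ids(ADDED_TOKENS, ROLES)
print(ids)
```

Two details stand out: the unknown-token role points at `[UNK]` (id 104) rather than `<unk>` (id 3), and `<mask>` sits at id 30526, the last slot of the 30,527-entry vocabulary declared in `config.json`.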
vocab.txt ADDED
The diff for this file is too large to render. See raw diff