SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-mpnet-base-v2
- Maximum Sequence Length: 128 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
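The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence embedding is the average of the token embeddings. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name mean_pool and the toy 2-dimensional vectors are illustrative assumptions (the real model works with 768-dimensional vectors).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (hypothetical helper)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padding vector is ignored because its mask entry is 0, which is why the large values in the third row do not affect the result.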
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("knguyennguyen/mpnet_laptopjacke")
# Run inference
sentences = [
"girls' winter jacket with a waterproof outer layer, a warm inner fleece, and adjustable features for comfort and fit.. girls' winter jacket with a waterproof outer layer, a warm inner fleece, and adjustable features for comfort and fit.",
"Title: Columbia Girls' Bugaboo Ii Fleece Interchange Jacket Descripion: ['Finding an all-inclusive girls winter jacket that can be used in different weather conditions can be a challenge. Fortunately, our Bugaboo II Fleece Interchange Winter Jacket is the perfect multiple-use all-weather coat – utilizing our classic three-in-one design. It features a waterproof and breathable outer shell and an inner fleece layer that can be worn separately or zipped together for extra protection against wet and cold weather. The waterproof outer shell features an inner layer of our thermal heat reflective secret sauce we call Omni-HEAT. The engineered silver dots are designed to capture natural body heat, reducing weight while increasing comfort. The warm inner fleece jacket can be worn separately or zipped into the waterproof and breathable outer layer. This three-in-one jacket system features zippered hand pockets and adjustable cuffs, a taffeta lined removable storm hood, media and goggle pocket, fleece lined zippered hand pockets, and adjustable cuffs for added warmth control in evolving weather conditions. Available in a range of colors and youth sizes.']",
'Title: ANOKA Yellowstone Jacket for Women Green XXL Descripion: [\'size details\'\n "Size:S----US:6----Bust:106cm/41.73\'\'----Sleeve:69cm/27.17\'\'----Length:71cm/27.95\'\' Size:M----US:8----Bust:110cm/43.31\'\'----Sleeve:70cm/27.56\'\'----Length:72cm/28.35\'\' Size:L----US:10----Bust:116cm/45.67\'\'----Sleeve:71cm/27.95\'\'----Length:73cm/28.74\'\' Size:XL----US:12----Bust:122cm/48.03\'\'----Sleeve:72cm/28.35\'\'----Length:74cm/29.13\'\' Size:XXL----US:14----Bust:128cm/50.39\'\'----Sleeve:73cm/28.74\'\'----Length:75cm/29.53\'\' Size:XXXL----US:16----Bust:134cm/52.76\'\'----Sleeve:74cm/29.13\'\'----Length:76cm/29.92\'\' Size:XXXXL----US:18----Bust:140cm/55.12\'\'----Sleeve:75cm/29.53\'\'----Length:77cm/30.31\'\' Size:XXXXXL----US:20----Bust:146cm/57.48\'\'----Sleeve:76cm/29.92\'\'----Length:78cm/30.71\'\'"]',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
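Because the similarity function is cosine similarity, a product search amounts to ranking product embeddings by their cosine with the query embedding. The sketch below shows that ranking step in plain Python on toy 2-dimensional vectors; the helper names cosine and rank_by_cosine are hypothetical, and the vectors are assumed to be non-zero.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_cosine(query: Sequence[float], docs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    scores = [(index, cosine(query, doc)) for index, doc in enumerate(docs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for a query embedding and three product embeddings.
query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
ranking = rank_by_cosine(query_vec, doc_vecs)
print([index for index, _ in ranking])  # [1, 2, 0]
```

With the real model, the query and product texts would each be passed through model.encode, and model.similarity already computes the same cosine scores in one call.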
Training Details
Training Dataset
Unnamed Dataset
- Size: 15,123 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 1000 samples:
column | type | min | mean | max |
---|---|---|---|---|
sentence_0 | string | 5 tokens | 27.46 tokens | 101 tokens |
sentence_1 | string | 30 tokens | 110.08 tokens | 128 tokens |
- Samples:
sentence_0 | sentence_1 |
---|---|
boys' winter jacket with a water-resistant exterior, thermal insulation, and an adjustable hood. | Title: Columbia Boys' Powder Lite Hooded Winter Jacket Descripion: ["Our Powder Lite cold-weather jacket combines a classic fit with technology to keep boys warm and dry. Crafted of a water resistant shell, lined with Omni-HEAT reflective system, and packed with our Thermarator insulation, this coat will keep kids warm and comfortable when the weather turns blustery and cold… ready to help them take winter by storm. \u2028\u2028Complete with hood and a soft chin guard, zipped hand pockets to keep items secure, while a draw cord adjustable hem keeps the cold locked out, making for the perfect fit for active youths. This boy's winter jacket is available in many accommodating colors and boy sizes. To ensure the size you choose is right, utilize our sizing chart and the following measurement instructions: For the sleeves, start at the center back of your neck and measure across the shoulder and down to the sleeve. If you come up with a partial number, round up to the next even number. For the chest, measure at the fullest part of the chest, under the armpits and over the shoulder blades, keeping the tape measure firm and level.\u2028 Imported. \u2028Made from 100% polyester. \u2028Zippered closure. \u2028Machine Wash."] |
laptop with a large display, efficient processor, ample memory, and multiple connectivity options. | Title: ASUS New VivoBook 15 15.6 Inch FHD 1080P Laptop (AMD Ryzen 3 3250U up to 3.5GHz, 12GB DDR4 RAM, 256GB SSD, AMD Radeon Vega 3, WiFi, Bluetooth, HDMI, Windows 10) (Grey) Descripion: ['XM sells computers with upgraded configurations. If the computer has modifications (listed above), then the manufacturer box is opened for it to be tested and inspected and to install the upgrades to achieve the specifications as advertised. Operating System: Windows 10 Home 64-bitDisplay: 15.6 inch FHD(1920 x 1080) with four-sided wider NanoEdge bezel displayProcessor: AMD Ryzen 3 3250U Processor (2.6 GHz base frequency up to 3.5 GHz, 2 Cores, 1MB Cache)Memory: Up to 16GB DDR4 RAMHard Drive: Up to 1TB SSDGraphics: AMD Radeon Vega 3Wireless: 802.11ac, Bluetooth 4.1Webcam: YESAudio features: Stereo speakersPorts: 1 x COMBO audio jack1 x Type-A USB 3.0 (USB 3.1 Gen 1)1 x Type-C USB 3.0 (USB 3.1 Gen 1)2 x USB 2.0 port(s)1 x HDMIBattery Type: 2 -Cell 37 Wh BatteryWeight: 3.75lbsDimensions: 14.4 x 9.1 x 0.8 inchesColor: Gray'] |
men's parka with a hood, featuring a waterproof design, secure storage options, and insulation for warmth.. men's parka with a hood, featuring a waterproof design, secure storage options, and insulation for warmth. | Title: Under Armour Men's Unstoppable Waterproof Hooded Down Parka Project Rock Long STORM Jacket Descripion: ['Under Armour Project Rock Hooded Down Parka Jacket Black 1346093-001 Storm technology: breathable and waterproof, Secure pockets. Zip/snap closure. Down fill.'] |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
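To illustrate what the loss parameters mean, here is a minimal pure-Python sketch of a multiple-negatives ranking objective with in-batch negatives: each sentence_0 is scored against every sentence_1 in the batch using cosine similarity multiplied by the scale (20.0), and cross-entropy rewards the matching pair. This is a simplified illustration under those assumptions, not the library's code; the function names and toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors: List[Sequence[float]], positives: List[Sequence[float]], scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        # Scaled similarities to every positive in the batch; the others act as negatives.
        logits = [scale * cos_sim(anchor, positive) for positive in positives]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(logit - top) for logit in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the other row's positive
print(mnr_loss(anchors, aligned) < mnr_loss(anchors, swapped))  # True
```

The loss is close to zero when each query is most similar to its own product text and grows when another product in the batch scores higher, which is why larger batches (128 here) supply more negatives per step.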
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- num_train_epochs: 5
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
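The combination of learning_rate 5e-05, lr_scheduler_type linear, and warmup_steps 0 implies a learning rate that decays in a straight line from 5e-05 to zero over the run. The sketch below reproduces that shape in plain Python; the function name linear_lr is hypothetical, and the total of 595 optimizer steps is an assumption derived from 15,123 samples, batch size 128, 5 epochs, and training on a single device.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05, warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero (hypothetical helper)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed total: 5 epochs * ceil(15123 / 128) = 5 * 119 = 595 steps on one device.
total_steps = 595
for step in (0, 500, 595):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the rate at step 500 is roughly 8e-06, about 16% of the starting value.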
Training Logs
Epoch | Step | Training Loss |
---|---|---|
4.2017 | 500 | 1.9644 |
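The fractional epoch in the log row can be checked against the dataset size and batch size reported above. This short calculation assumes a single training device, with gradient_accumulation_steps 1 and dataloader_drop_last False as listed in the hyperparameters.

```python
import math

num_samples = 15_123   # training samples
batch_size = 128       # per_device_train_batch_size
num_epochs = 5         # num_train_epochs

# The last, partial batch is kept because dataloader_drop_last is False.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
epoch_at_step_500 = 500 / steps_per_epoch

print(steps_per_epoch, total_steps, round(epoch_at_step_500, 4))  # 119 595 4.2017
```

The result matches the logged epoch of 4.2017 at step 500, which is consistent with the single-device assumption.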
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.1.1
- Transformers: 4.45.2
- PyTorch: 2.5.1+cu121
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.20.3
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}