SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
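
The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence vector is the average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name mean_pool and the toy 3-dimensional vectors are illustrative assumptions (the real model produces 768-dimensional token vectors).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    kept = max(kept, 1)
    return [value / kept for value in summed]


# Toy example: three token positions, the last one is padding.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0, 4.0]
```

Padding positions are excluded through the attention mask, so short and long inputs yield vectors on the same scale.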

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("knguyennguyen/mpnet_laptop1k_enhanced")
# Run inference
sentences = [
    'laptop computer with a large display, integrated graphics, and a sleek design.. laptop computer with a large display, integrated graphics, and a sleek design.',
    'Title: HP 17 17.3" HD+ Laptop Computer, Intel Quad-Core i7-1165G7 up to 4.7GHz, 8GB DDR4 RAM, 512GB PCIe SSD, 802.11AC WiFi, Bluetooth 5.0, Natural Silver, Windows 11 Home, BROAG Extension Cable Descripion: [\'Microprocessor\'\n \'Intel Core i7-1165G7 (up to 4.7 GHz with Intel Turbo Boost Technology, 12 MB L3 cache, 4 cores, 8 threads)\'\n \'Chipset\' \'Intel Integrated SoC\' \'Memory, standard\'\n \'8GB DDR4-3200 MHz RAM\' \'Video graphics\' \'Intel Iris Xᵉ Graphics\'\n \'Hard drive\' \'512GB PCIe NVMe M.2 SSD\' \'Optical drive\'\n \'Optical drive not included\' \'Display\'\n \'17.3" diagonal, HD+ (1600 x 900), BrightView, 250 nits\'\n \'Wireless connectivity\' \'Realtek Wi-Fi 5 (2x2) and Bluetooth 5 Combo\'\n \'Expansion slots\' \'1 multi-format SD media card reader\' \'External ports\'\n \'1 SuperSpeed USB Type-C 5Gbps signaling rate; 2 SuperSpeed USB Type-A 5Gbps signaling rate; 1 HDMI 1.4b; 1 AC smart pin; 1 headphone/microphone combo\'\n \'Minimum dimensions (W x D x H)\'\n \'15.78" x 10.15" x 0.78" (40.07 x 25.78 x 2.06 cm)\' \'Weight\' \'5.25\'\n \'Power supply type\' \'45 W Smart AC power adapter\' \'Battery type\'\n \'3-cell, 41 Wh Li-ion\' \'Webcam\'\n \'HP True Vision 720p HD camera with integrated dual array digital microphones\'\n \'Audio\' \'Dual speakers\' \'Keyboard\'\n \'Full-size island-style natural silver keyboard with numeric keypad\'\n \'Operating system\' \'Windows 11 Home\']',
    'Title: HP 17 Flagship Laptop Computer 17.3" FHD IPS Anti-Glare Display 11th Gen Intel 4-Core i5-1135G7 (Beats i7-10510U) 16GB RAM 256GB SSD Intel Iris Xe Graphics Webcam Win10 Pro Silver + HDMI Cable Descripion: [\'OVERVIEW:\'\n \'Responsive and reliable performance: Surf, stream, and do so much more with a powerful Intel Core processor. Plus, extensive quality testing ensures that your laptop keeps going and going.\'\n \'Product Details:\' \'Microprocessor:\'\n \'11th Gen Intel 4-Core i5-1135G7 (Max Boost Clock Up to 4.2GHz, 8MB Smart Cache, 8 Threads)\'\n \'Memory:\' \'16GB RAM\' \'Storage:\' \'256GB SSD\' \'Operating system:\'\n \'Microsoft Windows 10 Professional\' \'Graphics & Video:\'\n \'17.3" FHD (1920 x 1080) IPS Anti-Glare Display Integrated Intel Iris Xe Graphics\'\n \'Key Features:\'\n \'Bluetooth: Yes Optical Drive: No Webcam: Yes Backlit Keyboard: No Fingerprint Reader: No HD Audio: Yes Multi-format SD media card reader: Yes\'\n \'Ports :\'\n \'1x HDMI, 2x USB-A 3.0, 1x USB-A 2.0, 1x RJ-45 Ethernet, 1x Headphone/microphone combo, 1x Multi-format SD media card reader\'\n \'Battery:\' \'Up to 6.5 hours battery life\' \'Additional Information:\'\n \'Dimension: 16.33" x 10.72" x 0.96" Weight: 5.25 lbs\' \'Accessory:\'\n \'HDMI Cable\']',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
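
For semantic search, one row of the similarity matrix (the query against each product description) can be sorted to rank the candidates. The sketch below shows that ranking step in pure Python under stated assumptions: the 2-dimensional vectors are hypothetical stand-ins for the 768-dimensional output of model.encode, and the helper names cosine and rank are illustrative, not part of the library.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec: List[float], doc_vecs: List[List[float]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors standing in for real embeddings.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
ranking = rank(query, docs)
print([index for index, _ in ranking])  # [2, 1, 0]
```

Because cosine similarity ignores vector length, the document [2.0, 0.0] scores the same as a unit vector pointing in the query's direction.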

Training Details

Training Dataset

Unnamed Dataset

  • Size: 3,726 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 5 tokens, mean 26.11 tokens, max 128 tokens
    • sentence_1 (string): min 51 tokens, mean 124.97 tokens, max 128 tokens
  • Samples:
    Each sample lists sentence_0 first, followed by sentence_1 (which begins with "Title:").
    gaming notebook featuring a large display, advanced cooling system, and a sleek design for an immersive experience.. gaming notebook featuring a large display, advanced cooling system, and a sleek design for an immersive experience. Title: HP Pavilion 15 15.6" Gaming Laptop Intel Core i5 8GB RAM 256GB SSD GTX 1050 3GB - 9th Gen i5-9300H Quad-core - NVIDIA GeForce GTX 1050 3GB GDDR5 - Intel UHD Graphics 630 - in-Plane Switching (IPS Descripion: ['Sacrifice nothing with the thin and powerful HP Pavilion Gaming Laptop. Experience high-grade graphics and processing power for gaming and multitasking, plus improved thermal cooling for overall performance and stability. Immerse yourself in the game with a narrow bezel display and custom-tuned audio. The perfect balance between work and play, so you can do it all. Windows 10 Home or other operating systems available Power your day. Power your play. Take on anything and everything with a 9th generation Intel® Core™ processor and NVIDIA® GeForce® graphics. A high resolution display with fast refresh rate delivers smooth gameplay visuals, while also bringing entertainment and content to life. Game harder for longer The HP Pavilion Gaming Laptop is equipped with a dual fan system for enhanced thermal cooling. Wide rear corner vents and additional air inlets maximize airflow to optimize your overall performance and stability, keeping the machine cool during extended usage. Bold, immersive design Get lost in the game. A sleek micro-edge bezel display provides a maximum viewing experience while the front-firing speakers with AUDIO by Bang & Olufsen deliver powerful, custom-tuned sound. Manufacturer: HP Inc. Manufacturer Part Number: 7MP87UA#ABA. Brand Name: HP. Product Line: Pavilion Gaming. Product Series: 15-dk0000. Product Model: 15-dk0068wm. Product Name: Gaming Pavilion - 15-dk0068wm. Product Type: Gaming Notebook. [Processor & Chipset] Processor Manufacturer: Intel. Processor Type: Core i5. 
Processor Generation: 9th Gen. Processor Model: i5-9300H. Processor Speed: 2.40 GHz. Maximum Turbo Speed: 4.10 GHz. Processor Core: Quad-core (4 Core). [Memory] Standard Memory: 8 GB. Memory Technology: DDR4 SDRAM. Number of Occupied Memory Slots: 1. [Storage] Drive Type: SSD. Total Solid State Drive Capacity: 256 GB. Optical Drive Ty']
    gaming laptop featuring a large display, high-performance processor, ample memory, and multiple connectivity options. Title: GIGABYTE A7 X1-17.3" FHD IPS Anti-Glare 144Hz - AMD Ryzen 9 5900HX - NVIDIA GeForce RTX 3070 Laptop GPU 8 GB GDDR6-16 GB Memory - 512 GB PCIe SSD - Windows 10 Home-Gaming Laptop (A7 X1-CUS1130SH) Descripion: ['NVIDIA GeForce RTX 30 Series Laptop GPUsBoost Clock 1560 MHz, Maximum Graphics Power 140 W ▪ AMD Ryzen 9 5900HX Mobile Processor ▪ 16 GB RAM, 512 GB PCIe SSD ▪ 17.3" Thin Bezel FHD 1920x1080 IPS-level ▪ Support 3 Slots of Storage System ▪ Intel Wi-Fi 6 AX200 Wireless Network Card ▪ LAN: RTL8125-BG REALTEK (2.5G) Ethernet ▪ All-New GAMING CENTER Software ▪ NAHIMIC 3D Audio for Gamers, Windows 10 Home ▪ All-zone of Single Colored Backlit Keyboard with 15 Colors LED Setting ▪ 1 \u200ex USB 2.0 (Type-A), 1 x USB 3.2 Gen1 (Type-A), 1 x USB 3.2 Gen2 (Type-A) ▪ 1 x HDMI 2.0 (with HDCP), Audio Combo Jack, 1x DC-in Jack ▪ 1x mini DP 1.4, 1x DisplayPort 1.4 (Type-C) over USB 3.2 Gen 2']
    14-inch laptop with a durable design, integrated graphics, and multiple connectivity options.. 14-inch laptop with a durable design, integrated graphics, and multiple connectivity options. Title: Dell Latitude E6410 ATG 14 Inch Laptop PC, Intel Core i5-520M up to 2.93GHz, 4G DDR3, 320G, DVD, WiFi, VGA, DP, Windows 10 Pro 64 Bit Multi-Language Support English/French/Spanish(Renewed) Descripion: ['The ATG should satisfy users who are after a well-performing, semi-rugged notebook, but more demanding users who are looking for a unit to use in a very dusty area might want something that has dust-protected ports and slots as well.'
    'Specifications:'
    'Processor: Intel Core i5-520M up to 2.93GHz Graphics: Intel HD Integrated Graphics Memory: 4GB DDR3 Hard Drive: 320G'
    'Operating System:' 'Windows 10 Pro 64 Bit Multi-Language.' 'Ports:'
    '3 x USB 2.0, VGA, LAN, Headphone output, Microphone input, 4-pin FireWire, DisplayPort, Dock, eSATA.'
    'Warranty' '1 full year Parts and Labor Warranty' 'Included in the box'
    'Computer, Power Supply, Warranty Instruction.']
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
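
As a rough illustration of what this loss computes: within a batch, each sentence_0 is scored against every sentence_1 using cosine similarity multiplied by the scale (20.0), and a cross-entropy term pushes the matching pair's score above the in-batch negatives. The pure-Python sketch below follows that description; it is not the library's code, and the function name mnr_loss and the toy vectors are illustrative assumptions.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mnr_loss(anchors: List[List[float]], positives: List[List[float]], scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(value - top) for value in logits))
        total += log_z - logits[i]
    return total / len(anchors)


# Toy batch of two pairs: correctly aligned vs. deliberately swapped positives.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
print(mnr_loss(anchors, aligned) < mnr_loss(anchors, swapped))  # True
```

With a batch size of 128, each sentence_0 sees up to 127 in-batch negatives, which is why larger batches tend to give this loss a stronger training signal.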
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • num_train_epochs: 5
  • multi_dataset_batch_sampler: round_robin
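
These settings imply a fairly short training run. The arithmetic below estimates the number of optimizer steps from the 3,726 training samples reported above, assuming a single device, gradient_accumulation_steps of 1, and a final partial batch that is kept rather than dropped; the variable names are illustrative.

```python
import math

train_samples = 3726  # size reported in the Training Dataset section
batch_size = 128      # per_device_train_batch_size (single device assumed)
epochs = 5            # num_train_epochs

# The last, smaller batch still counts as a step when it is not dropped.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 30 150
```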

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
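
With lr_scheduler_type set to linear, warmup_steps at 0, and learning_rate at 5e-05, the learning rate starts at its peak and decays linearly to zero over the run. The sketch below reproduces that shape in pure Python; it is an approximation of the schedule, not the Transformers implementation, and the total of 150 steps is an assumption (ceil(3,726 / 128) steps per epoch times 5 epochs on a single device).

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


total_steps = 150  # assumed total number of optimizer steps
print(linear_lr(0, total_steps))    # 5e-05
print(linear_lr(75, total_steps))   # 2.5e-05
print(linear_lr(150, total_steps))  # 0.0
```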

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.2
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 109M params (Safetensors, tensor type F32)