MPNet base trained on AllNLI triplets
This is a sentence-transformers model finetuned from microsoft/mpnet-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: microsoft/mpnet-base
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
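The Pooling module above has pooling_mode_mean_tokens set to True, so the sentence embedding is the average of the token embeddings. A minimal pure-Python sketch of that idea follows, assuming padding positions are excluded through an attention mask. The helper name mean_pool and the toy vectors are illustrative and not part of the library.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy 2-dimensional token vectors (the real model uses 768 dimensions).
# The last position is padding and must not affect the result.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

In the real model the same averaging runs over 768-dimensional MPNet token outputs, which is why the output dimensionality matches word_embedding_dimension.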
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Jrinky/mpnet-base-all-nli-triplet")
# Run inference
sentences = [
'What challenges do university researchers face when trying to turn their discoveries into commercial products',
'Universities are vital to the process of innovation and advancement: they educate students who bring new ways of thinking to old problems, and they make new discoveries that no one else would make because no one else has the opportunity to delve so deeply. In creating this type of refuge, we also create a comfort zone. Because governmental support for science and technology is designed to support long-term, high-risk work regardless of immediate return, ROI is not a factor in getting government funding. University researchers become successful at pitching research ideas without serious reference to commercial outcome. Peer review – which is critical for the success of science – further reinforces this tendency. University researchers are rewarded for thinking in this very specific way, and this creates the comfort zone. As it dawns on a researcher that they may need to work with a company or an entrepreneur to see their discoveries become products or services that can benefit society, they may find themselves a victim of their own past success. Many researchers reflexively approach companies as if they are yet another type of funding agency, but since companies are not in the grant-making business, a partnership fails to materialize. This basic failure to communicate means valuable commercial opportunities are often not recognized, or when they are, the resulting partnership does not go well.',
'A major shakeup has taken place at the top of the Boston Celtics. Danny Ainge has stepped down as president of basketball operations, and head coach Brad Stevens has stepped into the role. Stevens will now lead the search for a new coach. The team made the announcement early Wednesday, one day after the Celtics were eliminated by the Brooklyn Nets in the first round of the Eastern Conference playoffs. “Helping guide this organization has been the thrill of a lifetime, and having worked side-by-side with him since he’s been here, I know we couldn’t be in better hands than with Brad guiding the team going forward,” Ainge said in a statement. “I’m grateful to ownership, all of my Celtics colleagues, and the best fans in basketball for being part of the journey.”\nAinge, 62, is a franchise legend.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
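Because the model's similarity function is cosine similarity, a semantic search step reduces to scoring each passage embedding against the query embedding and sorting. The sketch below shows that ranking in pure Python on small made-up vectors that stand in for the output of model.encode. The function names and the vectors are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_embedding, passage_embeddings):
    """Return passage indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_embedding, p) for p in passage_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical 3-dimensional embeddings; real ones have 768 dimensions.
query = [1.0, 0.0, 1.0]
passages = [
    [0.0, 1.0, 0.0],  # orthogonal to the query, so unrelated
    [2.0, 0.1, 1.9],  # nearly parallel to the query, so closely related
]
print(rank_passages(query, passages))
```

With real embeddings, the first sentence in the example above would play the role of the query and the two longer passages would be the candidates.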
Training Details
Training Dataset
Unnamed Dataset
- Size: 6,433 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
Statistic | anchor | positive |
---|---|---|
type | string | string |
min | 6 tokens | 5 tokens |
mean | 16.21 tokens | 140.69 tokens |
max | 42 tokens | 512 tokens |
- Samples:
anchor | positive |
---|---|
What type of event is being described by Pierre LeBrun in relation to the NHL | ESPN’s Pierre LeBrun said, “It's not just about one NHL game anymore. It's a week-long event. |
Who designed the property's landscape and when was the building listed on the National Register of Historic Places | The property's landscape continues a circular theme, with flower beds, fencing, and parking arranged in concentric patterns around the structure. It was designed by the Washington, DC firm of Deigert & Yerkes. The building was listed on the National Register of Historic Places in 2017. |
Is 'ladens' a valid word to use in Scrabble and other word games | Scrabble?! LADENSIs ladens valid for Scrabble? Words With Friends? Lexulous? WordFeud? Other games |
- Loss: selfloss.Infonce with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
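The implementation of selfloss.Infonce is not included in this card. Given the parameters (scale 20.0, cosine similarity) and the cited in-batch negatives paper, a plausible form is cross-entropy over scaled cosine similarities, where each anchor's own positive is the target and the other positives in the batch act as negatives. The following pure-Python sketch works under that assumption. The function names are hypothetical and this is not the author's code.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def info_nce(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        # Similarity of this anchor to every positive in the batch.
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two pairs: correctly aligned, then deliberately swapped.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

Under this formulation a larger scale sharpens the softmax, so mismatched pairs are penalised more heavily. It also explains why a batch sampler that avoids duplicates matters: a duplicated positive would be treated as a negative for its own anchor.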
Evaluation Dataset
Unnamed Dataset
- Size: 804 evaluation samples
- Columns: anchor and positive
- Approximate statistics based on the first 804 samples:
Statistic | anchor | positive |
---|---|---|
type | string | string |
min | 7 tokens | 8 tokens |
mean | 16.44 tokens | 149.21 tokens |
max | 38 tokens | 512 tokens |
- Samples:
anchor | positive |
---|---|
What types of special events can the salon services be booked for | Our fabulous salon services are available at your special event! Whether it's a wedding, photo shoot, prom, or just a fun girls' night in- we do it all. |
What material is the Hudson Baby plush hooded robe made of | Dimensions (Overall): 10 inches (L), 10 inches (H) x 1 inches (W) Weight: 1 pounds Textile Material: 100% Polyester • Animal face plush hooded bath robe. • Made with 100% plush coral fleece fabric • Soft and gentle on baby's skin • Optimal for everyday use • Affordable, high quality bath robe Hudson Baby plush hooded robe is made of super soft, cozy plush material to dry and warm baby after bath or pool time. |
Where is this uncommon species thought to occur | It is also thought to occur in New Zealand. It is an uncommon species, growing in "heathy woodland [in] semi shade". |
- Loss: selfloss.Infonce with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- learning_rate: 2e-05
- num_train_epochs: 6
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates
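To make the interaction between learning_rate, num_train_epochs, and warmup_ratio concrete, the sketch below computes a linear warmup followed by linear decay, which is the shape that the linear scheduler type produces. It assumes 101 optimizer steps per epoch, a figure inferred from the training log entry that places step 100 at epoch 0.9901 and not stated anywhere in this card. The function name is illustrative.

```python
import math


def linear_schedule_with_warmup(step, total_steps, warmup_steps, peak_lr):
    """Learning rate at a given optimizer step: linear ramp up, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return peak_lr * remaining


STEPS_PER_EPOCH = 101  # assumption inferred from the training log, not a documented value
NUM_EPOCHS = 6         # num_train_epochs
WARMUP_RATIO = 0.1     # warmup_ratio
PEAK_LR = 2e-05        # learning_rate

total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
# Round up, so a fractional warmup step still counts as a full step.
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_schedule_with_warmup(step, total_steps, warmup_steps, PEAK_LR))
```

Under these assumptions the learning rate reaches its peak of 2e-05 after roughly the first tenth of training and falls back to zero at the final step.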
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 6
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.9901 | 100 | 1.4311 | 0.2171 |
1.9802 | 200 | 0.237 | 0.1718 |
2.9703 | 300 | 0.1466 | 0.1561 |
3.9604 | 400 | 0.1084 | 0.1541 |
4.9505 | 500 | 0.0879 | 0.1528 |
5.9406 | 600 | 0.0794 | 0.1514 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.4.0
- Transformers: 4.48.1
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Infonce
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Model tree for Jrinky/model1
Base model
microsoft/mpnet-base