---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:473546
- loss:MultipleNegativesRankingLoss
base_model: NeuML/pubmedbert-base-embeddings
widget:
- source_sentence: Lantus
  sentences:
  - Corrosion of first degree of unspecified hand, unspecified site, subsequent encounter
  - Anencephaly
  - Type 2 diabetes mellitus with diabetic peripheral angiopathy without gangrene
  - Type 2 diabetes mellitus with diabetic cataract
  - Type 2 diabetes mellitus with diabetic autonomic (poly)neuropathy
  - Crushed by nonvenomous snake, initial encounter
  - Type 2 diabetes mellitus with diabetic mononeuropathy
  - Type 2 diabetes mellitus with diabetic chronic kidney disease
  - Encounter for attention to other artificial openings
  - Fracture of base of skull, left side, subsequent encounter for fracture with delayed healing
  - Type 2 diabetes mellitus with ketoacidosis without coma
- source_sentence: Follicular thyroid carcinoma
  sentences:
  - Unspecified fracture of lower end of unspecified ulna, subsequent encounter for open fracture type I or II with nonunion
  - Neoplasm of unspecified behavior of digestive system
  - Unspecified fracture of T9-T10 vertebra, subsequent encounter for fracture with nonunion
  - Other benign neuroendocrine tumors
  - Malignant poorly differentiated neuroendocrine tumors
  - Malignant neoplasm of pyriform sinus
  - Malignant neoplasm of trachea
  - Poisoning by iminostilbenes, assault, sequela
  - Stress fracture, unspecified foot, subsequent encounter for fracture with delayed healing
  - Adverse effect of other parasympathomimetics [cholinergics], initial encounter
  - Malignant neoplasm of thyroid gland
- source_sentence: Cardiac ischemia
  sentences:
  - Displaced fracture of middle phalanx of other finger, subsequent encounter for fracture with delayed healing
  - Unspecified displaced fracture of surgical neck of left humerus, subsequent encounter for fracture with routine healing
  - Nondisplaced Maisonneuve's fracture of left leg, subsequent encounter for open fracture type I or II with routine healing
  - Partial traumatic amputation at right shoulder joint, initial encounter
  - Toxic effect of unspecified noxious substance eaten as food, undetermined, initial encounter
  - Corrosion of third degree of left toe(s) (nail), sequela
  - Atherosclerotic heart disease of native coronary artery with unstable angina pectoris
  - Other specified injury of axillary artery, left side, sequela
  - Hemiplegia and hemiparesis following nontraumatic subarachnoid hemorrhage affecting left dominant side
  - Displaced transverse fracture of shaft of unspecified femur
  - Partial traumatic transmetacarpal amputation of unspecified hand, sequela
- source_sentence: Intrauterine fetal death
  sentences:
  - Dislocation of other parts of lumbar spine and pelvis, sequela
  - Poisoning by cardiac-stimulant glycosides and drugs of similar action, intentional self-harm, sequela
  - Insect bite (nonvenomous) of unspecified finger, sequela
  - Other diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism complicating the puerperium
  - Other specified diseases and conditions complicating pregnancy, childbirth and the puerperium
  - Diseases of the respiratory system complicating childbirth
  - Diseases of the circulatory system complicating childbirth
  - Diseases of the skin and subcutaneous tissue complicating childbirth
  - War operations involving other forms of conventional warfare, civilian, sequela
  - External constriction of vagina and vulva
  - Anemia complicating childbirth
- source_sentence: CAD
  sentences:
  - Dislocation of C6/C7 cervical vertebrae, subsequent encounter
  - Unspecified injury of extensor muscle, fascia and tendon of left little finger at wrist and hand level, subsequent encounter
  - Other fracture of lower end of left tibia, subsequent encounter for closed fracture with malunion
  - Other fracture of upper end of unspecified radius, subsequent encounter for closed fracture with delayed healing
  - Poisoning by monoamine-oxidase-inhibitor antidepressants, undetermined, subsequent encounter
  - Atherosclerotic heart disease of native coronary artery with unspecified angina pectoris
  - Sprain of anterior cruciate ligament of right knee, initial encounter
  - Myopia, bilateral
  - Velamentous insertion of umbilical cord, first trimester
  - Iliofemoral ligament sprain of left hip, subsequent encounter
  - Air embolism (traumatic), sequela
datasets:
- FrancescoBuda/mimic10-hard-negatives
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on NeuML/pubmedbert-base-embeddings

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [NeuML/pubmedbert-base-embeddings](https://huggingface.co/NeuML/pubmedbert-base-embeddings) on the [mimic10-hard-negatives](https://huggingface.co/datasets/FrancescoBuda/mimic10-hard-negatives) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
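
As a rough illustration of the semantic search use case, the sketch below ranks candidate ICD-10 descriptions against a query by cosine similarity. The 3-dimensional vectors are hypothetical stand-ins for the model's 768-dimensional embeddings, and the helper names `cosine_similarity` and `rank_candidates` are made up for this example; they are not part of the Sentence Transformers API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidates):
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [
        (label, cosine_similarity(query_vec, vec))
        for label, vec in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real ones would come from model.encode(...).
query = [0.9, 0.1, 0.0]  # stands in for the embedding of "CAD"
candidates = {
    "Atherosclerotic heart disease of native coronary artery "
    "with unspecified angina pectoris": [0.8, 0.2, 0.1],
    "Myopia, bilateral": [0.0, 0.1, 0.9],
}

ranking = rank_candidates(query, candidates)
for label, score in ranking:
    print(f"{score:.3f}  {label}")
```

With real embeddings the same ranking step applies unchanged; only the vectors differ.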

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [NeuML/pubmedbert-base-embeddings](https://huggingface.co/NeuML/pubmedbert-base-embeddings)
- **Maximum Sequence Length:** 64 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - [mimic10-hard-negatives](https://huggingface.co/datasets/FrancescoBuda/mimic10-hard-negatives)

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 64, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("alecocc/icd10-hard-negatives")
# Run inference
sentences = [
    'CAD',
    'Atherosclerotic heart disease of native coronary artery with unspecified angina pectoris',
    'Myopia, bilateral',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Training Details

### Training Dataset

#### mimic10-hard-negatives

* Dataset: [mimic10-hard-negatives](https://huggingface.co/datasets/FrancescoBuda/mimic10-hard-negatives) at [ef88fe5](https://huggingface.co/datasets/FrancescoBuda/mimic10-hard-negatives/tree/ef88fe5f449aad48f89f31523c8731e0474d42c1)
* Size: 473,546 training samples
* Columns: anchor, positive, negative_1, negative_2, negative_3, negative_4, negative_5, negative_6, negative_7, negative_8, negative_9, and negative_10
* Approximate statistics based on the first 1000 samples:

  | | anchor | positive | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 | negative_6 | negative_7 | negative_8 | negative_9 | negative_10 |
  |:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
  | type | string | string | string | string | string | string | string | string | string | string | string | string |

* Samples:

  | anchor | positive | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 | negative_6 | negative_7 | negative_8 | negative_9 | negative_10 |
  |:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
  | Anterior exenteration | Malignant neoplasm of bladder neck | Malignant neoplasm of unspecified kidney, except renal pelvis | Malignant neoplasm of unspecified renal pelvis | Malignant neoplasm of left ureter | Malignant neoplasm of paraurethral glands | Malignant neoplasm of left renal pelvis | Unspecified kyphosis, cervical region | Unspecified superficial injuries of left back wall of thorax, initial encounter | Dome fracture of acetabulum | Other fracture of left great toe, initial encounter for open fracture | Unspecified fracture of upper end of unspecified radius, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with malunion |
  | Atorvastatin | Hyperlipidemia, unspecified | Other lactose intolerance | Lipomatosis, not elsewhere classified | Mucopolysaccharidosis, type II | Hyperuricemia without signs of inflammatory arthritis and tophaceous disease | Volume depletion, unspecified | Glaucoma secondary to other eye disorders, unspecified eye | Fracture of one rib, left side, subsequent encounter for fracture with routine healing | Toxic effect of other tobacco and nicotine, accidental (unintentional), sequela | Puncture wound without foreign body of left ring finger with damage to nail | Nondisplaced fracture of epiphysis (separation) (upper) of unspecified femur, subsequent encounter for open fracture type IIIA, IIIB, or IIIC with nonunion |
  | Urostomy | Malignant neoplasm of bladder neck | Malignant neoplasm of urinary organ, unspecified | Malignant neoplasm of overlapping sites of urinary organs | Malignant neoplasm of left ureter | Malignant neoplasm of urethra | Malignant neoplasm of left renal pelvis | Indeterminate leprosy | Poisoning by other viral vaccines, accidental (unintentional) | Fracture of unspecified metatarsal bone(s), right foot, initial encounter for open fracture | Sprain of tarsometatarsal ligament of unspecified foot, subsequent encounter | Burn of first degree of multiple sites of left ankle and foot, initial encounter |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 128
- `per_device_eval_batch_size`: 128
- `learning_rate`: 2e-05
- `num_train_epochs`: 2
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 128
- `per_device_eval_batch_size`: 128
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
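
The hyperparameters above imply a step budget and a learning-rate schedule. The sketch below works through that arithmetic. It assumes a single training device, which the card does not state, and reads `lr_scheduler_type`: linear as linear warmup followed by linear decay to zero. The helper name `linear_lr` is hypothetical and not part of any library.

```python
import math

# Values taken from this model card.
NUM_SAMPLES = 473_546   # training set size
BATCH_SIZE = 128        # per_device_train_batch_size
NUM_EPOCHS = 2          # num_train_epochs
PEAK_LR = 2e-5          # learning_rate
WARMUP_RATIO = 0.1      # warmup_ratio

# Assumption: one device, gradient_accumulation_steps = 1, and the last
# partial batch is kept (dataloader_drop_last = False), hence the ceil.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)
for step in (0, 370, warmup_steps, steps_per_epoch, total_steps):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions one epoch is 3,700 steps, which agrees with the training log: step 3700 is reported as epoch 1.0, and step 100 as epoch 100 / 3700 ≈ 0.0270.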

### Training Logs

| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.0270 | 100  | 4.1948        |
| 0.0541 | 200  | 3.5402        |
| 0.0811 | 300  | 3.2462        |
| 0.1081 | 400  | 2.9691        |
| 0.1351 | 500  | 2.788         |
| 0.1622 | 600  | 2.5922        |
| 0.1892 | 700  | 2.5648        |
| 0.2162 | 800  | 2.4821        |
| 0.2432 | 900  | 2.47          |
| 0.2703 | 1000 | 2.3774        |
| 0.2973 | 1100 | 2.3415        |
| 0.3243 | 1200 | 2.2428        |
| 0.3514 | 1300 | 2.2794        |
| 0.3784 | 1400 | 2.2372        |
| 0.4054 | 1500 | 2.2265        |
| 0.4324 | 1600 | 2.2186        |
| 0.4595 | 1700 | 2.2074        |
| 0.4865 | 1800 | 2.159         |
| 0.5135 | 1900 | 2.1903        |
| 0.5405 | 2000 | 2.1328        |
| 0.5676 | 2100 | 2.0685        |
| 0.5946 | 2200 | 2.1249        |
| 0.6216 | 2300 | 2.1321        |
| 0.6486 | 2400 | 2.0725        |
| 0.6757 | 2500 | 2.0913        |
| 0.7027 | 2600 | 2.0192        |
| 0.7297 | 2700 | 2.036         |
| 0.7568 | 2800 | 1.9863        |
| 0.7838 | 2900 | 2.0411        |
| 0.8108 | 3000 | 1.9796        |
| 0.8378 | 3100 | 2.0102        |
| 0.8649 | 3200 | 1.8652        |
| 0.8919 | 3300 | 1.0192        |
| 0.9189 | 3400 | 0.9623        |
| 0.9459 | 3500 | 0.957         |
| 0.9730 | 3600 | 0.8579        |
| 1.0    | 3700 | 0.7984        |
| 1.0270 | 3800 | 0.6359        |
| 1.0541 | 3900 | 0.7348        |
| 1.0811 | 4000 | 0.6356        |
| 1.1081 | 4100 | 0.6252        |
| 1.1351 | 4200 | 0.6587        |
| 1.1622 | 4300 | 0.602         |
| 1.1892 | 4400 | 0.6803        |
| 1.2162 | 4500 | 0.6204        |
| 1.2432 | 4600 | 0.667         |
| 1.2703 | 4700 | 0.6253        |
| 1.2973 | 4800 | 0.5375        |
| 1.3243 | 4900 | 0.6054        |
| 1.3514 | 5000 | 0.4541        |
| 1.3784 | 5100 | 0.5334        |
| 1.4054 | 5200 | 0.6075        |
| 1.4324 | 5300 | 0.5037        |
| 1.4595 | 5400 | 0.4825        |
| 1.4865 | 5500 | 0.5442        |
| 1.5135 | 5600 | 0.4999        |
| 1.5405 | 5700 | 0.6521        |
| 1.5676 | 5800 | 0.5769        |
| 1.5946 | 5900 | 0.5029        |
| 1.6216 | 6000 | 0.5787        |
| 1.6486 | 6100 | 0.5235        |
| 1.6757 | 6200 | 0.5839        |
| 1.7027 | 6300 | 0.5339        |
| 1.7297 | 6400 | 0.5339        |
| 1.7568 | 6500 | 0.4515        |
| 1.7838 | 6600 | 0.5648        |
| 1.8108 | 6700 | 0.4355        |
| 1.8378 | 6800 | 0.5321        |
| 1.8649 | 6900 | 0.4778        |
| 1.8919 | 7000 | 0.4884        |
| 1.9189 | 7100 | 0.5941        |
| 1.9459 | 7200 | 0.5489        |
| 1.9730 | 7300 | 0.444         |
| 2.0    | 7400 | 0.4964        |

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.2.1
- Transformers: 4.45.2
- PyTorch: 2.1.2+cu121
- Accelerate: 0.29.0.dev0
- Datasets: 2.18.0
- Tokenizers: 0.20.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```