---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1195425
- loss:MSELoss
base_model: mixedbread-ai/mxbai-embed-large-v1
widget:
- source_sentence: >-
At an outdoor event in an Asian-themed area, a crowd congregates as one
person in a yellow Chinese dragon costume confronts the camera.
sentences:
- Boy dressed in blue holds a toy.
- A man is smiling at his wife.
- Two young asian men are squatting.
- source_sentence: A man with a shopping cart is studying the shelves in a supermarket aisle.
sentences:
- the animal is running
- The children are watching TV at home.
- >-
Three young boys one is holding a camera and another is holding a green toy
all are wearing t-shirt and smiling.
- source_sentence: The door is open.
sentences:
- A girl is using an apple laptop with her headphones in her ears.
- >-
There are three men in this picture, two are on motorbikes, one of the men
has a large piece of furniture on the back of his bike, the other is about
to be handed a piece of paper by a man in a white shirt.
- >-
A large group of people are gathered outside of a brick building lit with
spotlights.
- source_sentence: >-
A small group of children are standing in a classroom and one of them has a
foot in a trashcan, which also has a rope leading out of it.
sentences:
- People are playing music.
- Children are swimming at the beach.
- Women are celebrating at a bar.
- source_sentence: >-
A black dog is drinking next to a brown and white dog that is looking at an
orange ball in the lake, whilst a horse and rider passes behind.
sentences:
- Some men with jerseys are in a bar, watching a soccer match.
- the guy is dead
- >-
There are two people running around a track in lane three and the one
wearing a blue shirt with a green thing over the eyes is just barely ahead
of the guy wearing an orange shirt and sunglasses.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- negative_mse
model-index:
- name: SentenceTransformer based on mixedbread-ai/mxbai-embed-large-v1
results:
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts dev
type: sts-dev
metrics:
- type: pearson_cosine
value: 0.8654028138219636
name: Pearson Cosine
- type: spearman_cosine
value: 0.8873087539713633
name: Spearman Cosine
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: Unknown
type: unknown
metrics:
- type: negative_mse
value: -3.3795181661844254
name: Negative Mse
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts test
type: sts-test
metrics:
- type: pearson_cosine
value: 0.834023412201456
name: Pearson Cosine
- type: spearman_cosine
value: 0.8723901159121923
name: Spearman Cosine
license: apache-2.0
language:
- en
---
# SentenceTransformer based on Model Distillation
This is a knowledge-distillation experiment for embedding models: I retained 8 layers from the teacher model, [mixedbread-ai/mxbai-embed-large-v1](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1), to create a lighter, faster version. The four panels of the figure show:
- Top left: how well this model's predictions match the ground-truth scores (Spearman correlation = 0.887).
- Top right: the correlation of this model compared with the reference model (mxbai-embed-large-v1); both bars are around 0.8-0.9.
- Bottom left: throughput. This model processes about 45 samples/s, while mxbai-embed-large-v1 processes about 30 samples/s.
- Bottom right: the small accuracy drop of this model relative to the reference model.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/650a93c23449d9a49c356aab/LkqDmk0wMOpmjihgJYw6G.png)
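The card does not say which 8 layers were retained. As a minimal sketch, assuming the teacher has 24 encoder layers (an assumption, not stated in this card) and that the kept layers are spread evenly over the stack, the indices could be chosen like this. `evenly_spaced_layers` is a hypothetical helper, not part of any library.

```python
def evenly_spaced_layers(num_teacher_layers: int, num_kept: int) -> list[int]:
    """Return indices of `num_kept` layers spread evenly over the teacher stack."""
    if not 0 < num_kept <= num_teacher_layers:
        raise ValueError("num_kept must be between 1 and num_teacher_layers")
    # Split the stack into equal groups and keep the last layer of each group,
    # so the final teacher layer is always retained.
    return [((i + 1) * num_teacher_layers) // num_kept - 1 for i in range(num_kept)]


# Assumption: 24 teacher layers; the card only states that 8 were retained.
layers_to_keep = evenly_spaced_layers(24, 8)
print(layers_to_keep)  # [2, 5, 8, 11, 14, 17, 20, 23]
```

With a BERT-style model loaded through Sentence Transformers, such indices would typically be applied by rebuilding `model[0].auto_model.encoder.layer` as a `torch.nn.ModuleList` of the kept layers and updating `config.num_hidden_layers` to match. Other selection schemes (for example, keeping the first 8 layers) are equally possible.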
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [mixedbread-ai/mxbai-embed-large-v1](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
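The pooling module above has only `pooling_mode_cls_token` enabled, so the sentence embedding is the vector of the first (`[CLS]`) token. The sketch below illustrates that idea on a toy batch in plain Python; `cls_pool` is a hypothetical helper and the 4-dimensional vectors stand in for the real 1024-dimensional ones.

```python
def cls_pool(token_embeddings):
    """Return the first-token vector of every sequence.

    `token_embeddings` is a nested list shaped [batch][seq_len][dim].
    """
    return [sequence[0] for sequence in token_embeddings]


# Toy batch: 2 sentences, 3 tokens each, 4 dimensions (illustrative values).
batch = [
    [[0.1, 0.2, 0.3, 0.4], [9.0, 9.0, 9.0, 9.0], [8.0, 8.0, 8.0, 8.0]],
    [[0.5, 0.6, 0.7, 0.8], [7.0, 7.0, 7.0, 7.0], [6.0, 6.0, 6.0, 6.0]],
]
sentence_embeddings = cls_pool(batch)
print(sentence_embeddings)  # one vector per sentence, taken from position 0
```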
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'A black dog is drinking next to a brown and white dog that is looking at an orange ball in the lake, whilst a horse and rider passes behind.',
'Some men with jerseys are in a bar, watching a soccer match.',
'There are two people running around a track in lane three and the one wearing a blue shirt with a green thing over the eyes is just barely ahead of the guy wearing an orange shirt and sunglasses.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
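Since the similarity function of this model is cosine similarity, the `[3, 3]` matrix above holds the cosine of the angle between every pair of embeddings. The following plain-Python sketch shows the computation on toy 2-dimensional vectors; the helper names are hypothetical and the library's own implementation is vectorized.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(embeddings):
    """Pairwise cosine similarities, shaped [n][n]."""
    return [[cosine_similarity(a, b) for b in embeddings] for a in embeddings]


# Toy embeddings (illustrative values, not model output).
toy_embeddings = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(value, 4) for value in row])
```

The diagonal is 1 because every vector is perfectly aligned with itself, and vector length does not affect the score.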
## Evaluation
### Metrics
#### Semantic Similarity
* Datasets: `sts-dev` and `sts-test`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
| Metric | sts-dev | sts-test |
|:--------------------|:-----------|:-----------|
| pearson_cosine | 0.8654 | 0.834 |
| **spearman_cosine** | **0.8873** | **0.8724** |
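The table reports a higher Spearman than Pearson correlation. A small sketch of the two statistics may help explain how that can happen: Spearman only compares rankings, so predictions that are ordered correctly but not linearly related to the gold scores still score highly. The data below is made up for illustration, and the helpers are hypothetical plain-Python versions, not the evaluator's code.

```python
import math


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up gold similarity scores and predicted cosine similarities.
gold = [0.0, 1.0, 2.5, 4.0, 5.0]
predicted = [0.10, 0.20, 0.45, 0.50, 0.99]
pearson_score = pearson(gold, predicted)
spearman_score = spearman(gold, predicted)
print(round(pearson_score, 4), round(spearman_score, 4))
```

Here the predictions are in exactly the right order, so Spearman is 1, while Pearson is lower because the relationship is not a straight line.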
#### Knowledge Distillation
* Evaluated with [MSEEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.MSEEvaluator)
| Metric | Value |
|:-----------------|:------------|
| **negative_mse** | **-3.3795** |
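To help read this value: to my understanding, `MSEEvaluator` averages the squared differences between teacher and student embeddings over all dimensions, multiplies by 100, and negates the result so that higher is better. The sketch below reproduces that calculation in plain Python under this assumption; the factor of 100 and the helper name `negative_mse` should be checked against the installed library version.

```python
def negative_mse(teacher_embeddings, student_embeddings, scale=100.0):
    """Negated mean squared error over all embedding dimensions.

    `scale=100.0` is an assumption about how the evaluator reports the value.
    """
    squared_errors = [
        (t - s) ** 2
        for teacher_vec, student_vec in zip(teacher_embeddings, student_embeddings)
        for t, s in zip(teacher_vec, student_vec)
    ]
    return -scale * sum(squared_errors) / len(squared_errors)


# Toy teacher and student embeddings (illustrative values).
teacher = [[1.0, 0.0], [0.0, 1.0]]
student = [[0.9, 0.1], [0.2, 0.8]]
score = negative_mse(teacher, student)
print(round(score, 4))  # -2.5
```

Under the same assumption, the reported -3.3795 would correspond to a raw mean squared error of roughly 0.0338 per dimension.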
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 1,195,425 training samples
* Columns: `sentence` and `label`
* Approximate statistics based on the first 1000 samples:
| | sentence | label |
|:--------|:----------------------------------------------------------------------------------|:--------------------------------------|
| type | string | list |
* Samples:
| sentence | label |
|:---------|:------|
| A person on a horse jumps over a broken down airplane. | [-0.012967385351657867, 0.3716000020503998, 0.252520889043808, 0.7052643299102783, -0.15118499100208282, ...] |
| Children smiling and waving at camera | [0.15414997935295105, 0.6666896939277649, -0.3150098919868469, 1.0102407932281494, 0.4113735556602478, ...] |
| A boy is jumping on skateboard in the middle of a red bridge. | [-0.2989530563354492, 0.8571284413337708, -0.48532426357269287, 0.8935043215751648, 0.28524795174598694, ...] |
* Loss: [MSELoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)
### Evaluation Dataset
#### Unnamed Dataset
* Size: 10,000 evaluation samples
* Columns: `sentence` and `label`
* Approximate statistics based on the first 1000 samples:
| | sentence | label |
|:--------|:----------------------------------------------------------------------------------|:--------------------------------------|
| type | string | list |
* Samples:
| sentence | label |
|:---------|:------|
| Two women are embracing while holding to go packages. | [-0.35094621777534485, 0.4337681233882904, 0.22905530035495758, 0.9438946843147278, -1.0199058055877686, ...] |
| Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink. | [-0.37593328952789307, 0.6690596342086792, -0.14921458065509796, 0.7559019923210144, -0.4093412756919861, ...] |
| A man selling donuts to a customer during a world exhibition event held in the city of Angeles | [0.21969863772392273, 0.5065202713012695, -0.25664886832237244, 0.2569092810153961, -0.05940837413072586, ...] |
* Loss: [MSELoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `learning_rate`: 0.0001
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `fp16`: True
- `load_best_model_at_end`: True
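To make these settings concrete, the sketch below collects them in a plain dictionary and estimates how many optimizer steps one epoch takes and how many of them are warmup. It assumes a single device, no gradient accumulation, a kept partial last batch, and warmup rounded up; none of these are stated in this card. The dictionary keys mirror the hyperparameter names above and could be passed to `SentenceTransformerTrainingArguments` together with an `output_dir`.

```python
import math

# Non-default hyperparameters as listed in this card.
training_kwargs = {
    "eval_strategy": "steps",
    "per_device_train_batch_size": 64,
    "per_device_eval_batch_size": 64,
    "learning_rate": 1e-4,
    "num_train_epochs": 1,
    "warmup_ratio": 0.1,
    "fp16": True,
    "load_best_model_at_end": True,
}

train_samples = 1_195_425  # training set size from this card

# Assumption: one device and no gradient accumulation, so one batch = one step.
steps_per_epoch = math.ceil(train_samples / training_kwargs["per_device_train_batch_size"])
total_steps = steps_per_epoch * training_kwargs["num_train_epochs"]
# Assumption: the warmup step count is rounded up.
warmup_steps = math.ceil(total_steps * training_kwargs["warmup_ratio"])

print(total_steps, warmup_steps)  # 18679 1868
```

Under these assumptions the learning rate would ramp up to 0.0001 over roughly the first 1,868 of 18,679 steps.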
#### All Hyperparameters