danielheinz/e5-base-sts-en-de


INFO: The model is being continuously updated.

The model is a multilingual-e5-base model fine-tuned for semantic textual similarity (STS).
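A minimal usage sketch for scoring a sentence pair follows. It assumes the checkpoint loads through the sentence-transformers library, which this card does not confirm; the helper names `embed`, `cosine_similarity`, and `sts_score` are hypothetical. E5 base models conventionally use "query: " and "passage: " input prefixes, but the card does not say whether this fine-tuned checkpoint expects them, so the sketch passes raw sentences.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embed(sentences, model_name="danielheinz/e5-base-sts-en-de"):
    """Encode sentences into embedding vectors.

    Assumption: the checkpoint is loadable with sentence-transformers.
    The import is lazy so the rest of the module works without it.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(sentences).tolist()


def sts_score(sentence_a, sentence_b, encode=embed):
    """Similarity score for one sentence pair (higher means more similar)."""
    vec_a, vec_b = encode([sentence_a, sentence_b])
    return cosine_similarity(vec_a, vec_b)
```

Calling `sts_score("Ein Mann spielt Gitarre.", "A man is playing a guitar.")` would download the model on first use. The `encode` parameter exists only so the scoring logic can be exercised with a stand-in encoder.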

Model Training

The model has been fine-tuned on the German subsets of the following datasets:

The training procedure can be divided into two stages:

1. training on paraphrase datasets with the Multiple Negatives Ranking Loss
2. training on semantic textual similarity datasets using the Cosine Similarity Loss
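The two objectives named above can be sketched in plain Python. The card does not state which library or hyperparameters were used; the definitions below follow the losses of the same names in sentence-transformers (in-batch cross-entropy over scaled cosine similarities, and mean squared error against a gold score), and the `scale=20.0` default is taken from that library, not from this card. This is an illustration of what the losses compute, not the author's training code.

```python
import math


def cos(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Stage 1 objective on paraphrase pairs (anchors[i], positives[i]).

    Every other positive in the batch acts as a negative for anchor i.
    The loss is the mean cross-entropy of picking the matching positive.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def cosine_similarity_loss(pairs, gold_scores):
    """Stage 2 objective on STS pairs with gold similarity scores.

    Mean squared error between cos(u, v) and the gold score; gold scores
    are assumed to be rescaled to the range of the cosine similarity.
    """
    errors = [(cos(u, v) - y) ** 2 for (u, v), y in zip(pairs, gold_scores)]
    return sum(errors) / len(errors)
```

The first loss needs only positive pairs, which suits paraphrase data; the second needs graded labels, which STS datasets provide.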

Results

The model achieves the following results:

0.920 on stsb's validation subset
0.904 on stsb's test subset
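The evaluation metadata on this card lists the metric as spearmanr on stsb_multi_mt. Assuming the scores above are Spearman rank correlations between predicted similarities and gold labels, the metric can be sketched as follows; the function names are hypothetical and the sample values in the usage note are made up for illustration.

```python
import math


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(predicted, gold):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(predicted), ranks(gold))
```

For example, `spearman([0.1, 0.4, 0.9], [1.0, 2.5, 5.0])` compares three predicted cosine similarities against three gold scores. Because only the ordering matters, the predicted and gold scores do not need to share a scale.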

Model size: 278M params (Safetensors, tensor type F32)


Evaluation results

Spearman correlation (spearmanr) on stsb_multi_mt, self-reported: 0.904