A manually pruned version of distilbert-portuguese-cased, fine-tuned to produce high-quality embeddings in a lightweight form factor.

Model Trained Using AutoTrain

  • Problem type: Sentence Transformers

Validation Metrics

  • loss: 0.3181
  • cosine_accuracy: 0.8922
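The cosine_accuracy metric in Sentence Transformers evaluations is typically the fraction of (anchor, positive, negative) triplets where the anchor is closer to its positive than to its negative under cosine similarity. A minimal sketch of that computation (assuming a triplet-style evaluation set; function and variable names here are illustrative, not from this repo):

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two 1-D vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def cosine_accuracy(anchors, positives, negatives):
    # Fraction of triplets where the anchor is more similar to its
    # positive example than to its negative example
    correct = sum(
        cosine_sim(a, p) > cosine_sim(a, n)
        for a, p, n in zip(anchors, positives, negatives)
    )
    return correct / len(anchors)

# Toy example: one triplet where the positive is clearly closer
anchors = [np.array([1.0, 0.0])]
positives = [np.array([0.9, 0.1])]
negatives = [np.array([0.0, 1.0])]
print(cosine_accuracy(anchors, positives, negatives))
```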

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load the model and run inference:

from sentence_transformers import SentenceTransformer

# Download from the Hugging Face Hub
model = SentenceTransformer("cnmoro/micro-bertim-embeddings")
# Run inference
sentences = [
    'O pôr do sol pinta o céu com tons de laranja e vermelho',
    'Joana adora estudar matemática nas tardes de sábado',
    'Os pássaros voam em formação, criando um espetáculo no horizonte',
]
embeddings = model.encode(sentences)
print(embeddings.shape)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
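The similarity matrix can be used, for example, to find the most similar pair among the input sentences. A minimal sketch using mock, L2-normalized embeddings in place of the real `model.encode` output (the indexing logic is the same for actual embeddings):

```python
import numpy as np

# Mock embeddings standing in for model.encode(sentences);
# rows are unit-length, so dot products equal cosine similarities
embeddings = np.array([
    [1.0, 0.0, 0.0],   # sentence 0
    [0.0, 1.0, 0.0],   # sentence 1
    [0.8, 0.0, 0.6],   # sentence 2 (closest to sentence 0)
])

similarities = embeddings @ embeddings.T  # (3, 3) similarity matrix

# Mask the diagonal (each sentence vs. itself), then pick the best pair
np.fill_diagonal(similarities, -np.inf)
i, j = np.unravel_index(np.argmax(similarities), similarities.shape)
print(i, j)  # indices of the most similar pair of distinct sentences
```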
Model size: 4.43M parameters (tensor type F32, Safetensors)