
EuroBERT-210m


Table of Contents

  1. Overview
  2. Usage
  3. Evaluation
  4. License
  5. Citation

Overview

EuroBERT is a family of multilingual encoder models designed for a variety of tasks such as retrieval, classification, and regression. It covers 15 languages as well as mathematics and code, and supports sequences of up to 8,192 tokens. Compared to similarly sized systems, EuroBERT models exhibit the strongest multilingual performance across domains and tasks.

It is available in 3 sizes:

  • EuroBERT-210m (this model)
  • EuroBERT-610m
  • EuroBERT-2.1B

For more information about EuroBERT, please check our blog post and the arXiv preprint.

Usage

from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "EuroBERT/EuroBERT-210m"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)

text = "The capital of France is <|mask|>."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# To get predictions for the mask:
masked_index = inputs["input_ids"][0].tolist().index(tokenizer.mask_token_id)
predicted_token_id = outputs.logits[0, masked_index].argmax(axis=-1)
predicted_token = tokenizer.decode(predicted_token_id)
print("Predicted token:", predicted_token)
# Predicted token:  Paris
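The same checkpoint can also serve as a plain encoder for retrieval-style use cases. The snippet below is a minimal sketch rather than an official example: it loads the base model with AutoModel and mean-pools the last hidden states into sentence embeddings; the pooling strategy is an illustrative assumption, not a documented recommendation.

import torch
from transformers import AutoTokenizer, AutoModel

model_id = "EuroBERT/EuroBERT-210m"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

sentences = ["The capital of France is Paris.", "Paris ist die Hauptstadt Frankreichs."]
inputs = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the token embeddings, ignoring padding positions.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)

# Cosine similarity between the two sentences.
print(torch.nn.functional.cosine_similarity(embeddings[0], embeddings[1], dim=0).item())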

💻 You can use these models directly with the transformers library starting from v4.48.0:

pip install -U "transformers>=4.48.0"

🏎️ If your GPU supports it, we recommend using EuroBERT with Flash Attention 2 to achieve the highest efficiency. To do so, install Flash Attention 2 as follows, then use the model as normal:

pip install flash-attn
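Once flash-attn is installed, loading the model with Flash Attention 2 follows the standard transformers pattern. The snippet below is a sketch that assumes the generic attn_implementation argument; Flash Attention 2 requires a CUDA GPU and a half-precision dtype.

import torch
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained(
    "EuroBERT/EuroBERT-210m",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,               # fp16/bf16 is required for Flash Attention 2
    attn_implementation="flash_attention_2",  # enable the Flash Attention 2 kernels
).to("cuda")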

Evaluation

We evaluate EuroBERT on a suite of tasks to cover various real-world use cases for multilingual encoders, including retrieval performance, classification, sequence regression, quality estimation, summary evaluation, code-related tasks, and mathematical tasks.

Key highlights: The EuroBERT family exhibits strong multilingual performance across domains and tasks.

  • EuroBERT-2.1B, our largest model, achieves the highest performance among all evaluated systems, outperforming the largest alternative, XLM-RoBERTa-XL.

  • EuroBERT-610m is competitive with XLM-RoBERTa-XL, a model 5 times its size, on most multilingual tasks and surpasses it in code and mathematics tasks.

  • The smaller EuroBERT-210m generally outperforms all similarly sized systems.


License

We release the EuroBERT model architectures, model weights, and training codebase under the Apache 2.0 license.

Citation

If you use EuroBERT in your work, please cite:

@misc{boizard2025eurobertscalingmultilingualencoders,
      title={EuroBERT: Scaling Multilingual Encoders for European Languages}, 
      author={Nicolas Boizard and Hippolyte Gisserot-Boukhlef and Duarte M. Alves and André Martins and Ayoub Hammal and Caio Corro and Céline Hudelot and Emmanuel Malherbe and Etienne Malaboeuf and Fanny Jourdan and Gabriel Hautreux and João Alves and Kevin El-Haddad and Manuel Faysse and Maxime Peyrard and Nuno M. Guerreiro and Patrick Fernandes and Ricardo Rei and Pierre Colombo},
      year={2025},
      eprint={2503.05500},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.05500}, 
}