---
license: apache-2.0
language:
- fr
library_name: transformers
tags:
- t5
- orfeo
- pytorch
- pictograms
- translation
metrics:
- bleu
widget:
- text: "je mange une pomme"
  example_title: "A simple sentence"
- text: "je ne pense pas à toi"
  example_title: "Sentence with a negation"
- text: "il y a 2 jours, les gendarmes ont vérifié ma licence"
  example_title: "Sentence with a polylexical term"
---

# t2p-t5-large-orféo

*t2p-t5-large-orféo* is a text-to-pictograms translation model built by fine-tuning the [t5-large](https://huggingface.co/google-t5/t5-large) model on a dataset of pairs of transcriptions and pictogram token sequences (each token is linked to a pictogram image from [ARASAAC](https://arasaac.org/)).

## Training details

### Datasets

### Parameters

### Evaluation

### Results

### Environmental Impact

## Using t2p-t5-large-orféo model with HuggingFace transformers

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "Propicto/t2p-t5-large-orfeo"  # model id on the Hugging Face Hub (or a local path)

source_lang = "fr"
target_lang = "frp"
max_input_length = 128
max_target_length = 128

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint).to("cuda:0")

inputs = tokenizer("Je mange une pomme", return_tensors="pt").input_ids
outputs = model.generate(inputs.to("cuda:0"), max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
pred = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(pred)
```

## Information

- **Language(s):** French
- **License:** Apache-2.0
- **Developed by:** Cécile Macaire
- **Funded by**
  - GENCI-IDRIS (Grant 2023-AD011013625R1)
  - PROPICTO ANR-20-CE93-0005
- **Authors**
  - Cécile Macaire
  - Chloé Dion
  - Emmanuelle Esperança-Rodier
  - Benjamin Lecouteux
  - Didier Schwab

## Citation

If you use this model for your own research work, please cite as follows:

```bibtex
```