---
license: apache-2.0
language:
  - fr
library_name: transformers
tags:
  - t5
  - orfeo
  - pytorch
  - pictograms
  - translation
metrics:
  - bleu
widget:
  - text: je mange une pomme
    example_title: A simple sentence
  - text: je ne pense pas à toi
    example_title: Sentence with a negation
  - text: il y a 2 jours, les gendarmes ont vérifié ma licence
    example_title: Sentence with a polylexical term
---

# t2p-t5-large-orféo

*t2p-t5-large-orféo* is a text-to-pictograms translation model built by fine-tuning the t5-large model on a dataset of pairs of transcriptions and pictogram token sequences, where each token is linked to a pictogram image from ARASAAC.

## Training details

### Datasets

### Parameters

## Evaluation

### Results

## Environmental Impact

## Using the t2p-t5-large-orféo model with HuggingFace transformers

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "Propicto/t2p-t5-large-orfeo"  # Hub repository id of this model
device = "cuda:0" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer("Je mange une pomme", return_tensors="pt").input_ids
outputs = model.generate(
    inputs.to(device), max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95
)
pred = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(pred)
```
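The decoded prediction is a sequence of pictogram tokens, each linked to an ARASAAC pictogram. As an illustration of how the output might be turned into images, the sketch below maps tokens to ARASAAC pictogram ids and builds image URLs. Both the `token_to_id` table (the ids are made up) and the static URL pattern are assumptions for illustration, not part of this model card:

```python
def arasaac_image_url(picto_id: int, size: int = 300) -> str:
    # Assumed ARASAAC static URL pattern for pictogram PNGs by numeric id;
    # verify against the ARASAAC API before relying on it.
    return f"https://static.arasaac.org/pictograms/{picto_id}/{picto_id}_{size}.png"

# Hypothetical prediction and token-to-id lookup, for illustration only.
pred = "je manger pomme"
token_to_id = {"je": 6632, "manger": 6456, "pomme": 2462}  # hypothetical ids
urls = [arasaac_image_url(token_to_id[tok]) for tok in pred.split()]
```

Each URL could then be fetched to display the pictogram sequence alongside the text.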
- **Language(s):** French
- **License:** Apache-2.0
- **Developed by:** Cécile Macaire
- **Funded by:**
  - GENCI-IDRIS (Grant 2023-AD011013625R1)
  - PROPICTO ANR-20-CE93-0005
- **Authors:**
  - Cécile Macaire
  - Chloé Dion
  - Emmanuelle Esperança-Rodier
  - Benjamin Lecouteux
  - Didier Schwab

## Citation

If you use this model in your research, please cite as follows: