Propicto/t2p-t5-large-orfeo

cecilemacaire committed on Jul 4, 2024

Commit a63edff · verified · 1 parent: 0e1dea6

Update README.md

Files changed (1)

README.md +1 -1

@@ -23,6 +23,7 @@ widget:
 # t2p-t5-large-orféo
 *t2p-t5-large-orféo* is a text-to-pictograms translation model built by fine-tuning the [t5-large](https://huggingface.co/google-t5/t5-large) model on a dataset of pairs of transcriptions / pictogram token sequence (each token is linked to a pictogram image from [ARASAAC](https://arasaac.org/)).
+The model is used only for **inference**.
 ## Training details
@@ -30,7 +31,6 @@ widget:
 The [Propicto-orféo dataset](https://www.ortolang.fr/market/corpora/propicto) is used, which was created from the CEFC-Orféo corpus.
 This dataset was presented in the research paper titled ["A Multimodal French Corpus of Aligned Speech, Text, and Pictogram Sequences for Speech-to-Pictogram Machine Translation](https://aclanthology.org/2024.lrec-main.76/)" at LREC-Coling 2024. The dataset was split into training, validation, and test sets.
 | **Split** | **Number of utterances** |
 |:-----------:|:-----------------------:|
 | train | 231,374 |
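
The line added by this commit says the model is used only for inference. A minimal sketch of what that could look like with the `transformers` library follows. It is not taken from the README: the use of `AutoModelForSeq2SeqLM` (chosen because the base model is t5-large), the helper names `load_model` and `translate`, the `max_new_tokens` value, and the whitespace split of the decoded output are all assumptions.

```python
from typing import List

# Repository name as shown at the top of this page.
MODEL_ID = "Propicto/t2p-t5-large-orfeo"


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and seq2seq model (requires `transformers` and `torch`)."""
    # Imported lazily so that `translate` can be used without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    model.eval()  # inference only, no fine-tuning
    return tokenizer, model


def translate(sentence: str, tokenizer, model, max_new_tokens: int = 64) -> List[str]:
    """Translate a French sentence into a list of pictogram tokens."""
    inputs = tokenizer(sentence, return_tensors="pt")
    output_ids = model.generate(inputs.input_ids, max_new_tokens=max_new_tokens)
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    # Assumption: the decoded string is a whitespace-separated token sequence.
    return decoded.split()
```

Calling `load_model()` once and then `translate("je mange une pomme", tokenizer, model)` would return the predicted token sequence. Mapping each token to its ARASAAC pictogram image is a separate step that this sketch does not cover.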