IteraTeR PEGASUS model
This model was obtained by fine-tuning google/pegasus-large on the IteraTeR-full-sent dataset.
Paper: Understanding Iterative Revision from Human-Written Text
Authors: Wanyu Du, Vipul Raheja, Dhruv Kumar, Zae Myung Kim, Melissa Lopez, Dongyeop Kang
Text Revision Task
Given an edit intention and an original sentence, the model generates a revised sentence.
The edit intentions come from the IteraTeR-full-sent dataset and are categorized as follows (a small input-building sketch follows the table):
| Edit Intention | Definition | Example |
| --- | --- | --- |
| clarity | Make the text more formal, concise, readable and understandable. | Original: It's like a house which anyone can enter in it. <br> Revised: It's like a house which anyone can enter. |
| fluency | Fix grammatical errors in the text. | Original: In the same year he became the Fellow of the Royal Society. <br> Revised: In the same year, he became the Fellow of the Royal Society. |
| coherence | Make the text more cohesive, logically linked and consistent as a whole. | Original: Achievements and awards Among his other activities, he founded the Karachi Film Guild and Pakistan Film and TV Academy. <br> Revised: Among his other activities, he founded the Karachi Film Guild and Pakistan Film and TV Academy. |
| style | Convey the writer's writing preferences, including emotions, tone, voice, etc. | Original: She was last seen on 2005-10-22. <br> Revised: She was last seen on October 22, 2005. |
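
The model reads the edit intention as a tag prepended to the original sentence, as in the Usage section below. As a rough, hedged illustration (not part of the original card), a hypothetical helper like `build_input` could assemble such inputs:

```python
# Minimal sketch (assumption, not from the original card): build a model input
# by prepending the edit-intention tag, matching the format used in Usage below.
INTENT_TAGS = {"clarity", "fluency", "coherence", "style"}

def build_input(intent: str, sentence: str) -> str:
    """Return an intent-tagged input such as '<fluency> I likes coffee.'"""
    if intent not in INTENT_TAGS:
        raise ValueError(f"unknown edit intention: {intent}")
    return f"<{intent}> {sentence}"
```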
Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("wanyu/IteraTeR-PEGASUS-Revision-Generator")
model = AutoModelForSeq2SeqLM.from_pretrained("wanyu/IteraTeR-PEGASUS-Revision-Generator")

# Prepend the edit intention as a tag to the original sentence.
before_input = '<fluency> I likes coffee.'
model_input = tokenizer(before_input, return_tensors='pt')
# Beam-search decoding; the first decoded string is the revised sentence.
model_outputs = model.generate(**model_input, num_beams=8, max_length=1024)
after_text = tokenizer.batch_decode(model_outputs, skip_special_tokens=True)[0]
```
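
The decoded `after_text` holds the revised sentence. As a hedged extension of the snippet above (an illustrative sketch, not from the original card), the same pipeline can also be run over several intent-tagged inputs in one batch:

```python
# Hypothetical batched usage: revise several sentences, each tagged with its own
# edit intention (reusing the tokenizer and model loaded above).
examples = [
    "<clarity> It's like a house which anyone can enter in it.",
    "<coherence> Achievements and awards Among his other activities, he founded the Karachi Film Guild and Pakistan Film and TV Academy.",
]
batch = tokenizer(examples, return_tensors="pt", padding=True)
outputs = model.generate(**batch, num_beams=8, max_length=1024)
for src, revised in zip(examples, tokenizer.batch_decode(outputs, skip_special_tokens=True)):
    print(src, "->", revised)
```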