File size: 426 Bytes
08ef2bd
 
 
 
 
 
 
6293cb8
08ef2bd
 
1
2
3
4
5
6
7
8
9
10
11
---
library_name: transformers
tags: []
---

# Model Card for Model ID

We fine-tune OPT-125 to generate positive movie reviews based on the IMDB dataset. The model gets the start of a real review and is tasked to produce positive continuations. To reward positive continuations we use a BERT classifier to analyse the sentiment of the produced sentences and use the classifier's outputs as rewards signals for PPO training