amirabdullah19852020 committed on
Commit
460c39e
1 Parent(s): 5ab78a7

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # gpt-neo-125m_hh_reward
 
- This model is a fine-tuned version of [EleutherAI/gpt-neo-125m](https://huggingface.co/EleutherAI/gpt-neo-125m) on an unknown dataset.
+ This model is a DPO fine-tuned version of [EleutherAI/gpt-neo-125m](https://huggingface.co/EleutherAI/gpt-neo-125m) on Anthropics HH dataset.
  It achieves the following results on the evaluation set:
  - Loss: 0.7764
  - Rewards/chosen: -1.0726