MahmoudMohamed committed on
Commit 9576c36 · verified · 1 Parent(s): d91e7be

Update README.md

Files changed (1)
  1. README.md +1 -15
README.md CHANGED
@@ -17,25 +17,11 @@ should probably proofread and complete it, then remove this comment. -->
 
  # Reward_model
 
- This model is a fine-tuned version of [OpenAssistant/reward-model-deberta-v3-base](https://huggingface.co/OpenAssistant/reward-model-deberta-v3-base) on an unknown dataset.
+ This model is a fine-tuned version of [OpenAssistant/reward-model-deberta-v3-base](https://huggingface.co/OpenAssistant/reward-model-deberta-v3-base) on [Anthropic/hh-rlhf](https://huggingface.co/datasets/Anthropic/hh-rlhf) dataset.
  It achieves the following results on the evaluation set:
  - Loss: 0.6931
  - Accuracy: 1.0
 
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
  ### Training hyperparameters
 
  The following hyperparameters were used during training: