weqweasdas
committed on
Update README.md
README.md
CHANGED
@@ -10,6 +10,8 @@
 
 The reward model is trained from the base model [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it). See the 7B version [RM-Gemma-7B](https://huggingface.co/weqweasdas/RM-Gemma-7B).
 
+The training script is available at https://github.com/WeiXiongUST/RLHF-Reward-Modeling .
+
 ## Model Details
 
 If you have any question with this reward model and also any question about reward modeling, feel free to drop me an email with [email protected]. I would be happy to chat!