---
license: other
base_model: facebook/opt-1.3b
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: reward_modeling_anthropic_hh
  results: []
---

# reward_modeling_anthropic_hh

This model is a fine-tuned version of [facebook/opt-1.3b](https://huggingface.co/facebook/opt-1.3b) on an unspecified dataset (the model name suggests Anthropic's HH-RLHF preference data).
It achieves the following results on the evaluation set:
- Loss: 0.6907
- Accuracy: 0.6825
- Train Rewards/chosen: -1.8222
- Train Rewards/rejected: -3.6005
- Train Rewards/accuracies: 0.8138
- Train Rewards/margins: 1.7783
- Train Nll Loss: 2.4635
- Train Logit Total Loss: 0.4241
- Train Logit Loss: 0.4035
- Rewards/chosen: -2.0106
- Rewards/rejected: -3.0639
- Rewards/accuracies: 0.6657
- Rewards/margins: 1.0533
- Nll Loss: 2.4906
- Logit Total Loss: 0.6892
- Logit Loss: 0.6710
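
The metrics above are consistent with the standard pairwise (Bradley-Terry) reward-modeling objective blended with a small auxiliary language-modeling term: numerically, `Logit Total Loss ≈ 0.99 × Logit Loss + 0.01 × Nll Loss` holds for both the train and eval rows. The sketch below shows how such metrics could be computed; the function and the 0.99/0.01 weighting are inferences from the reported numbers, not the author's code.

```python
import torch
import torch.nn.functional as F

def reward_metrics(r_chosen, r_rejected, nll_loss, mix=0.99):
    """Pairwise reward-model metrics; `mix` is inferred from the card's numbers."""
    # Bradley-Terry preference loss: -log sigmoid(r_chosen - r_rejected)
    logit_loss = -F.logsigmoid(r_chosen - r_rejected).mean()
    # Blend with an auxiliary LM negative log-likelihood (assumed weighting)
    logit_total_loss = mix * logit_loss + (1.0 - mix) * nll_loss
    return {
        "rewards/accuracies": (r_chosen > r_rejected).float().mean(),
        "rewards/margins": (r_chosen - r_rejected).mean(),
        "logit_loss": logit_loss,
        "logit_total_loss": logit_total_loss,
    }
```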

## Model description

This checkpoint is a reward model: a fine-tune of [facebook/opt-1.3b](https://huggingface.co/facebook/opt-1.3b) trained to assign higher scores to preferred (chosen) responses than to dispreferred (rejected) ones. The tracked metrics (rewards/chosen, rewards/rejected, rewards/margins, plus an auxiliary NLL term) point to a pairwise preference objective; beyond that, the author has not provided details.

## Intended uses & limitations

Not documented by the author. Reward models of this kind are typically used to score candidate completions, e.g. as the reward signal in RLHF training or for best-of-n reranking; a hedged usage sketch follows.
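
As an illustration only (the card does not document the head architecture or the checkpoint's Hub path): if the model exposes a standard single-logit sequence-classification head, it can score responses as below. `model_id` is a placeholder.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo id; substitute the actual Hub path of this checkpoint.
model_id = "your-username/reward_modeling_anthropic_hh"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=1)
model.eval()

def score(prompt: str, response: str) -> float:
    """Return the scalar reward for a prompt/response pair (higher = preferred)."""
    inputs = tokenizer(prompt + response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).logits[0, 0].item()

chosen = score("Human: How do I bake bread?\n\nAssistant:", " Mix flour, water, salt, and yeast...")
rejected = score("Human: How do I bake bread?\n\nAssistant:", " I have no idea.")
print(chosen > rejected)  # the reward model should prefer the helpful reply
```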

## Training and evaluation data

Not documented by the author. The model name points to Anthropic's HH-RLHF preference dataset; if that assumption holds, each example pairs a `chosen` and a `rejected` dialogue, as sketched below.
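
```python
from datasets import load_dataset

# Assumption: Anthropic/hh-rlhf, suggested by the model name only.
dataset = load_dataset("Anthropic/hh-rlhf")
example = dataset["train"][0]
print(example["chosen"][:200])    # dialogue ending in the preferred reply
print(example["rejected"][:200])  # same dialogue ending in the rejected reply
```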

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1.41e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
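
For reference, a sketch of `transformers.TrainingArguments` mirroring the list above; the output directory is a placeholder, and the Adam betas/epsilon match the library defaults, so they need no explicit arguments.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="reward_modeling_anthropic_hh",  # placeholder
    learning_rate=1.41e-5,
    per_device_train_batch_size=4,   # train_batch_size: 4
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    gradient_accumulation_steps=4,   # total train batch size: 4 * 4 = 16
    num_train_epochs=2,
    lr_scheduler_type="linear",
    seed=42,
)
```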

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Nll Loss | Logit Total Loss | Logit Loss |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------:|:-----------------:|:-----------:|
| 0.7169        | 0.11  | 100  | 0.6921          | 0.5959   | -1.7367        | -1.8694          | 0.5855             | 0.1326          | 3.0057   | 0.6899            | 0.6665      |
| 0.7082        | 0.23  | 200  | 0.6978          | 0.5938   | -3.3995        | -3.5818          | 0.5802             | 0.1823          | 3.2073   | 0.6959            | 0.6706      |
| 0.6744        | 0.34  | 300  | 0.6681          | 0.6062   | -2.3751        | -2.7036          | 0.5956             | 0.3285          | 2.7061   | 0.6656            | 0.6450      |
| 0.6154        | 0.46  | 400  | 0.6490          | 0.6433   | -1.5136        | -1.9306          | 0.6310             | 0.4171          | 2.8065   | 0.6474            | 0.6256      |
| 0.6405        | 0.57  | 500  | 0.6573          | 0.6351   | -1.4041        | -1.8257          | 0.6226             | 0.4216          | 2.6995   | 0.6577            | 0.6371      |
| 0.6284        | 0.69  | 600  | 0.6448          | 0.6557   | -2.3215        | -2.7092          | 0.6440             | 0.3877          | 2.6968   | 0.6433            | 0.6225      |
| 0.6399        | 0.8   | 700  | 0.6454          | 0.6227   | -2.0755        | -2.4642          | 0.6125             | 0.3887          | 2.8089   | 0.6435            | 0.6217      |
| 0.669         | 0.91  | 800  | 0.6385          | 0.6474   | -1.7053        | -2.1240          | 0.6379             | 0.4187          | 2.6687   | 0.6350            | 0.6145      |
| 0.4788        | 1.03  | 900  | 0.6636          | 0.6577   | -2.1522        | -2.8529          | 0.6435             | 0.7007          | 2.5723   | 0.6620            | 0.6427      |
| 0.4529        | 1.14  | 1000 | 0.6938          | 0.6577   | -1.1456        | -2.0167          | 0.6488             | 0.8712          | 2.5628   | 0.6897            | 0.6708      |
| 0.4378        | 1.26  | 1100 | 0.7319          | 0.6536   | -1.4771        | -2.4829          | 0.6427             | 1.0058          | 2.5495   | 0.7282            | 0.7098      |
| 0.4496        | 1.37  | 1200 | 0.7034          | 0.6660   | -2.6046        | -3.5817          | 0.6524             | 0.9771          | 2.5483   | 0.7006            | 0.6819      |
| 0.3539        | 1.49  | 1300 | 0.7023          | 0.6598   | -2.2279        | -3.2122          | 0.6516             | 0.9842          | 2.5144   | 0.6963            | 0.6780      |
| 0.5494        | 1.6   | 1400 | 0.6784          | 0.6536   | -2.3300        | -3.3018          | 0.6435             | 0.9718          | 2.4946   | 0.6749            | 0.6565      |
| 0.4075        | 1.71  | 1500 | 0.6935          | 0.6948   | -0.9575        | -2.0411          | 0.6843             | 1.0836          | 2.4900   | 0.6884            | 0.6702      |
| 0.4789        | 1.83  | 1600 | 0.6941          | 0.6598   | -2.1270        | -3.1756          | 0.6496             | 1.0487          | 2.5026   | 0.6924            | 0.6741      |
| 0.4093        | 1.94  | 1700 | 0.6907          | 0.6825   | -2.0106        | -3.0639          | 0.6657             | 1.0533          | 2.4906   | 0.6892            | 0.6710      |
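
The second epoch mainly sharpens reward margins: eval margins grow from 0.1326 to 1.0533 while validation loss stops improving after step 800, and the final train rewards/accuracies (0.8138) sits well above the eval value (0.6657), which may indicate mild overfitting to the preference pairs.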


### Framework versions

- Transformers 4.37.2
- Pytorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.15.2