19Leo97 committed
Commit 8d8942e
1 parent: a55c758

19Leo97/openhermes-mistral-dpo-gptq
README.md CHANGED
@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.6864
- - Rewards/chosen: -0.0176
- - Rewards/rejected: -0.0077
- - Rewards/accuracies: 0.3125
- - Rewards/margins: -0.0098
- - Logps/rejected: -168.4619
- - Logps/chosen: -127.7056
- - Logits/rejected: -2.4072
- - Logits/chosen: -2.3077
+ - Loss: 0.7067
+ - Rewards/chosen: 0.0088
+ - Rewards/rejected: -0.0947
+ - Rewards/accuracies: 0.625
+ - Rewards/margins: 0.1035
+ - Logps/rejected: -172.7847
+ - Logps/chosen: -98.3108
+ - Logits/rejected: -2.0623
+ - Logits/chosen: -1.9279
 
  ## Model description
 
@@ -59,11 +59,11 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
  |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 0.6768 | 0.005 | 10 | 0.6965 | 0.0206 | -0.0070 | 0.5625 | 0.0276 | -168.4548 | -127.3242 | -2.4001 | -2.3165 |
- | 0.7062 | 0.01 | 20 | 0.7131 | 0.0043 | 0.0421 | 0.25 | -0.0378 | -167.9636 | -127.4874 | -2.4078 | -2.3156 |
- | 0.7526 | 0.015 | 30 | 0.7101 | -0.0048 | 0.0478 | 0.125 | -0.0526 | -167.9063 | -127.5779 | -2.4088 | -2.3121 |
- | 0.6946 | 0.02 | 40 | 0.7003 | -0.0144 | 0.0180 | 0.1875 | -0.0324 | -168.2044 | -127.6736 | -2.4087 | -2.3099 |
- | 0.6728 | 0.025 | 50 | 0.6864 | -0.0176 | -0.0077 | 0.3125 | -0.0098 | -168.4619 | -127.7056 | -2.4072 | -2.3077 |
+ | 0.6744 | 0.005 | 10 | 0.6970 | -0.0186 | -0.0266 | 0.6875 | 0.0080 | -172.1035 | -98.5849 | -2.0765 | -1.9425 |
+ | 0.7073 | 0.01 | 20 | 0.7152 | -0.0388 | -0.0448 | 0.4375 | 0.0060 | -172.2850 | -98.7869 | -2.0706 | -1.9392 |
+ | 0.7287 | 0.015 | 30 | 0.7197 | 0.0026 | -0.0203 | 0.625 | 0.0230 | -172.0406 | -98.3726 | -2.0688 | -1.9317 |
+ | 0.701 | 0.02 | 40 | 0.7120 | 0.0131 | -0.0600 | 0.625 | 0.0731 | -172.4374 | -98.2679 | -2.0641 | -1.9302 |
+ | 0.6726 | 0.025 | 50 | 0.7067 | 0.0088 | -0.0947 | 0.625 | 0.1035 | -172.7847 | -98.3108 | -2.0623 | -1.9279 |
 
 
  ### Framework versions
@@ -71,5 +71,5 @@ The following hyperparameters were used during training:
  - PEFT 0.11.1
  - Transformers 4.41.1
  - Pytorch 2.0.1+cu117
- - Datasets 2.19.1
+ - Datasets 2.19.2
  - Tokenizers 0.19.1
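
The updated card's `Loss` and `Rewards/margins` numbers are linked by the DPO objective: the per-example loss is `-log(sigmoid(margin))`, where the margin is the chosen-minus-rejected reward, and `Rewards/accuracies` is the fraction of pairs with a positive margin. A minimal sketch of that relationship in plain Python (illustrative only, not the TRL trainer's actual code):

```python
import math

def dpo_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Per-example DPO loss: -log(sigmoid(chosen_reward - rejected_reward))."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A zero margin gives the chance-level loss ln(2) ~= 0.6931; the eval
# losses in the table hover near that value, consistent with the small
# reward margins reported after only 50 steps.
print(round(dpo_loss(0.0, 0.0), 4))  # 0.6931
```

A larger positive margin (e.g. the final 0.1035) pushes the loss below ln(2), which is why the run's best checkpoints pair higher margins with lower validation loss.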
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8c24e16461a1fb6569c731d9d60bbd17dc63e98b71bb9cdf86f6b23b09861e1f
+ oid sha256:8d24fbf15b6d1c08754caa23d6cff841eb2ec00b8804c645102f1a961a153379
  size 13648432
runs/Jun05_00-06-45_81d7482dc3a4/events.out.tfevents.1717546091.81d7482dc3a4.150.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:38005ff57a0a7f0aff80f77672a79ff1dd52cd87a9e71ddf8513220837b91783
+ size 7620
runs/Jun05_00-11-51_81d7482dc3a4/events.out.tfevents.1717546405.81d7482dc3a4.2154.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cf86c27a2fd23a7881d7d5f794310fdea167e2c4915c971305fc1807c78e13ec
+ size 13580
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:61e5075654834112cd13ca383fda6d7ab683d28680cdb6816851a7098415930d
+ oid sha256:17b3b89cf7949bcde9f449c35cc3b2ceeffdfd8ba0c756ce25befe381610c65b
  size 4667