---
license: mit
base_model: TheBloke/zephyr-7B-beta-GPTQ
tags:
- trl
- sft
- generated_from_trainer
metrics:
- rouge
model-index:
- name: zephyr-support-chatbot
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# zephyr-support-chatbot

This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ). The training dataset was not recorded by the Trainer.
After the final training step (epoch 100, step 900), it achieves the following results on the evaluation set:
- Loss: 1.9415
- Rouge1: 0.7212
- Rouge2: 0.5438
- Rougel: 0.6914
- Rougelsum: 0.7070

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 100
- mixed_precision_training: Native AMP
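
To make the `cosine` scheduler setting concrete, here is a minimal sketch of how the learning rate would decay from 0.0002 over the run. It assumes 900 total steps (the last step reported under Training results), a single half-cosine cycle, and no warmup; the card does not state a warmup value, so `WARMUP_STEPS` is a hypothetical parameter, as are the other names in the snippet.

```python
import math

BASE_LR = 2e-4       # learning_rate from the hyperparameter list
TOTAL_STEPS = 900    # final step reported in the training results table
WARMUP_STEPS = 0     # assumption: the card does not report any warmup


def cosine_lr(step: int) -> float:
    """Return the learning rate at `step` under half-cycle cosine decay."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to BASE_LR (unused when WARMUP_STEPS == 0).
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    progress = min(1.0, progress)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 225, 450, 675, 900):
        print(f"step {step:>3}: lr = {cosine_lr(step):.2e}")
```

Under these assumptions the rate starts at the full 0.0002, passes through half that value at the midpoint, and reaches zero at the last step.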

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 2.025         | 1.11  | 10   | 1.7161          | 0.5886 | 0.2570 | 0.5121 | 0.5475    |
| 1.1915        | 2.22  | 20   | 1.2651          | 0.6729 | 0.4745 | 0.6406 | 0.6543    |
| 0.9665        | 3.33  | 30   | 1.0929          | 0.6952 | 0.4754 | 0.6673 | 0.6810    |
| 0.7734        | 4.44  | 40   | 0.9885          | 0.7230 | 0.5207 | 0.6976 | 0.7041    |
| 0.6012        | 5.56  | 50   | 0.9900          | 0.7342 | 0.5549 | 0.7101 | 0.7208    |
| 0.5071        | 6.67  | 60   | 0.9744          | 0.7380 | 0.5513 | 0.7143 | 0.7256    |
| 0.3746        | 7.78  | 70   | 1.0405          | 0.7381 | 0.5550 | 0.7137 | 0.7290    |
| 0.275         | 8.89  | 80   | 1.1184          | 0.7430 | 0.5602 | 0.7193 | 0.7289    |
| 0.2043        | 10.0  | 90   | 1.1665          | 0.7352 | 0.5511 | 0.7062 | 0.7193    |
| 0.1503        | 11.11 | 100  | 1.2355          | 0.7368 | 0.5586 | 0.7077 | 0.7234    |
| 0.1163        | 12.22 | 110  | 1.3027          | 0.7298 | 0.5502 | 0.7023 | 0.7155    |
| 0.099         | 13.33 | 120  | 1.3860          | 0.7274 | 0.5558 | 0.7034 | 0.7144    |
| 0.0824        | 14.44 | 130  | 1.4979          | 0.7298 | 0.5562 | 0.7033 | 0.7172    |
| 0.084         | 15.56 | 140  | 1.4817          | 0.7284 | 0.5452 | 0.6991 | 0.7157    |
| 0.0745        | 16.67 | 150  | 1.4627          | 0.7187 | 0.5376 | 0.6940 | 0.7062    |
| 0.0666        | 17.78 | 160  | 1.4370          | 0.7306 | 0.5498 | 0.7004 | 0.7193    |
| 0.0669        | 18.89 | 170  | 1.5003          | 0.7357 | 0.5602 | 0.7073 | 0.7231    |
| 0.0632        | 20.0  | 180  | 1.4717          | 0.7247 | 0.5493 | 0.7035 | 0.7163    |
| 0.0609        | 21.11 | 190  | 1.4582          | 0.7271 | 0.5469 | 0.7012 | 0.7136    |
| 0.0599        | 22.22 | 200  | 1.5727          | 0.7365 | 0.5576 | 0.7073 | 0.7233    |
| 0.0587        | 23.33 | 210  | 1.5053          | 0.7419 | 0.5605 | 0.7083 | 0.7265    |
| 0.0532        | 24.44 | 220  | 1.5750          | 0.7372 | 0.5631 | 0.7109 | 0.7235    |
| 0.0529        | 25.56 | 230  | 1.5663          | 0.7356 | 0.5515 | 0.7082 | 0.7234    |
| 0.0519        | 26.67 | 240  | 1.5608          | 0.7403 | 0.5601 | 0.7106 | 0.7258    |
| 0.0502        | 27.78 | 250  | 1.5099          | 0.7314 | 0.5467 | 0.6999 | 0.7183    |
| 0.0562        | 28.89 | 260  | 1.5654          | 0.7317 | 0.5592 | 0.7051 | 0.7194    |
| 0.0486        | 30.0  | 270  | 1.5988          | 0.7309 | 0.5556 | 0.7010 | 0.7171    |
| 0.0451        | 31.11 | 280  | 1.5663          | 0.7301 | 0.5577 | 0.7003 | 0.7177    |
| 0.0425        | 32.22 | 290  | 1.6243          | 0.7281 | 0.5563 | 0.7022 | 0.7160    |
| 0.0436        | 33.33 | 300  | 1.6507          | 0.7253 | 0.5509 | 0.6955 | 0.7134    |
| 0.0419        | 34.44 | 310  | 1.5603          | 0.7334 | 0.5520 | 0.7019 | 0.7195    |
| 0.0428        | 35.56 | 320  | 1.6508          | 0.7282 | 0.5469 | 0.6954 | 0.7153    |
| 0.0409        | 36.67 | 330  | 1.7279          | 0.7220 | 0.5396 | 0.6894 | 0.7118    |
| 0.0406        | 37.78 | 340  | 1.6654          | 0.7324 | 0.5540 | 0.7055 | 0.7217    |
| 0.0402        | 38.89 | 350  | 1.7581          | 0.7210 | 0.5397 | 0.6923 | 0.7106    |
| 0.0405        | 40.0  | 360  | 1.6995          | 0.7250 | 0.5472 | 0.6959 | 0.7153    |
| 0.0393        | 41.11 | 370  | 1.7305          | 0.7234 | 0.5399 | 0.6944 | 0.7138    |
| 0.0387        | 42.22 | 380  | 1.7684          | 0.7177 | 0.5363 | 0.6884 | 0.7082    |
| 0.0396        | 43.33 | 390  | 1.7825          | 0.7208 | 0.5390 | 0.6878 | 0.7095    |
| 0.0391        | 44.44 | 400  | 1.7773          | 0.7222 | 0.5392 | 0.6929 | 0.7124    |
| 0.0386        | 45.56 | 410  | 1.8209          | 0.7200 | 0.5415 | 0.6904 | 0.7086    |
| 0.0383        | 46.67 | 420  | 1.7873          | 0.7210 | 0.5403 | 0.6901 | 0.7093    |
| 0.0387        | 47.78 | 430  | 1.7906          | 0.7186 | 0.5396 | 0.6901 | 0.7095    |
| 0.0385        | 48.89 | 440  | 1.8082          | 0.7224 | 0.5448 | 0.6954 | 0.7137    |
| 0.0392        | 50.0  | 450  | 1.7851          | 0.7309 | 0.5472 | 0.6988 | 0.7188    |
| 0.0386        | 51.11 | 460  | 1.8098          | 0.7201 | 0.5414 | 0.6937 | 0.7100    |
| 0.038         | 52.22 | 470  | 1.8145          | 0.7214 | 0.5413 | 0.6931 | 0.7114    |
| 0.0374        | 53.33 | 480  | 1.7956          | 0.7229 | 0.5408 | 0.6919 | 0.7120    |
| 0.038         | 54.44 | 490  | 1.8609          | 0.7231 | 0.5386 | 0.6876 | 0.7093    |
| 0.0375        | 55.56 | 500  | 1.8295          | 0.7253 | 0.5400 | 0.6924 | 0.7127    |
| 0.0384        | 56.67 | 510  | 1.8193          | 0.7238 | 0.5419 | 0.6958 | 0.7138    |
| 0.0374        | 57.78 | 520  | 1.8510          | 0.7202 | 0.5386 | 0.6890 | 0.7083    |
| 0.0382        | 58.89 | 530  | 1.8385          | 0.7227 | 0.5403 | 0.6888 | 0.7098    |
| 0.0374        | 60.0  | 540  | 1.8390          | 0.7203 | 0.5424 | 0.6895 | 0.7089    |
| 0.0374        | 61.11 | 550  | 1.8651          | 0.7202 | 0.5398 | 0.6902 | 0.7084    |
| 0.0378        | 62.22 | 560  | 1.8618          | 0.7236 | 0.5402 | 0.6882 | 0.7097    |
| 0.0374        | 63.33 | 570  | 1.8483          | 0.7203 | 0.5369 | 0.6905 | 0.7097    |
| 0.0363        | 64.44 | 580  | 1.8637          | 0.7190 | 0.5389 | 0.6897 | 0.7089    |
| 0.0378        | 65.56 | 590  | 1.8953          | 0.7236 | 0.5369 | 0.6882 | 0.7099    |
| 0.0377        | 66.67 | 600  | 1.8834          | 0.7210 | 0.5396 | 0.6909 | 0.7104    |
| 0.037         | 67.78 | 610  | 1.8741          | 0.7210 | 0.5436 | 0.6937 | 0.7117    |
| 0.0367        | 68.89 | 620  | 1.8890          | 0.7214 | 0.5419 | 0.6917 | 0.7097    |
| 0.0384        | 70.0  | 630  | 1.8942          | 0.7238 | 0.5432 | 0.6921 | 0.7115    |
| 0.0368        | 71.11 | 640  | 1.8945          | 0.7250 | 0.5414 | 0.6907 | 0.7116    |
| 0.0369        | 72.22 | 650  | 1.9093          | 0.7235 | 0.5402 | 0.6896 | 0.7094    |
| 0.0374        | 73.33 | 660  | 1.9073          | 0.7221 | 0.5432 | 0.6942 | 0.7093    |
| 0.0368        | 74.44 | 670  | 1.8925          | 0.7202 | 0.5434 | 0.6936 | 0.7097    |
| 0.0374        | 75.56 | 680  | 1.8965          | 0.7187 | 0.5434 | 0.6936 | 0.7084    |
| 0.0369        | 76.67 | 690  | 1.9101          | 0.7200 | 0.5422 | 0.6931 | 0.7078    |
| 0.0369        | 77.78 | 700  | 1.9184          | 0.7186 | 0.5407 | 0.6915 | 0.7074    |
| 0.0368        | 78.89 | 710  | 1.9334          | 0.7218 | 0.5411 | 0.6896 | 0.7078    |
| 0.0366        | 80.0  | 720  | 1.9221          | 0.7227 | 0.5411 | 0.6907 | 0.7090    |
| 0.0364        | 81.11 | 730  | 1.9238          | 0.7227 | 0.5427 | 0.6922 | 0.7090    |
| 0.0369        | 82.22 | 740  | 1.9318          | 0.7198 | 0.5432 | 0.6931 | 0.7068    |
| 0.0364        | 83.33 | 750  | 1.9346          | 0.7210 | 0.5432 | 0.6931 | 0.7083    |
| 0.0377        | 84.44 | 760  | 1.9375          | 0.7212 | 0.5438 | 0.6914 | 0.7070    |
| 0.0358        | 85.56 | 770  | 1.9375          | 0.7217 | 0.5427 | 0.6922 | 0.7076    |
| 0.0363        | 86.67 | 780  | 1.9339          | 0.7206 | 0.5427 | 0.6914 | 0.7065    |
| 0.0376        | 87.78 | 790  | 1.9345          | 0.7206 | 0.5427 | 0.6914 | 0.7065    |
| 0.0363        | 88.89 | 800  | 1.9342          | 0.7198 | 0.5432 | 0.6931 | 0.7068    |
| 0.0361        | 90.0  | 810  | 1.9367          | 0.7186 | 0.5422 | 0.6931 | 0.7063    |
| 0.0363        | 91.11 | 820  | 1.9384          | 0.7198 | 0.5432 | 0.6931 | 0.7068    |
| 0.0366        | 92.22 | 830  | 1.9390          | 0.7186 | 0.5422 | 0.6931 | 0.7063    |
| 0.0369        | 93.33 | 840  | 1.9403          | 0.7206 | 0.5438 | 0.6914 | 0.7070    |
| 0.0358        | 94.44 | 850  | 1.9407          | 0.7212 | 0.5438 | 0.6914 | 0.7070    |
| 0.0354        | 95.56 | 860  | 1.9409          | 0.7212 | 0.5438 | 0.6914 | 0.7070    |
| 0.0369        | 96.67 | 870  | 1.9414          | 0.7212 | 0.5438 | 0.6914 | 0.7070    |
| 0.0361        | 97.78 | 880  | 1.9417          | 0.7212 | 0.5438 | 0.6914 | 0.7070    |
| 0.0365        | 98.89 | 890  | 1.9420          | 0.7212 | 0.5438 | 0.6914 | 0.7070    |
| 0.0364        | 100.0 | 900  | 1.9415          | 0.7212 | 0.5438 | 0.6914 | 0.7070    |
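
In the table above, validation loss reaches its minimum of 0.9744 at step 60 and rises to 1.9415 by step 900 while training loss keeps falling, a pattern usually associated with overfitting. The sketch below shows one way to pick a checkpoint from such a log. It uses a handful of rows copied from the table; the function names are illustrative and not part of the training code.

```python
# (epoch, step, validation_loss, rouge1) for selected rows of the results table.
ROWS = [
    (1.11, 10, 1.7161, 0.5886),
    (4.44, 40, 0.9885, 0.7230),
    (5.56, 50, 0.9900, 0.7342),
    (6.67, 60, 0.9744, 0.7380),
    (7.78, 70, 1.0405, 0.7381),
    (8.89, 80, 1.1184, 0.7430),
    (23.33, 210, 1.5053, 0.7419),
    (100.0, 900, 1.9415, 0.7212),
]


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def best_by_rouge1(rows):
    """Return the row with the highest Rouge1 score."""
    return max(rows, key=lambda row: row[3])


if __name__ == "__main__":
    epoch, step, loss, rouge1 = best_by_loss(ROWS)
    print(f"lowest validation loss: step {step} (epoch {epoch}), loss {loss}")
    epoch, step, loss, rouge1 = best_by_rouge1(ROWS)
    print(f"highest Rouge1: step {step} (epoch {epoch}), Rouge1 {rouge1}")
```

The two criteria select different checkpoints here (step 60 by loss, step 80 by Rouge1), and both come far earlier than the final step whose metrics are reported at the top of this card.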


### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0