---
license: mit
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
metrics:
  - rouge
base_model: TheBloke/zephyr-7B-beta-GPTQ
model-index:
  - name: zephyr-support-chatbot
    results: []
---

zephyr-support-chatbot

This model is a PEFT fine-tuned version of TheBloke/zephyr-7B-beta-GPTQ (the fine-tuning dataset is not documented in this card). It achieves the following results on the evaluation set:

  • Loss: 1.9415
  • Rouge1: 0.7212
  • Rouge2: 0.5438
  • Rougel: 0.6914
  • Rougelsum: 0.7070
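
Because this is a PEFT adapter on top of a GPTQ-quantized base model, inference means loading the quantized base model first and then attaching the adapter. The snippet below is a minimal sketch only: it assumes the adapter is published as HimashaJ96/zephyr-support-chatbot (an assumption, not stated in this card), that optimum and auto-gptq are installed so Transformers can load the GPTQ weights, and that the Zephyr chat format is used for prompting.

```python
# Minimal usage sketch. The adapter repo id is an assumption; adjust as needed.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TheBloke/zephyr-7B-beta-GPTQ"
adapter_id = "HimashaJ96/zephyr-support-chatbot"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA adapter

# Zephyr-style chat prompt (assumed format for this support-chatbot use case).
prompt = "<|user|>\nHow do I reset my password?</s>\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```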

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 100
  • mixed_precision_training: Native AMP
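
The trl and sft tags suggest the adapter was trained with TRL's SFTTrainer. The sketch below shows how the hyperparameters above might map onto TrainingArguments and SFTTrainer; the dataset, LoRA settings, sequence length, and column name are assumptions, not details recorded in this card, and the keyword arguments follow the TRL releases contemporary with Transformers 4.35.

```python
# Hedged sketch: only the TrainingArguments values come from the list above.
# The dataset, LoRA configuration, and max_seq_length are assumptions.
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

base_id = "TheBloke/zephyr-7B-beta-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Placeholder dataset with a single Zephyr-formatted example (assumed format).
train_ds = Dataset.from_dict(
    {"text": ["<|user|>\nHow do I reset my password?</s>\n"
              "<|assistant|>\nGo to Settings > Security.</s>"]}
)

training_args = TrainingArguments(
    output_dir="zephyr-support-chatbot",
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="cosine",
    num_train_epochs=100,
    fp16=True,                       # "Native AMP" mixed precision
    evaluation_strategy="steps",
    eval_steps=10,                   # matches the 10-step cadence in the results table
)

peft_config = LoraConfig(            # assumed LoRA settings, not reported in the card
    task_type="CAUSAL_LM", r=16, lora_alpha=32, lora_dropout=0.05
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=train_ds,           # placeholder; a separate eval split was used
    peft_config=peft_config,
    dataset_text_field="text",       # assumed column name
    max_seq_length=512,              # assumed
    tokenizer=tokenizer,
)
trainer.train()
```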

Training results

Training Loss Epoch Step Validation Loss Rouge1 Rouge2 Rougel Rougelsum
2.025 1.11 10 1.7161 0.5886 0.2570 0.5121 0.5475
1.1915 2.22 20 1.2651 0.6729 0.4745 0.6406 0.6543
0.9665 3.33 30 1.0929 0.6952 0.4754 0.6673 0.6810
0.7734 4.44 40 0.9885 0.7230 0.5207 0.6976 0.7041
0.6012 5.56 50 0.9900 0.7342 0.5549 0.7101 0.7208
0.5071 6.67 60 0.9744 0.7380 0.5513 0.7143 0.7256
0.3746 7.78 70 1.0405 0.7381 0.5550 0.7137 0.7290
0.275 8.89 80 1.1184 0.7430 0.5602 0.7193 0.7289
0.2043 10.0 90 1.1665 0.7352 0.5511 0.7062 0.7193
0.1503 11.11 100 1.2355 0.7368 0.5586 0.7077 0.7234
0.1163 12.22 110 1.3027 0.7298 0.5502 0.7023 0.7155
0.099 13.33 120 1.3860 0.7274 0.5558 0.7034 0.7144
0.0824 14.44 130 1.4979 0.7298 0.5562 0.7033 0.7172
0.084 15.56 140 1.4817 0.7284 0.5452 0.6991 0.7157
0.0745 16.67 150 1.4627 0.7187 0.5376 0.6940 0.7062
0.0666 17.78 160 1.4370 0.7306 0.5498 0.7004 0.7193
0.0669 18.89 170 1.5003 0.7357 0.5602 0.7073 0.7231
0.0632 20.0 180 1.4717 0.7247 0.5493 0.7035 0.7163
0.0609 21.11 190 1.4582 0.7271 0.5469 0.7012 0.7136
0.0599 22.22 200 1.5727 0.7365 0.5576 0.7073 0.7233
0.0587 23.33 210 1.5053 0.7419 0.5605 0.7083 0.7265
0.0532 24.44 220 1.5750 0.7372 0.5631 0.7109 0.7235
0.0529 25.56 230 1.5663 0.7356 0.5515 0.7082 0.7234
0.0519 26.67 240 1.5608 0.7403 0.5601 0.7106 0.7258
0.0502 27.78 250 1.5099 0.7314 0.5467 0.6999 0.7183
0.0562 28.89 260 1.5654 0.7317 0.5592 0.7051 0.7194
0.0486 30.0 270 1.5988 0.7309 0.5556 0.7010 0.7171
0.0451 31.11 280 1.5663 0.7301 0.5577 0.7003 0.7177
0.0425 32.22 290 1.6243 0.7281 0.5563 0.7022 0.7160
0.0436 33.33 300 1.6507 0.7253 0.5509 0.6955 0.7134
0.0419 34.44 310 1.5603 0.7334 0.5520 0.7019 0.7195
0.0428 35.56 320 1.6508 0.7282 0.5469 0.6954 0.7153
0.0409 36.67 330 1.7279 0.7220 0.5396 0.6894 0.7118
0.0406 37.78 340 1.6654 0.7324 0.5540 0.7055 0.7217
0.0402 38.89 350 1.7581 0.7210 0.5397 0.6923 0.7106
0.0405 40.0 360 1.6995 0.7250 0.5472 0.6959 0.7153
0.0393 41.11 370 1.7305 0.7234 0.5399 0.6944 0.7138
0.0387 42.22 380 1.7684 0.7177 0.5363 0.6884 0.7082
0.0396 43.33 390 1.7825 0.7208 0.5390 0.6878 0.7095
0.0391 44.44 400 1.7773 0.7222 0.5392 0.6929 0.7124
0.0386 45.56 410 1.8209 0.7200 0.5415 0.6904 0.7086
0.0383 46.67 420 1.7873 0.7210 0.5403 0.6901 0.7093
0.0387 47.78 430 1.7906 0.7186 0.5396 0.6901 0.7095
0.0385 48.89 440 1.8082 0.7224 0.5448 0.6954 0.7137
0.0392 50.0 450 1.7851 0.7309 0.5472 0.6988 0.7188
0.0386 51.11 460 1.8098 0.7201 0.5414 0.6937 0.7100
0.038 52.22 470 1.8145 0.7214 0.5413 0.6931 0.7114
0.0374 53.33 480 1.7956 0.7229 0.5408 0.6919 0.7120
0.038 54.44 490 1.8609 0.7231 0.5386 0.6876 0.7093
0.0375 55.56 500 1.8295 0.7253 0.5400 0.6924 0.7127
0.0384 56.67 510 1.8193 0.7238 0.5419 0.6958 0.7138
0.0374 57.78 520 1.8510 0.7202 0.5386 0.6890 0.7083
0.0382 58.89 530 1.8385 0.7227 0.5403 0.6888 0.7098
0.0374 60.0 540 1.8390 0.7203 0.5424 0.6895 0.7089
0.0374 61.11 550 1.8651 0.7202 0.5398 0.6902 0.7084
0.0378 62.22 560 1.8618 0.7236 0.5402 0.6882 0.7097
0.0374 63.33 570 1.8483 0.7203 0.5369 0.6905 0.7097
0.0363 64.44 580 1.8637 0.7190 0.5389 0.6897 0.7089
0.0378 65.56 590 1.8953 0.7236 0.5369 0.6882 0.7099
0.0377 66.67 600 1.8834 0.7210 0.5396 0.6909 0.7104
0.037 67.78 610 1.8741 0.7210 0.5436 0.6937 0.7117
0.0367 68.89 620 1.8890 0.7214 0.5419 0.6917 0.7097
0.0384 70.0 630 1.8942 0.7238 0.5432 0.6921 0.7115
0.0368 71.11 640 1.8945 0.7250 0.5414 0.6907 0.7116
0.0369 72.22 650 1.9093 0.7235 0.5402 0.6896 0.7094
0.0374 73.33 660 1.9073 0.7221 0.5432 0.6942 0.7093
0.0368 74.44 670 1.8925 0.7202 0.5434 0.6936 0.7097
0.0374 75.56 680 1.8965 0.7187 0.5434 0.6936 0.7084
0.0369 76.67 690 1.9101 0.7200 0.5422 0.6931 0.7078
0.0369 77.78 700 1.9184 0.7186 0.5407 0.6915 0.7074
0.0368 78.89 710 1.9334 0.7218 0.5411 0.6896 0.7078
0.0366 80.0 720 1.9221 0.7227 0.5411 0.6907 0.7090
0.0364 81.11 730 1.9238 0.7227 0.5427 0.6922 0.7090
0.0369 82.22 740 1.9318 0.7198 0.5432 0.6931 0.7068
0.0364 83.33 750 1.9346 0.7210 0.5432 0.6931 0.7083
0.0377 84.44 760 1.9375 0.7212 0.5438 0.6914 0.7070
0.0358 85.56 770 1.9375 0.7217 0.5427 0.6922 0.7076
0.0363 86.67 780 1.9339 0.7206 0.5427 0.6914 0.7065
0.0376 87.78 790 1.9345 0.7206 0.5427 0.6914 0.7065
0.0363 88.89 800 1.9342 0.7198 0.5432 0.6931 0.7068
0.0361 90.0 810 1.9367 0.7186 0.5422 0.6931 0.7063
0.0363 91.11 820 1.9384 0.7198 0.5432 0.6931 0.7068
0.0366 92.22 830 1.9390 0.7186 0.5422 0.6931 0.7063
0.0369 93.33 840 1.9403 0.7206 0.5438 0.6914 0.7070
0.0358 94.44 850 1.9407 0.7212 0.5438 0.6914 0.7070
0.0354 95.56 860 1.9409 0.7212 0.5438 0.6914 0.7070
0.0369 96.67 870 1.9414 0.7212 0.5438 0.6914 0.7070
0.0361 97.78 880 1.9417 0.7212 0.5438 0.6914 0.7070
0.0365 98.89 890 1.9420 0.7212 0.5438 0.6914 0.7070
0.0364 100.0 900 1.9415 0.7212 0.5438 0.6914 0.7070
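
The ROUGE columns above can be computed with the evaluate library's rouge metric. The snippet below is a generic sketch with placeholder strings, not the evaluation code used for this model.

```python
# Generic ROUGE sketch using the `evaluate` library; predictions/references are placeholders.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["the cat sat on the mat"]        # model outputs (placeholder)
references = ["the cat is sitting on the mat"]  # ground-truth responses (placeholder)

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```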

Framework versions

  • PEFT 0.7.1
  • Transformers 4.35.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.16.0
  • Tokenizers 0.15.0