---
license: mit
base_model: TheBloke/zephyr-7B-beta-GPTQ
tags:
- trl
- sft
- generated_from_trainer
metrics:
- rouge
model-index:
- name: zephyr-support-chatbot
  results: []
---


# zephyr-support-chatbot

This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ). The training dataset was not recorded in the Trainer metadata.
It achieves the following results on the evaluation set after the final training step (step 180, epoch 20):
- Loss: 1.2805
- Rouge1: 0.6842
- Rouge2: 0.4855
- Rougel: 0.6563
- Rougelsum: 0.6711
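
For readers unfamiliar with the metric, the sketch below shows roughly what a ROUGE-N F1 score measures: the overlap of word n-grams between a reference and a prediction. It is a simplified illustration only. The card does not say which ROUGE implementation produced the numbers above, and real implementations apply their own tokenization and optional stemming, so this function (`rouge_n_f1`, a hypothetical name) will not reproduce the reported scores exactly.

```python
from collections import Counter


def rouge_n_f1(reference: str, prediction: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1: overlap of lowercased, whitespace-split n-grams."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, pred = ngrams(reference), ngrams(prediction)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & pred).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Example with made-up strings (not taken from the evaluation set):
score_unigram = rouge_n_f1("a b c d", "a b x y", n=1)
score_bigram = rouge_n_f1("a b c d", "a b x y", n=2)
```

Rouge1 and Rouge2 correspond to `n=1` and `n=2`; RougeL and RougeLsum are based on the longest common subsequence instead and are not covered by this sketch.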

## Model description

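Per the card metadata (`trl` and `sft` tags), this is a supervised fine-tune (SFT) of the GPTQ-quantized Zephyr 7B beta base model, trained with TRL. More information needed.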

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 20
- mixed_precision_training: Native AMP
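
As a rough guide for reproducing this setup, the hyperparameters above can be written as keyword arguments for `transformers.TrainingArguments`. This is a sketch, not the original training script: the `output_dir` value is hypothetical, the batch sizes are assumed to be per-device values on a single device, and "Native AMP" is assumed to mean `fp16=True`.

```python
# Hyperparameters from the list above, keyed by transformers.TrainingArguments names.
training_kwargs = {
    "output_dir": "zephyr-support-chatbot",  # hypothetical path, not stated in the card
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumption: train_batch_size is per device
    "per_device_eval_batch_size": 8,    # assumption: eval_batch_size is per device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 20,
    "fp16": True,  # assumption: "Native AMP" refers to fp16 mixed precision
}

# With transformers installed, these could be passed on as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(**training_kwargs)
```

The dictionary covers only what the card lists. Settings such as warmup, weight decay, and any adapter or quantization configuration are not recorded here and would need to be supplied separately.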

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 2.422         | 1.11  | 10   | 2.7640          | 0.4291 | 0.1054 | 0.3461 | 0.3890    |
| 2.2454        | 2.22  | 20   | 2.5777          | 0.4423 | 0.1184 | 0.3607 | 0.4034    |
| 2.1454        | 3.33  | 30   | 2.3809          | 0.4713 | 0.1437 | 0.3860 | 0.4288    |
| 1.9437        | 4.44  | 40   | 2.1804          | 0.5021 | 0.1646 | 0.4027 | 0.4598    |
| 1.7975        | 5.56  | 50   | 2.0124          | 0.5355 | 0.1786 | 0.4425 | 0.4941    |
| 1.6621        | 6.67  | 60   | 1.8249          | 0.5540 | 0.2188 | 0.5011 | 0.5348    |
| 1.5141        | 7.78  | 70   | 1.6004          | 0.6161 | 0.3377 | 0.5701 | 0.5961    |
| 1.3291        | 8.89  | 80   | 1.4718          | 0.6513 | 0.3903 | 0.6072 | 0.6322    |
| 1.2206        | 10.0  | 90   | 1.3916          | 0.6652 | 0.4218 | 0.6265 | 0.6471    |
| 1.1767        | 11.11 | 100  | 1.3339          | 0.6840 | 0.4769 | 0.6489 | 0.6675    |
| 1.1462        | 12.22 | 110  | 1.3115          | 0.6807 | 0.4785 | 0.6506 | 0.6665    |
| 1.0924        | 13.33 | 120  | 1.2993          | 0.6843 | 0.4842 | 0.6539 | 0.6701    |
| 1.0602        | 14.44 | 130  | 1.2917          | 0.6854 | 0.4845 | 0.6561 | 0.6717    |
| 1.1177        | 15.56 | 140  | 1.2863          | 0.6835 | 0.4842 | 0.6547 | 0.6703    |
| 1.0756        | 16.67 | 150  | 1.2830          | 0.6838 | 0.4825 | 0.6549 | 0.6705    |
| 1.0894        | 17.78 | 160  | 1.2813          | 0.6838 | 0.4844 | 0.6560 | 0.6719    |
| 1.0649        | 18.89 | 170  | 1.2806          | 0.6842 | 0.4855 | 0.6563 | 0.6711    |
| 1.1019        | 20.0  | 180  | 1.2805          | 0.6842 | 0.4855 | 0.6563 | 0.6711    |
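
The table also lets a reader back out a few quantities the card does not state directly. The sketch below derives the steps per epoch from the last row, estimates the training-set size, and evaluates a cosine learning-rate schedule at a given step. It assumes a single device, no gradient accumulation, and no warmup, none of which the card confirms; the function names are illustrative.

```python
import math

TOTAL_STEPS = 180      # last row of the results table
NUM_EPOCHS = 20        # from the hyperparameter list
TRAIN_BATCH_SIZE = 16  # from the hyperparameter list
PEAK_LR = 2e-5         # from the hyperparameter list

# 180 steps over 20 epochs gives 9 optimizer steps per epoch.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# Assumption: one batch of 16 per step and a partial final batch is kept,
# so 9 steps per epoch implies between 129 and 144 training examples.
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE


def epoch_at(step: int) -> float:
    """Fractional epoch reached at a given optimizer step."""
    return step / steps_per_epoch


def cosine_lr(step: int) -> float:
    """Cosine decay from PEAK_LR to 0 over TOTAL_STEPS (assumes no warmup)."""
    progress = step / TOTAL_STEPS
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (10, 90, 170, 180):
        print(step, round(epoch_at(step), 2), cosine_lr(step))
```

Under these assumptions the learning rate is already close to zero over the last few logged steps, which is consistent with the validation loss and ROUGE scores barely changing after roughly step 130.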


### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0