timewanderer committed
Commit • a535177
Parent(s): 8ed7ce2
timewanderer/shawgpt-ft-model4
README.md
ADDED
@@ -0,0 +1,72 @@
---
base_model: TheBloke/Mistral-7B-Instruct-v0.2-GPTQ
library_name: peft
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: shawgpt-ft-model4
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# shawgpt-ft-model4

This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 2.8009

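Since the card specifies `library_name: peft`, this repository holds a PEFT (LoRA-style) adapter rather than full model weights. The sketch below shows one way to load it on top of the GPTQ base; the adapter id `timewanderer/shawgpt-ft-model4` comes from this commit, while the device placement and the example prompt are assumptions, and loading the GPTQ base additionally requires a GPTQ backend such as `auto-gptq`/`optimum`:

```python
# Minimal sketch: attach the PEFT adapter to the quantized base model.
# Assumes a GPTQ-capable backend (e.g. auto-gptq + optimum) is installed.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ"
adapter_id = "timewanderer/shawgpt-ft-model4"  # repo id from this commit

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)

# Mistral-Instruct uses the [INST] ... [/INST] prompt format.
prompt = "[INST] Summarize what a LoRA adapter is. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
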
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 5
- num_epochs: 10
- mixed_precision_training: Native AMP

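For anyone trying to reproduce the run, these values map onto `transformers.TrainingArguments` roughly as follows; the `output_dir` name is hypothetical, and `fp16=True` is an assumed reading of "Native AMP":

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
# Note: 4 (per-device batch) x 4 (accumulation) gives the total_train_batch_size of 16.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="shawgpt-ft-model4",   # hypothetical directory name
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=5,
    num_train_epochs=10,
    fp16=True,                        # assumed equivalent of "Native AMP"
    optim="adamw_torch",              # betas=(0.9, 0.999), eps=1e-8 by default
)
```
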
### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 4.6424        | 0.9231 | 3    | 4.2062          |
| 4.5277        | 1.8462 | 6    | 4.0429          |
| 4.2535        | 2.7692 | 9    | 3.7825          |
| 2.9545        | 4.0    | 13   | 3.4616          |
| 3.6837        | 4.9231 | 16   | 3.2524          |
| 3.4407        | 5.8462 | 19   | 3.0824          |
| 3.2698        | 6.7692 | 22   | 2.9541          |
| 2.3491        | 8.0    | 26   | 2.8439          |
| 3.0525        | 8.9231 | 29   | 2.8052          |
| 2.0563        | 9.2308 | 30   | 2.8009          |


### Framework versions

- PEFT 0.13.2
- Transformers 4.44.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.19.1
runs/Oct15_16-50-28_a0f238e8381b/events.out.tfevents.1729011030.a0f238e8381b.1517.1
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:ea10430bccb8dffbf57002c304b88abf7541aed7ba8988f031fe7cdd199cf0f8
+size 10669