Mayilde/shawgpt-ft

Files changed (3)

README.md +27 -13
runs/Oct21_11-55-14_455caaabfdfd/events.out.tfevents.1729511721.455caaabfdfd.2464.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.3485
 ## Model description
@@ -42,26 +42,40 @@ The following hyperparameters were used during training:
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 2
-- num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 1.1194        | 0.9231 | 3    | 1.3438          |
-| 1.1165        | 1.8462 | 6    | 1.3420          |
-| 1.0754        | 2.7692 | 9    | 1.3242          |
-| 0.7662        | 4.0    | 13   | 1.3482          |
-| 0.9215        | 4.6154 | 15   | 1.3485          |
 ### Framework versions
-- PEFT 0.13.0
-- Transformers 4.45.1
 - Pytorch 2.1.0+cu121
 - Datasets 3.0.1
-- Tokenizers 0.20.0

 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.4930
 ## Model description
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
+- num_epochs: 20
 - mixed_precision_training: Native AMP
 ### Training results
+| Training Loss | Epoch   | Step | Validation Loss |
+|:-------------:|:-------:|:----:|:---------------:|
+| 4.5948        | 0.9231  | 3    | 3.9703          |
+| 4.0516        | 1.8462  | 6    | 3.4377          |
+| 3.4456        | 2.7692  | 9    | 2.9505          |
+| 2.2132        | 4.0     | 13   | 2.4876          |
+| 2.5771        | 4.9231  | 16   | 2.2085          |
+| 2.1985        | 5.8462  | 19   | 1.9655          |
+| 1.9692        | 6.7692  | 22   | 1.8320          |
+| 1.3913        | 8.0     | 26   | 1.7772          |
+| 1.7898        | 8.9231  | 29   | 1.6993          |
+| 1.7076        | 9.8462  | 32   | 1.6772          |
+| 1.7068        | 10.7692 | 35   | 1.6532          |
+| 1.2376        | 12.0    | 39   | 1.6191          |
+| 1.64          | 12.9231 | 42   | 1.5902          |
+| 1.5758        | 13.8462 | 45   | 1.5654          |
+| 1.576         | 14.7692 | 48   | 1.5560          |
+| 1.1826        | 16.0    | 52   | 1.5495          |
+| 1.5448        | 16.9231 | 55   | 1.5249          |
+| 1.499         | 17.8462 | 58   | 1.5051          |
+| 1.0455        | 18.4615 | 60   | 1.4930          |
 ### Framework versions
+- PEFT 0.13.2
+- Transformers 4.45.2
 - Pytorch 2.1.0+cu121
 - Datasets 3.0.1
+- Tokenizers 0.20.1
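
A note on reading the results tables: the logged epoch values are consistent with 13 micro-batches per epoch combined with the listed `gradient_accumulation_steps: 4`. The figure of 13 is an inference from the logged numbers, not something the README states, and the helper names below are hypothetical. A minimal sketch of that arithmetic:

```python
GRAD_ACCUM_STEPS = 4          # from the README hyperparameters
MICRO_BATCHES_PER_EPOCH = 13  # assumption: inferred from the logged epoch values


def epoch_at_step(step: int) -> float:
    """Fractional epoch after `step` optimizer steps, rounded like the table."""
    micro_batches_seen = step * GRAD_ACCUM_STEPS
    return round(micro_batches_seen / MICRO_BATCHES_PER_EPOCH, 4)


def total_optimizer_steps(num_epochs: int) -> int:
    """Total steps if each epoch contributes only whole accumulation groups."""
    steps_per_epoch = MICRO_BATCHES_PER_EPOCH // GRAD_ACCUM_STEPS
    return num_epochs * steps_per_epoch


if __name__ == "__main__":
    for step in (3, 13, 15, 60):
        print(step, epoch_at_step(step))
    print(total_optimizer_steps(5), total_optimizer_steps(20))
```

Under this assumption, step 3 maps to epoch 0.9231 and step 60 to epoch 18.4615, and 5 and 20 epochs give 15 and 60 total steps, which matches the last row of each table. It would also explain why the final logged epoch is below `num_epochs` in both runs.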

runs/Oct21_11-55-14_455caaabfdfd/events.out.tfevents.1729511721.455caaabfdfd.2464.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d898e8cd6766c10652d8bf62728347752f454dbeb63e3e235b3a2ff97e2fa7fe
+size 5611

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:46331811dd064a064e99e86ac81f5ec6b7e87a5ba66de6c4f0b303193b6be495
 size 5176

 version https://git-lfs.github.com/spec/v1
+oid sha256:d6ce4cb54eea19b49f001029c65fc3a426270971c6d6d421b13ec1f596f96b52
 size 5176
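
Both binary files in this commit appear in the diff only as Git LFS pointers (a `version` line, an `oid sha256:...` line, and a `size` line), so the diff shows a changed hash rather than changed contents. As a rough sketch of how a downloaded file could be checked against such a pointer, assuming the three-line layout shown above (the function names are hypothetical, not part of any Git LFS tooling):

```python
import hashlib

# Pointer text copied from the added events file in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d898e8cd6766c10652d8bf62728347752f454dbeb63e3e235b3a2ff97e2fa7fe
size 5611
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line and break the oid into algorithm and digest."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """True if the bytes have the size and hash that the pointer records."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER))
```

For `training_args.bin` the size stays at 5176 bytes while the oid changes, which is what one would expect if the serialized arguments differ only in a few values; the pointer alone cannot confirm which ones.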