Dataset-t
committed on
Model save
- README.md +6 -4
- all_results.json +7 -12
- runs/Oct07_20-06-10_dilara/events.out.tfevents.1728331599.dilara.984420.0 +2 -2
- train_results.json +7 -7
- trainer_state.json +0 -0
README.md
CHANGED
@@ -1,11 +1,10 @@
 ---
 base_model: mistralai/Mistral-7B-v0.1
 datasets:
--
+- generator
 library_name: peft
 license: apache-2.0
 tags:
-- alignment-handbook
 - trl
 - sft
 - generated_from_trainer
@@ -19,9 +18,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # zephyr-7b-sft-qlora
 
-This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the
+This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.1659
 
 ## Model description
 
@@ -56,6 +55,9 @@ The following hyperparameters were used during training:
 
 ### Training results
 
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 0.4852        | 1.0   | 134  | 0.1659          |
 
 
 ### Framework versions
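The updated card describes a PEFT (QLoRA) adapter trained with TRL's SFT trainer on top of mistralai/Mistral-7B-v0.1. Below is a minimal sketch of how such an adapter is typically loaded for inference; the repo id and the 4-bit quantization settings are placeholders/assumptions, since neither appears in this diff.

```python
# Sketch: attach the QLoRA adapter to the Mistral-7B base model for inference.
# "your-username/zephyr-7b-sft-qlora" is a placeholder repo id, not taken from this commit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "your-username/zephyr-7b-sft-qlora"  # placeholder

# 4-bit NF4 loading mirrors a typical QLoRA setup; the exact training config is not in this diff.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # loads the LoRA adapter weights on top

prompt = "Explain QLoRA in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```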
all_results.json
CHANGED
@@ -1,14 +1,9 @@
 {
-    "epoch": 0
-    "
-    "
-    "
-    "
-    "
-    "
-    "train_loss": 0.0,
-    "train_runtime": 0.0427,
-    "train_samples": 2672,
-    "train_samples_per_second": 9970.556,
-    "train_steps_per_second": 304.266
+    "epoch": 1.0,
+    "total_flos": 3.768774247149732e+17,
+    "train_loss": 0.5603564016854585,
+    "train_runtime": 1424.9095,
+    "train_samples": 26722,
+    "train_samples_per_second": 3.001,
+    "train_steps_per_second": 0.094
 }
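The new throughput figures are internally consistent with the README's training table: steps per second times runtime recovers roughly the 134 optimizer steps, and samples per second divided by steps per second suggests an effective batch of about 32 sequences per step (an inference; the batch size itself is not part of this diff). A quick check using only numbers from this commit:

```python
# Sanity-check the throughput numbers reported in all_results.json / train_results.json.
runtime_s = 1424.9095       # "train_runtime"
steps_per_s = 0.094         # "train_steps_per_second"
samples_per_s = 3.001       # "train_samples_per_second"

steps = steps_per_s * runtime_s                 # ~134, matches the Step column in the README table
effective_batch = samples_per_s / steps_per_s   # ~32 samples per optimizer step (inferred)

print(f"steps ~ {steps:.0f}, effective batch ~ {effective_batch:.0f}")
```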
runs/Oct07_20-06-10_dilara/events.out.tfevents.1728331599.dilara.984420.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:2791b045cbec7ed35c0ccb16099b81fd35d77e999c1d089e06615a55d3cbf033
+size 12775
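The TensorBoard event file is tracked with Git LFS, so the diff shows only the updated pointer (oid and size, about 12.8 kB) rather than the binary log itself. A minimal sketch for fetching the actual file with huggingface_hub; the repo id is a placeholder, as the commit page does not spell it out:

```python
# Download the LFS-backed TensorBoard event file referenced by this pointer.
# "your-username/zephyr-7b-sft-qlora" is a placeholder; substitute the real repo id.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="your-username/zephyr-7b-sft-qlora",
    filename="runs/Oct07_20-06-10_dilara/events.out.tfevents.1728331599.dilara.984420.0",
)
print(path)  # local cache path to the ~12.8 kB events file
```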
train_results.json
CHANGED
@@ -1,9 +1,9 @@
 {
-    "epoch": 0
-    "total_flos": 3.
-    "train_loss": 0.
-    "train_runtime":
-    "train_samples":
-    "train_samples_per_second":
-    "train_steps_per_second":
+    "epoch": 1.0,
+    "total_flos": 3.768774247149732e+17,
+    "train_loss": 0.5603564016854585,
+    "train_runtime": 1424.9095,
+    "train_samples": 26722,
+    "train_samples_per_second": 3.001,
+    "train_steps_per_second": 0.094
 }
trainer_state.json
CHANGED
The diff for this file is too large to render. See raw diff.