dmcooller/neural-matia-ft-2

Files changed (5)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6334
 ## Model description
@@ -44,24 +44,22 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- num_epochs: 12
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 3.9006        | 0.89  | 6    | 3.0027          |
-| 2.3509        | 1.93  | 13   | 1.9793          |
-| 1.4873        | 2.96  | 20   | 1.1137          |
-| 0.9306        | 4.0   | 27   | 0.7768          |
-| 0.8433        | 4.89  | 33   | 0.6989          |
-| 0.6696        | 5.93  | 40   | 0.6642          |
-| 0.6475        | 6.96  | 47   | 0.6511          |
-| 0.625         | 8.0   | 54   | 0.6426          |
-| 0.7258        | 8.89  | 60   | 0.6377          |
-| 0.6142        | 9.93  | 67   | 0.6344          |
-| 0.5787        | 10.67 | 72   | 0.6334          |
 ### Framework versions

 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.6516
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
+- num_epochs: 10
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 3.9004        | 0.89  | 6    | 3.0020          |
+| 2.3511        | 1.93  | 13   | 1.9686          |
+| 1.4935        | 2.96  | 20   | 1.1307          |
+| 0.9547        | 4.0   | 27   | 0.8176          |
+| 0.8782        | 4.89  | 33   | 0.7137          |
+| 0.6879        | 5.93  | 40   | 0.6794          |
+| 0.6661        | 6.96  | 47   | 0.6609          |
+| 0.6421        | 8.0   | 54   | 0.6534          |
+| 0.6662        | 8.89  | 60   | 0.6516          |
 ### Framework versions
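
The Step and Epoch columns in both tables are consistent with about 6.75 optimizer steps per epoch (epoch 4.0 is logged at step 27), with training stopping after 6 whole steps per configured epoch. This would explain why the 12-epoch run ends at step 72 (epoch 10.67) and the 10-epoch run at step 60 (epoch 8.89). Both constants below are inferred from the logged rows, not stated in the diff, so treat this as a sketch under those assumptions:

```python
# Assumption inferred from the table: epoch 4.0 is reached at step 27.
FRACTIONAL_STEPS_PER_EPOCH = 27 / 4.0  # 6.75

# Assumption: only whole optimizer steps count toward the step budget per epoch.
WHOLE_STEPS_PER_EPOCH = 6


def final_step(num_epochs: int) -> int:
    """Total optimizer steps for a run configured with num_epochs."""
    return num_epochs * WHOLE_STEPS_PER_EPOCH


def reported_epoch(step: int) -> float:
    """Epoch value as it would appear in the Epoch column for a given step."""
    return round(step / FRACTIONAL_STEPS_PER_EPOCH, 2)


for num_epochs in (12, 10):
    step = final_step(num_epochs)
    print(f"num_epochs={num_epochs}: last step {step}, reported epoch {reported_epoch(step)}")
```

Under these assumptions the last logged row of each table (step 72 and step 60) lines up with the `num_epochs` value on its side of the diff.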

adapter_config.json CHANGED

@@ -16,7 +16,7 @@
   "megatron_core": "megatron.core",
   "modules_to_save": null,
   "peft_type": "LORA",
-  "r": 32,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [

   "megatron_core": "megatron.core",
   "modules_to_save": null,
   "peft_type": "LORA",
+  "r": 30,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3fd337f744ef51fb7f49b2b43d23c66d1eef1e9b4d67a63b09bdb385ff6eaccd
-size 33563048

 version https://git-lfs.github.com/spec/v1
+oid sha256:f5531423f682609056b5bc113c1cd9474cf885b8f8e17f75d80ab64818f579fa
+size 31465896
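
The size change of `adapter_model.safetensors` tracks the change of `r` from 32 to 30 in `adapter_config.json`: LoRA weight count scales linearly with the rank, so the file should shrink by a fixed number of bytes per rank unit. A quick check using only the sizes and ranks from this diff; reading the leftover bytes as file header and metadata is an assumption:

```python
# Values taken from the diff above.
OLD_RANK, OLD_SIZE = 32, 33_563_048
NEW_RANK, NEW_SIZE = 30, 31_465_896

# Bytes of adapter weights contributed by each unit of rank.
bytes_per_rank = (OLD_SIZE - NEW_SIZE) // (OLD_RANK - NEW_RANK)

# Whatever is left after removing rank-proportional bytes.
# Assumption: this is the fixed safetensors header/metadata overhead.
overhead_old = OLD_SIZE - OLD_RANK * bytes_per_rank
overhead_new = NEW_SIZE - NEW_RANK * bytes_per_rank

print(bytes_per_rank)              # 1048576 (exactly 1 MiB per rank unit)
print(overhead_old, overhead_new)  # 8616 8616
```

The identical remainder on both sides suggests that only the rank-dependent tensors changed shape between the two commits.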

runs/Apr03_12-41-31_930e74ea1013/events.out.tfevents.1712148164.930e74ea1013.34.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:35c7b6c2b142c30fc5bf0b14ca84741347e1e7f0c37f48d3ac652cfca38a8899
+size 9841

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b6f53be7a06501cbdaad5b6be4d93f910d8d3e618869f51bfb741df20d9a42c7
 size 4920

 version https://git-lfs.github.com/spec/v1
+oid sha256:3f841f7415127fcf26208287a90ec73013e658d5051bab3a5a5913bffa1d0d53
 size 4920
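
The binary files in this diff are stored as Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object ID, and the byte size. A minimal sketch of reading such a pointer follows; the helper name `parse_lfs_pointer` is hypothetical and it handles only the three fields shown here:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer into its version, hash algorithm, oid, and size."""
    # Each non-empty line is "<key> <value>", separated by the first space.
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


# New-side pointer for training_args.bin, copied from the diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3f841f7415127fcf26208287a90ec73013e658d5051bab3a5a5913bffa1d0d53
size 4920
"""

info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"])  # sha256 4920
```

For `training_args.bin` the size is 4920 bytes on both sides of the diff, so the changed `oid` alone indicates that the file contents differ.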