dmcooller/neural-matia-phi-ft


Files changed (5)

README.md +16 -20
adapter_config.json +1 -1
adapter_model.safetensors +2 -2
runs/Apr04_09-13-27_41c2082af0da/events.out.tfevents.1712222058.41c2082af0da.34.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -1,9 +1,9 @@
 ---
-license: apache-2.0
 library_name: peft
 tags:
 - generated_from_trainer
-base_model: TheBloke/Mistral-7B-Instruct-v0.2-GPTQ
 model-index:
 - name: neural-matia-ft
   results: []
@@ -14,9 +14,9 @@ should probably proofread and complete it, then remove this comment. -->
 # neural-matia-ft
-This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.8002
 ## Model description
@@ -36,33 +36,29 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 5
-- eval_batch_size: 5
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 20
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- num_epochs: 12
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 3.8166        | 0.95  | 5    | 3.3249          |
-| 2.9744        | 1.9   | 10   | 2.5352          |
-| 2.2051        | 2.86  | 15   | 1.9065          |
-| 1.288         | 4.0   | 21   | 1.2826          |
-| 1.0574        | 4.95  | 26   | 1.0073          |
-| 0.8263        | 5.9   | 31   | 0.8886          |
-| 0.7487        | 6.86  | 36   | 0.8408          |
-| 0.5904        | 8.0   | 42   | 0.8178          |
-| 0.6909        | 8.95  | 47   | 0.8095          |
-| 0.681         | 9.9   | 52   | 0.8039          |
-| 0.6745        | 10.86 | 57   | 0.8008          |
-| 0.4618        | 11.43 | 60   | 0.8002          |
 ### Framework versions

 ---
+license: mit
 library_name: peft
 tags:
 - generated_from_trainer
+base_model: microsoft/phi-2
 model-index:
 - name: neural-matia-ft
   results: []
 # neural-matia-ft
+This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: nan
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
+- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
+- num_epochs: 10
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 0.0           | 0.8   | 3    | nan             |
+| 0.0           | 1.87  | 7    | nan             |
+| 0.0           | 2.93  | 11   | nan             |
+| 0.0           | 4.0   | 15   | nan             |
+| 0.0           | 4.8   | 18   | nan             |
+| 0.0           | 5.87  | 22   | nan             |
+| 0.0           | 6.93  | 26   | nan             |
+| 0.0           | 8.0   | 30   | nan             |
 ### Framework versions
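
Reviewer note: the new README values can be sanity-checked with a short script. The sketch below assumes a single training device (the README does not state the device count) and copies the hyperparameters and table rows from the new side of this diff; the helper names `effective_batch_size` and `degenerate_steps` are hypothetical and not part of any library.

```python
import math

# Hyperparameters copied from the new README in this diff.
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4
REPORTED_TOTAL_TRAIN_BATCH_SIZE = 32


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Samples consumed per optimizer step.

    num_devices=1 is an assumption; the README does not list it.
    """
    return per_device * accumulation_steps * num_devices


# (epoch, step, training_loss, validation_loss) rows from the new table.
NEW_RESULTS = [
    (0.8, 3, 0.0, float("nan")),
    (1.87, 7, 0.0, float("nan")),
    (2.93, 11, 0.0, float("nan")),
    (4.0, 15, 0.0, float("nan")),
    (4.8, 18, 0.0, float("nan")),
    (5.87, 22, 0.0, float("nan")),
    (6.93, 26, 0.0, float("nan")),
    (8.0, 30, 0.0, float("nan")),
]


def degenerate_steps(rows):
    """Return the steps whose training loss is exactly 0.0 or whose
    validation loss is not a finite number."""
    return [
        step
        for _epoch, step, train_loss, val_loss in rows
        if train_loss == 0.0 or not math.isfinite(val_loss)
    ]


if __name__ == "__main__":
    total = effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS)
    print("effective batch size:", total)
    print("matches README:", total == REPORTED_TOTAL_TRAIN_BATCH_SIZE)
    print("degenerate steps:", degenerate_steps(NEW_RESULTS))
```

Under that single-device assumption the reported `total_train_batch_size` is consistent with the other hyperparameters on both sides of the diff (5 × 4 = 20 before, 8 × 4 = 32 after). Every logged row of the new run is flagged, since each pairs a training loss of 0.0 with a validation loss of `nan`. The last logged row is epoch 8.0 (step 30), although `num_epochs` is 10.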

adapter_config.json CHANGED

@@ -16,7 +16,7 @@
   "megatron_core": "megatron.core",
   "modules_to_save": null,
   "peft_type": "LORA",
-  "r": 6,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [

   "megatron_core": "megatron.core",
   "modules_to_save": null,
   "peft_type": "LORA",
+  "r": 32,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
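
Reviewer note: raising `r` from 6 to 32 increases the number of trainable LoRA parameters per adapted layer. For one adapted weight of shape `d_out × d_in`, LoRA adds a matrix `A` of shape `r × d_in` and a matrix `B` of shape `d_out × r`, so the count is `r * (d_in + d_out)`. The sketch below illustrates that arithmetic only; the layer dimensions are hypothetical placeholders, because the `target_modules` list is truncated in this diff, and `lora_params_per_module` is an illustrative helper, not a PEFT function.

```python
def lora_params_per_module(r, d_in, d_out):
    """Trainable parameters LoRA adds to one d_out x d_in weight:
    A has shape (r, d_in) and B has shape (d_out, r)."""
    return r * (d_in + d_out)


# Hypothetical square projection; the real target modules and their
# shapes are not visible in this diff.
D_IN = 2560
D_OUT = 2560

old_params = lora_params_per_module(6, D_IN, D_OUT)
new_params = lora_params_per_module(32, D_IN, D_OUT)
ratio = new_params / old_params

if __name__ == "__main__":
    print("r=6 :", old_params)
    print("r=32:", new_params)
    print("ratio:", round(ratio, 2))
```

For fixed layer shapes the count scales linearly with `r`. The size change of `adapter_model.safetensors` in this commit need not follow the same ratio, because the base model, and therefore the shapes and number of adapted layers, changed as well.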

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2797f1998e7683a76a05fb207bc0782ec4ec7c6237fbfef8c87db22eaae82793
-size 6300984

 version https://git-lfs.github.com/spec/v1
+oid sha256:77673cfc5048e5bbfb2e5c2ce61e1420d8c60d054687bf0292da233c0632565f
+size 20981200

runs/Apr04_09-13-27_41c2082af0da/events.out.tfevents.1712222058.41c2082af0da.34.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c94e402e1b71b6ec7d8db789fd9844614670aa0fd51ac5033e1bc959b9a62e2a
+size 8966

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f6bd2dd89f06aa97e70272158e4640129f45f3820f5a53689c2250752d2b4131
 size 4920

 version https://git-lfs.github.com/spec/v1
+oid sha256:cb1459850c04205e33af9a4fcca8c5e857333c34944fcbb9b307b96ea74c1836
 size 4920