dmcooller committed on
Commit
9170f03
1 Parent(s): ae52d03

dmcooller/neural-matia-ft-2

README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.6334
+ - Loss: 0.6516
 
  ## Model description
 
@@ -44,24 +44,22 @@ The following hyperparameters were used during training:
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 2
- - num_epochs: 12
+ - num_epochs: 10
  - mixed_precision_training: Native AMP
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 3.9006 | 0.89 | 6 | 3.0027 |
- | 2.3509 | 1.93 | 13 | 1.9793 |
- | 1.4873 | 2.96 | 20 | 1.1137 |
- | 0.9306 | 4.0 | 27 | 0.7768 |
- | 0.8433 | 4.89 | 33 | 0.6989 |
- | 0.6696 | 5.93 | 40 | 0.6642 |
- | 0.6475 | 6.96 | 47 | 0.6511 |
- | 0.625 | 8.0 | 54 | 0.6426 |
- | 0.7258 | 8.89 | 60 | 0.6377 |
- | 0.6142 | 9.93 | 67 | 0.6344 |
- | 0.5787 | 10.67 | 72 | 0.6334 |
+ | 3.9004 | 0.89 | 6 | 3.0020 |
+ | 2.3511 | 1.93 | 13 | 1.9686 |
+ | 1.4935 | 2.96 | 20 | 1.1307 |
+ | 0.9547 | 4.0 | 27 | 0.8176 |
+ | 0.8782 | 4.89 | 33 | 0.7137 |
+ | 0.6879 | 5.93 | 40 | 0.6794 |
+ | 0.6661 | 6.96 | 47 | 0.6609 |
+ | 0.6421 | 8.0 | 54 | 0.6534 |
+ | 0.6662 | 8.89 | 60 | 0.6516 |
 
 
  ### Framework versions
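The Epoch and Step columns in the new results table track each other at a constant rate (step 54 at epoch 8.0 implies 6.75 optimizer steps per epoch), which is worth a quick sanity check when comparing the two runs. A minimal arithmetic sketch using only the values from the table above:

```python
# Sanity-check the epoch/step bookkeeping in the training-results table.
# The epoch-8.0 row sits at step 54, so the run advances at a constant
steps_per_epoch = 54 / 8.0  # 6.75 optimizer steps per epoch

# Every other row should agree; e.g. the first and last logged rows:
assert round(6 / steps_per_epoch, 2) == 0.89   # step 6  -> epoch 0.89
assert round(60 / steps_per_epoch, 2) == 8.89  # step 60 -> epoch 8.89

print(f"consistent at {steps_per_epoch} steps/epoch")
```

The same rate holds for the old table (step 72 at epoch 10.67), so the only training-schedule change in this commit is the epoch budget dropping from 12 to 10.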
adapter_config.json CHANGED
@@ -16,7 +16,7 @@
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
- "r": 32,
+ "r": 30,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
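The only change in this file is the LoRA rank `r` dropping from 32 to 30. LoRA adds two low-rank factors per target module, so its trainable-parameter count is linear in `r`. A sketch of that relationship for illustration only: the 4096×4096 projection below is an assumed example (Mistral-7B's hidden size), since the actual `target_modules` list is truncated in the hunk above.

```python
# LoRA wraps a d_out x d_in linear layer with factors A (r x d_in) and
# B (d_out x r), so the added trainable parameters scale linearly with r.
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    return r * d_in + d_out * r

# Hypothetical single 4096x4096 projection, for scale:
print(lora_params(4096, 4096, 32))  # 262144 parameters at r=32
print(lora_params(4096, 4096, 30))  # 245760 at r=30, i.e. 30/32 of the above
```

Whatever the real module list is, every adapter tensor shrinks by the same 30/32 factor.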
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3fd337f744ef51fb7f49b2b43d23c66d1eef1e9b4d67a63b09bdb385ff6eaccd
- size 33563048
+ oid sha256:f5531423f682609056b5bc113c1cd9474cf885b8f8e17f75d80ab64818f579fa
+ size 31465896
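Since LoRA parameter count is linear in the rank, lowering `r` from 32 to 30 should shrink the adapter weights by roughly the factor 30/32, and the two LFS pointer sizes above bear that out. This is a back-of-the-envelope check, not exact accounting (the safetensors header and dtype metadata add a small constant overhead):

```python
# Predict the new adapter file size from the old one, assuming the
# weight payload scales linearly with the LoRA rank (32 -> 30).
old_size = 33_563_048  # bytes at r=32, from the old LFS pointer
new_size = 31_465_896  # bytes at r=30, from the new LFS pointer

predicted = old_size * 30 / 32
error = abs(predicted - new_size) / new_size
print(f"predicted {predicted:.0f} bytes, actual {new_size} ({error:.4%} off)")
```

The prediction lands within about 0.002% of the actual file size, so the size change is fully explained by the rank reduction.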
runs/Apr03_12-41-31_930e74ea1013/events.out.tfevents.1712148164.930e74ea1013.34.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:35c7b6c2b142c30fc5bf0b14ca84741347e1e7f0c37f48d3ac652cfca38a8899
+ size 9841
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b6f53be7a06501cbdaad5b6be4d93f910d8d3e618869f51bfb741df20d9a42c7
+ oid sha256:3f841f7415127fcf26208287a90ec73013e658d5051bab3a5a5913bffa1d0d53
  size 4920