error577 committed on
Commit 130b34d · verified · 1 Parent(s): 7be8242

End of training

Files changed (2):
  1. README.md +11 -7
  2. adapter_model.bin +2 -2
README.md CHANGED
@@ -62,14 +62,14 @@ lora_alpha: 16
  lora_dropout: 0.05
  lora_fan_in_fan_out: null
  lora_model_dir: null
- lora_r: 8
+ lora_r: 32
  lora_target_linear: true
  lr_scheduler: cosine
- max_steps: 100
+ max_steps: 500
  micro_batch_size: 1
  mlflow_experiment_name: /tmp/dcd10050e81faec5_train_data.json
  model_type: AutoModelForCausalLM
- num_epochs: 1
+ num_epochs: 4
  optimizer: adamw_bnb_8bit
  output_dir: miner_id_24
  pad_to_sequence_len: true
@@ -77,7 +77,7 @@ resume_from_checkpoint: null
  s2_attention: null
  sample_packing: false
  saves_per_epoch: 1
- sequence_len: 512
+ sequence_len: 1024
  strict: false
  tf32: false
  tokenizer_type: AutoTokenizer
@@ -102,7 +102,7 @@ xformers_attention: null

  This model is a fine-tuned version of [unsloth/tinyllama](https://huggingface.co/unsloth/tinyllama) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.1172
+ - Loss: 2.0288

  ## Model description

@@ -130,13 +130,17 @@ The following hyperparameters were used during training:
  - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 10
- - training_steps: 100
+ - training_steps: 500

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
- | 2.1574 | 0.0249 | 100 | 2.1172 |
+ | 2.9282 | 0.0002 | 1 | 3.6463 |
+ | 1.7301 | 0.0311 | 125 | 2.0661 |
+ | 2.1628 | 0.0622 | 250 | 2.0405 |
+ | 2.3675 | 0.0933 | 375 | 2.0310 |
+ | 1.958 | 0.1244 | 500 | 2.0288 |


  ### Framework versions
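The updated config keeps `lr_scheduler: cosine` with `lr_scheduler_warmup_steps: 10`, now stretched over 500 training steps. A minimal sketch of that warmup-then-cosine-decay curve (the base learning rate below is a placeholder, not a value taken from this config; Transformers' `get_cosine_schedule_with_warmup` produces the same shape):

```python
import math

def lr_at(step, base_lr, warmup_steps=10, total_steps=500):
    """Linear warmup to base_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Peak at the end of warmup; decayed to ~0 at the final step.
print(lr_at(10, 2e-4), lr_at(500, 2e-4))
```

The slow late-stage decay is one reason the validation loss in the table above keeps inching down between steps 250 and 500.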
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4b4b23a79a04d4a7a5940cdfe0a6029ffe3af055d536b798ef0c3dc1f81a0385
- size 25342042
+ oid sha256:753d8cee6d7ba4c42f760b97f929dd77e9e3298fee2ae3d2071a443ee83652ae
+ size 101036698
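The adapter file roughly quadruples (≈25 MB → ≈101 MB), which is consistent with `lora_r` going from 8 to 32: LoRA adapter parameter count scales linearly with the rank. A minimal sketch (the layer shapes below are hypothetical, for illustration only, not read from this checkpoint):

```python
def lora_param_count(layer_shapes, r):
    # Each targeted linear layer of shape (d_out, d_in) gains two
    # low-rank factors: A of shape (r, d_in) and B of shape (d_out, r),
    # so its adapter contributes r * (d_in + d_out) parameters.
    return sum(r * (d_in + d_out) for d_out, d_in in layer_shapes)

# Hypothetical TinyLlama-like linear shapes (illustrative only).
shapes = [(2048, 2048), (256, 2048), (5632, 2048), (2048, 5632)]
ratio = lora_param_count(shapes, 32) / lora_param_count(shapes, 8)
print(ratio)  # rank 32 vs rank 8: 4x the adapter parameters
```

The 4x parameter ratio matches the observed size ratio (101036698 / 25342042 ≈ 3.99) for a checkpoint stored at a fixed bytes-per-parameter width.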