FatCat87 committed
Commit 6d13ce0 · verified · 1 Parent(s): 2db696d

End of training

Files changed (2):
1. README.md +23 -23
2. adapter_model.bin +2 -2
README.md CHANGED
@@ -1,12 +1,12 @@
 ---
-license: llama3
+license: apache-2.0
 library_name: peft
 tags:
 - axolotl
 - generated_from_trainer
-base_model: unsloth/Qwen2.5-3B
+base_model: EleutherAI/pythia-70m-deduped
 model-index:
-- name: 080fd5a4-0250-4b62-86a9-7eef387d5b80
+- name: 6d89a915-cad3-42ab-8d1a-5e9e9e98151c
   results: []
 ---
 
@@ -19,19 +19,19 @@ should probably proofread and complete it, then remove this comment. -->
 axolotl version: `0.4.1`
 ```yaml
 adapter: lora
-base_model: scb10x/llama-3-typhoon-v1.5-8b-instruct
+base_model: EleutherAI/pythia-70m-deduped
 bf16: auto
 datasets:
 - data_files:
-  - 8da150510918d7cc_train_data.json
+  - c0e356afd17a58f1_train_data.json
   ds_type: json
   format: custom
-  path: 8da150510918d7cc_train_data.json
+  path: c0e356afd17a58f1_train_data.json
   type:
     field: null
     field_input: null
-    field_instruction: paper_title
-    field_output: paper_abstract
+    field_instruction: ruby_text
+    field_output: text
     field_system: null
     format: null
     no_input_format: null
@@ -51,7 +51,7 @@ fsdp_config: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
-hub_model_id: FatCat87/080fd5a4-0250-4b62-86a9-7eef387d5b80
+hub_model_id: FatCat87/6d89a915-cad3-42ab-8d1a-5e9e9e98151c
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: true
@@ -71,9 +71,10 @@ pad_to_sequence_len: true
 resume_from_checkpoint: null
 sample_packing: true
 saves_per_epoch: 1
-seed: 31014
+seed: 26260
 sequence_len: 4096
-special_tokens: null
+special_tokens:
+  pad_token: <|endoftext|>
 strict: false
 tf32: false
 tokenizer_type: AutoTokenizer
@@ -82,9 +83,9 @@ val_set_size: 0.1
 wandb_entity: fatcat87-taopanda
 wandb_log_model: null
 wandb_mode: online
-wandb_name: 080fd5a4-0250-4b62-86a9-7eef387d5b80
+wandb_name: 6d89a915-cad3-42ab-8d1a-5e9e9e98151c
 wandb_project: subnet56
-wandb_runid: 080fd5a4-0250-4b62-86a9-7eef387d5b80
+wandb_runid: 6d89a915-cad3-42ab-8d1a-5e9e9e98151c
 wandb_watch: null
 warmup_ratio: 0.05
 weight_decay: 0.0
@@ -94,12 +95,12 @@ xformers_attention: null
 
 </details><br>
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/5fyifub3)
-# 080fd5a4-0250-4b62-86a9-7eef387d5b80
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/51bg82x5)
+# 6d89a915-cad3-42ab-8d1a-5e9e9e98151c
 
-This model is a fine-tuned version of [scb10x/llama-3-typhoon-v1.5-8b-instruct](https://huggingface.co/scb10x/llama-3-typhoon-v1.5-8b-instruct) on the None dataset.
+This model is a fine-tuned version of [EleutherAI/pythia-70m-deduped](https://huggingface.co/EleutherAI/pythia-70m-deduped) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2622
+- Loss: 32.2953
 
 ## Model description
 
@@ -121,7 +122,7 @@ The following hyperparameters were used during training:
 - learning_rate: 0.0002
 - train_batch_size: 2
 - eval_batch_size: 2
-- seed: 31014
+- seed: 26260
 - distributed_type: multi-GPU
 - num_devices: 2
 - gradient_accumulation_steps: 4
@@ -129,17 +130,16 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 2
 - num_epochs: 1
 
 ### Training results
 
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 2.5948        | 0.0215 | 1    | 2.5959          |
-| 2.2907        | 0.2581 | 12   | 2.3007          |
-| 2.2559        | 0.5161 | 24   | 2.2711          |
-| 2.2303        | 0.7742 | 36   | 2.2622          |
+| 48.9386       | 0.0635 | 1    | 45.6835         |
+| 47.7809       | 0.2540 | 4    | 44.7200         |
+| 34.2772       | 0.5079 | 8    | 38.5010         |
+| 30.6758       | 0.7619 | 12   | 32.2953         |
 
 
 ### Framework versions
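The new dataset stanza points axolotl's custom JSON format at a local file and maps `field_instruction` to the `ruby_text` key and `field_output` to the `text` key. A minimal sketch of what such a training file plausibly looks like, loaded with the `datasets` JSON loader; the sample record itself is invented, only the key names and the filename come from the diff:

```python
# Hypothetical illustration of the training-file shape implied by
# field_instruction: ruby_text / field_output: text in the config above.
# The record content is invented; only the key names appear in the diff.
import json
from datasets import load_dataset

records = [{"ruby_text": "とうきょう", "text": "東京"}]
with open("c0e356afd17a58f1_train_data.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False)

ds = load_dataset("json", data_files="c0e356afd17a58f1_train_data.json", split="train")
print(ds[0])  # {'ruby_text': 'とうきょう', 'text': '東京'}
```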
 
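To try the retrained adapter, here is a minimal loading sketch assuming the standard peft and transformers APIs; the repo id is the `hub_model_id` from the config, the pad token mirrors the new `special_tokens` entry, and the prompt is a placeholder:

```python
# Minimal sketch for loading the retrained LoRA adapter on its new base model.
# Assumes standard peft/transformers APIs; the prompt text is a placeholder.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "EleutherAI/pythia-70m-deduped"
ADAPTER = "FatCat87/6d89a915-cad3-42ab-8d1a-5e9e9e98151c"  # hub_model_id above

base = AutoModelForCausalLM.from_pretrained(BASE)
tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = "<|endoftext|>"  # matches special_tokens in the config

model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

inputs = tokenizer("placeholder prompt", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Per the listed hyperparameters, one optimizer step sees 2 sequences per device × 2 devices × 4 gradient-accumulation steps = 16 sequences, while the eval batch of 4 is 2 × 2 with no accumulation.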
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8c06a24f56b4e9d832307fa8e5f4e193bff9487f97556b67e442b1844f74995c
-size 184419648
+oid sha256:eb0b4cd4d9927ffd90e5c7c4fe44fcfb21156e57107514aba930818b3f037eb2
+size 6309118
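Because adapter_model.bin is tracked with Git LFS, the commit only rewrites the pointer file: its sha256 oid and byte size. The drop from 184,419,648 to 6,309,118 bytes (~184 MB to ~6.3 MB) is consistent with moving from an 8B-parameter base to the 70M-parameter Pythia. A hedged sketch for fetching the actual weights with `huggingface_hub`:

```python
# Resolve the LFS pointer to the real adapter weights via the Hub API.
# Omitting `revision` fetches the latest; pinning this exact artifact would
# need the full commit hash, of which only the short form 6d13ce0 is shown.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="FatCat87/6d89a915-cad3-42ab-8d1a-5e9e9e98151c",
    filename="adapter_model.bin",
)
print(path)  # local cache path to the ~6.3 MB adapter file
```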