FatCat87 committed
Commit 4af307a · verified · 1 Parent(s): 1f23c61

End of training

Files changed (2):
  1. README.md +21 -21
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -57,9 +57,9 @@ lora_target_modules:
 - query_key_value
 micro_batch_size: 4
 num_epochs: 4
-output_dir: ./outputs/lora-alpaca-pythia
+output_dir: ./outputs/lora-alpaca-pythia/taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4
 resume_from_checkpoint: null
-seed: 77054
+seed: 84664
 sequence_len: 512
 special_tokens:
   pad_token: <|endoftext|>
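
For context on the fields touched here: `lora_target_modules: [query_key_value]` points LoRA at the fused QKV projection used by GPT-NeoX-style models such as Pythia. Below is a minimal sketch of the equivalent `peft` setup; the rank and alpha values are assumptions, since only the fields in this hunk are visible in the diff.

```python
# Minimal sketch of the LoRA setup implied by the config hunk above.
# r and lora_alpha are assumed values -- they are not visible in this diff.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-410m-deduped")

lora_config = LoraConfig(
    target_modules=["query_key_value"],  # fused QKV projection in GPT-NeoX/Pythia
    r=8,                                 # assumed, not shown in this hunk
    lora_alpha=16,                       # assumed, not shown in this hunk
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```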
@@ -79,12 +79,12 @@ weight_decay: 0.1
 
 </details><br>
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/qwfaghog)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/0tu4r381)
 # taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4
 
 This model is a fine-tuned version of [EleutherAI/pythia-410m-deduped](https://huggingface.co/EleutherAI/pythia-410m-deduped) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2651
+- Loss: 2.2826
 
 ## Model description
 
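Since the card names both the base model and the adapter, a hedged usage sketch follows. The adapter repo id is an assumption inferred from the commit author and run name; substitute the actual Hub path.

```python
# Hypothetical usage sketch: apply the trained LoRA adapter to the base model.
# The adapter repo id below is an assumption, not taken from the card.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-410m-deduped")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-410m-deduped")

model = PeftModel.from_pretrained(
    base,
    "FatCat87/taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4",  # assumed repo id
)
```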
@@ -106,7 +106,7 @@ The following hyperparameters were used during training:
 - learning_rate: 1e-05
 - train_batch_size: 4
 - eval_batch_size: 4
-- seed: 77054
+- seed: 84664
 - distributed_type: multi-GPU
 - num_devices: 2
 - total_train_batch_size: 8
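
The batch-size figures in this hunk are internally consistent: 4 examples per device across 2 GPUs gives the total train batch size of 8, which implies a gradient accumulation factor of 1 (the accumulation setting itself is not shown in the diff).

```python
# Sanity check on the batch arithmetic above. gradient_accumulation_steps = 1
# is inferred from 4 * 2 * 1 == 8; it does not appear in this hunk.
train_batch_size = 4              # per device
num_devices = 2
gradient_accumulation_steps = 1   # inferred, not stated

assert train_batch_size * num_devices * gradient_accumulation_steps == 8
```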
@@ -120,22 +120,22 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 3.0118        | 0.0006 | 1    | 2.8035          |
-| 2.7361        | 0.2505 | 414  | 2.5480          |
-| 2.4912        | 0.5009 | 828  | 2.4415          |
-| 3.2917        | 0.7514 | 1242 | 2.3872          |
-| 2.9411        | 1.0018 | 1656 | 2.3535          |
-| 2.6596        | 1.2523 | 2070 | 2.3288          |
-| 2.0685        | 1.5027 | 2484 | 2.3111          |
-| 2.1828        | 1.7532 | 2898 | 2.2993          |
-| 2.1507        | 2.0036 | 3312 | 2.2905          |
-| 2.6897        | 2.2541 | 3726 | 2.2807          |
-| 2.5161        | 2.5045 | 4140 | 2.2778          |
-| 2.5809        | 2.7550 | 4554 | 2.2725          |
-| 2.7309        | 3.0054 | 4968 | 2.2687          |
-| 2.3226        | 3.2559 | 5382 | 2.2675          |
-| 2.7654        | 3.5064 | 5796 | 2.2657          |
-| 2.0191        | 3.7568 | 6210 | 2.2651          |
+| 2.7173        | 0.0006 | 1    | 2.8035          |
+| 2.3948        | 0.2505 | 414  | 2.5510          |
+| 2.6987        | 0.5009 | 828  | 2.4507          |
+| 2.2231        | 0.7514 | 1242 | 2.3969          |
+| 2.4612        | 1.0018 | 1656 | 2.3698          |
+| 2.9173        | 1.2523 | 2070 | 2.3450          |
+| 2.3121        | 1.5027 | 2484 | 2.3282          |
+| 2.8931        | 1.7532 | 2898 | 2.3154          |
+| 2.0185        | 2.0036 | 3312 | 2.3080          |
+| 2.2114        | 2.2541 | 3726 | 2.2980          |
+| 2.4148        | 2.5045 | 4140 | 2.2941          |
+| 2.2134        | 2.7550 | 4554 | 2.2887          |
+| 1.5517        | 3.0054 | 4968 | 2.2839          |
+| 2.2136        | 3.2559 | 5382 | 2.2811          |
+| 1.2004        | 3.5064 | 5796 | 2.2838          |
+| 2.374         | 3.7568 | 6210 | 2.2826          |
 
 
 ### Framework versions
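
Reading the new table: evaluation runs every 414 steps (0.2505 epochs), so one epoch is roughly 1653 optimizer steps, and with a total train batch size of 8 that suggests a training set of about 13,200 examples. This is an inference from the logged steps and epochs, not a figure stated in the card.

```python
# Rough dataset-size estimate from the results table above
# (an inference, not a number stated anywhere in the card).
steps_per_eval = 414
epochs_per_eval = 0.2505
total_train_batch_size = 8

steps_per_epoch = steps_per_eval / epochs_per_eval     # ~1652.7
examples = steps_per_epoch * total_train_batch_size    # ~13222
print(round(steps_per_epoch), round(examples))
```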
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0c2f625b7ea28763e089a48b202a0a393167ba29b6f9cd37b16d39d9c6af6715
+oid sha256:0b3c41bd57bc3c59ce309650c5b4eadf02e2e87c16aa5d4290355d83debe88b4
 size 6309438
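
The `adapter_model.bin` change is a Git LFS pointer update: the repo stores only this small text stub, and the actual weights file is content-addressed by its SHA-256 (the size is unchanged at 6,309,438 bytes; only the hash moved). A sketch of recomputing the oid for a locally downloaded copy:

```python
# Recompute the Git LFS oid (plain SHA-256 of the file contents) for a
# locally downloaded adapter_model.bin.
import hashlib

def lfs_oid(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Expected after this commit:
# 0b3c41bd57bc3c59ce309650c5b4eadf02e2e87c16aa5d4290355d83debe88b4
print(lfs_oid("adapter_model.bin"))
```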