End of training

Files changed:
- README.md (+21 -21)
- adapter_model.bin (+1 -1)

README.md
```diff
@@ -57,9 +57,9 @@ lora_target_modules:
 - query_key_value
 micro_batch_size: 4
 num_epochs: 4
-output_dir: ./outputs/lora-alpaca-pythia
+output_dir: ./outputs/lora-alpaca-pythia/taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4
 resume_from_checkpoint: null
-seed:
+seed: 84664
 sequence_len: 512
 special_tokens:
   pad_token: <|endoftext|>
```
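The hunk above is axolotl-style YAML; `query_key_value` is the fused attention projection in Pythia's GPT-NeoX blocks. As a minimal sketch of the equivalent `peft` setup (the rank, alpha, and dropout values are illustrative assumptions, since this hunk does not show them):

```python
# Minimal sketch of the LoRA setup implied by the config above.
# r, lora_alpha and lora_dropout are assumed values; the diff only
# shows the target module, not the rank settings.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-410m-deduped")

lora_cfg = LoraConfig(
    task_type="CAUSAL_LM",
    target_modules=["query_key_value"],  # from lora_target_modules above
    r=8,                                 # assumed
    lora_alpha=16,                       # assumed
    lora_dropout=0.05,                   # assumed
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()
```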
```diff
@@ -79,12 +79,12 @@ weight_decay: 0.1
 
 </details><br>
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/0tu4r381)
 # taopanda-2_bcc7097d-6c73-48e3-aaee-f9f854afb9b4
 
 This model is a fine-tuned version of [EleutherAI/pythia-410m-deduped](https://huggingface.co/EleutherAI/pythia-410m-deduped) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.
+- Loss: 2.2826
 
 ## Model description
 
```
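To run the adapter, the standard `peft` loading pattern should apply. A sketch, with a placeholder repo id, since the Hub id of this adapter is not stated in the card:

```python
# Inference sketch. "your-username/your-adapter-repo" is a placeholder;
# the actual Hub repo id of this adapter is not given in the card.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-410m-deduped")
model = PeftModel.from_pretrained(base, "your-username/your-adapter-repo")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-410m-deduped")

inputs = tokenizer("Below is an instruction that describes a task.", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```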
```diff
@@ -106,7 +106,7 @@ The following hyperparameters were used during training:
 - learning_rate: 1e-05
 - train_batch_size: 4
 - eval_batch_size: 4
-- seed:
+- seed: 84664
 - distributed_type: multi-GPU
 - num_devices: 2
 - total_train_batch_size: 8
```
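The per-device values above map directly onto `transformers.TrainingArguments`; `total_train_batch_size` is derived, 4 per device × 2 GPUs = 8, with no gradient accumulation listed. A sketch under that assumption:

```python
# Sketch: the listed hyperparameters expressed as TrainingArguments.
# total_train_batch_size (8) is derived: 4 per device x 2 GPUs.
# output_dir is illustrative; num_train_epochs comes from the config hunk.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./outputs/lora-alpaca-pythia",  # illustrative
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=84664,
    num_train_epochs=4,
)
```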
```diff
@@ -120,22 +120,22 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-
-| 2.
-| 2.
-
-| 2.
-| 2.
-| 2.
-| 2.
-| 2.
-| 2.
-| 2.
-| 2.
-
-| 2.
-
-| 2.
+| 2.7173        | 0.0006 | 1    | 2.8035          |
+| 2.3948        | 0.2505 | 414  | 2.5510          |
+| 2.6987        | 0.5009 | 828  | 2.4507          |
+| 2.2231        | 0.7514 | 1242 | 2.3969          |
+| 2.4612        | 1.0018 | 1656 | 2.3698          |
+| 2.9173        | 1.2523 | 2070 | 2.3450          |
+| 2.3121        | 1.5027 | 2484 | 2.3282          |
+| 2.8931        | 1.7532 | 2898 | 2.3154          |
+| 2.0185        | 2.0036 | 3312 | 2.3080          |
+| 2.2114        | 2.2541 | 3726 | 2.2980          |
+| 2.4148        | 2.5045 | 4140 | 2.2941          |
+| 2.2134        | 2.7550 | 4554 | 2.2887          |
+| 1.5517        | 3.0054 | 4968 | 2.2839          |
+| 2.2136        | 3.2559 | 5382 | 2.2811          |
+| 1.2004        | 3.5064 | 5796 | 2.2838          |
+| 2.374         | 3.7568 | 6210 | 2.2826          |
 
 
 ### Framework versions
```
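Assuming the losses are mean token-level cross-entropy in nats (the usual `transformers` convention), the final validation loss of 2.2826 corresponds to a perplexity of roughly 9.8:

```python
# Perplexity = exp(mean cross-entropy); sanity check on the final
# validation loss from the table above.
import math

final_eval_loss = 2.2826
print(f"perplexity ~= {math.exp(final_eval_loss):.2f}")  # ~= 9.80
```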
adapter_model.bin
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0b3c41bd57bc3c59ce309650c5b4eadf02e2e87c16aa5d4290355d83debe88b4
 size 6309438
```
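The binary itself is stored in Git LFS; the pointer file above records only the blob's SHA-256 and byte size. A small sketch for verifying a downloaded `adapter_model.bin` against the pointer:

```python
# Verify a downloaded adapter_model.bin against the LFS pointer's
# oid (SHA-256) and size fields shown above.
import hashlib
import os

EXPECTED_OID = "0b3c41bd57bc3c59ce309650c5b4eadf02e2e87c16aa5d4290355d83debe88b4"
EXPECTED_SIZE = 6309438

path = "adapter_model.bin"
assert os.path.getsize(path) == EXPECTED_SIZE, "size mismatch"

digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)
assert digest.hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("adapter_model.bin matches the LFS pointer")
```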