Saving weights and log at step 800000
- README.md +1 -1
- flax_model.msgpack +1 -1
- opt_state.msgpack +1 -1
- pytorch_model.bin +1 -1
- training_state.json +1 -1
README.md
CHANGED
@@ -27,7 +27,7 @@ Tokenizer:
 Training details:
 
 * Training started on step 360K (bs 16) ppl 21 of earlier model trained with Adam optimizer.
-* Training at step
+* Training at step 800K of 2M (38%) ppl 15,3
 * Block size: 512
 * Optimizer: adafactor
 * Learning rate: 3.3e-5
flax_model.msgpack
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:154b7b3ce133740107ea7a047d2925f56384b902ca711d54565edbed32eaecf0
 size 3096134690
opt_state.msgpack
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:4c2f2374b237f8ede160a1917f986b87c9c4c5f42a8bb0cecd5e629fadb70611
 size 5611008
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8b83d532814eda9eb5e2d22190b579ed962180d7876ae12912d2646cc5a7ea8d
 size 3134045897
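The three weight files above are stored as Git LFS pointers: a `version` line, an `oid sha256:` digest, and a `size` in bytes. A minimal sketch of how a downloaded checkpoint could be checked against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of this repository; the pointer text is copied from the `pytorch_model.bin` entry in this commit.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file has the size and SHA-256 digest recorded in the pointer."""
    p = Path(path)
    # Cheap check first: a size mismatch avoids hashing a multi-gigabyte file.
    if p.stat().st_size != pointer["size"]:
        return False
    h = hashlib.sha256()
    with p.open("rb") as f:
        # Hash in 1 MiB chunks so the whole file is never held in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# Pointer contents as shown for pytorch_model.bin in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:8b83d532814eda9eb5e2d22190b579ed962180d7876ae12912d2646cc5a7ea8d
size 3134045897
"""
pointer = parse_lfs_pointer(POINTER)
```

Assuming the checkpoint has been fetched to a local path, `matches_pointer("pytorch_model.bin", pointer)` would report whether the local copy corresponds to this commit's step-800000 weights.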
training_state.json
CHANGED
@@ -1 +1 @@
-{"step":
+{"step": 800001}