Update README.md
--- a/README.md
+++ b/README.md
@@ -25,7 +25,7 @@ license: cc-by-nc-4.0
 
 ### Pre-training
 
-Pre-training took
+Pre-training took about 49K GPU hours (NVIDIA A100). Related settings are listed below.
 
 | Params | Global batch size\* | Initial learning rate | Train iter.\* | Max length\* | Weight decay |
 | -- | -- | -- | -- | -- | -- |