upd run_3/readme.md
run_3/readme.md CHANGED (+4 -0)
@@ -17,6 +17,10 @@ Checkpoint used: checkpoint-12000
 ## Advice
 * I guess we need to use warmup when resuming training with an LR higher than the last LR of the previous run
 * need to set the number of steps > 6000, because the model improved WER very slowly
+* probably need to load `optimizer.pt` and `scaler.pt` from the checkpoint before resuming training.
+  otherwise, I guess, we
+  * reinitialize the optimizer and lose the history of parameter momentum (exponentially weighted average)
+  * scale the loss incorrectly
 * can use the original Mozilla Common Voice dataset instead of the HuggingFace one.<br>
   the reason is that the original contains multiple recordings of the same sentence,
   so there is at least twice as much data.<br>
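
The warmup advice in the first bullet could look roughly like the sketch below. It assumes a plain linear ramp from the last LR of the previous run up to the new, higher LR; the README does not say which schedule to use, and the function name `warmup_lr` and all numeric values are hypothetical, chosen only for illustration.

```python
def warmup_lr(step, start_lr, target_lr, warmup_steps):
    """Linearly ramp the learning rate from start_lr to target_lr.

    step         -- number of optimizer steps taken since resuming (0-based)
    start_lr     -- last LR of the previous run (assumed known)
    target_lr    -- the new, higher LR for the resumed run
    warmup_steps -- how many steps the ramp should take
    """
    # No warmup requested, or warmup already finished: use the target LR.
    if warmup_steps <= 0 or step >= warmup_steps:
        return target_lr
    # Before the first step, stay at the LR the previous run ended with.
    if step <= 0:
        return start_lr
    # Linear interpolation between the two learning rates.
    fraction = step / warmup_steps
    return start_lr + fraction * (target_lr - start_lr)


if __name__ == "__main__":
    # Hypothetical values: previous run ended at 1e-5, resumed run targets 1e-4.
    for step in (0, 100, 250, 500, 1000):
        print(step, warmup_lr(step, start_lr=1e-5, target_lr=1e-4, warmup_steps=500))
```

Starting the ramp at the previous run's last LR, rather than at zero, avoids a sudden jump in step size right after resuming. This matters most when the optimizer state was not restored from the checkpoint.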