question about training updates
Hello, thanks for releasing more training details of DeepScaleR! However, I have a small question: why does your first-stage checkpoint show that it was trained for 560 steps rather than 1040 steps, as indicated by the author (https://github.com/agentica-project/deepscaler#:~:text=At%20step%201040%20and%201520%2C%20the%20context%20length%20is%20extended%20to%2016K%20and%2024K.)?
I am also reproducing the DeepScaleR results and have trained for 680 steps in the first stage (the checkpoint directory shows global_step_680). Training is still running, and no training log indicates how many updates are planned in total. I wonder whether I have to stop the run manually, and if so, when I should do it.
Mainly because I found that the test score at this step was higher (it is a turning point: the minimum and average response lengths start to reverse at this step and then keep growing), so I chose the step-560 checkpoint to continue training.
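The selection rule described above (pick the step where the response-length curve turns around and the test score is highest) can be sketched roughly as follows. This is a minimal illustration with toy numbers, not DeepScaleR's actual metrics or log format; the function name and data are assumptions.

```python
# Hypothetical sketch: among steps where the mean response length starts to
# increase again (the "turning point"), pick the one with the best test score.
# The metric lists below are toy values, not real DeepScaleR training logs.

def pick_checkpoint(steps, scores, mean_lens):
    """Return the step at the response-length turning point with the best score."""
    # Indices where mean response length grows relative to the previous step.
    turning = [i for i in range(1, len(mean_lens)) if mean_lens[i] > mean_lens[i - 1]]
    candidates = turning if turning else range(len(steps))
    # Among those candidates, take the step with the highest test score.
    best = max(candidates, key=lambda i: scores[i])
    return steps[best]

# Illustrative numbers only.
steps = [400, 480, 560, 640, 680]
scores = [0.30, 0.31, 0.35, 0.34, 0.33]
mean_lens = [5200, 5000, 5100, 5400, 5600]
print(pick_checkpoint(steps, scores, mean_lens))  # → 560
```

With these toy values the length curve bottoms out at step 480 and starts growing at 560, which also carries the best score, matching the selection described above.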