## model checkpoints

warmup 2%, lr_peak: 10%

- 13000_steps.pth
- 26000_steps.pth
- 39000_steps.pth
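
The checkpoint names above appear to encode the training step as `<steps>_steps.pth`. As a minimal sketch under that assumption, the step count can be read from a file name and used to order the checkpoints or pick the most recent one. The helper names `checkpoint_step` and `latest_checkpoint` are hypothetical and not part of this repository.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from the listed file names: "<steps>_steps.pth".
CHECKPOINT_PATTERN = re.compile(r"^(\d+)_steps\.pth$")


def checkpoint_step(filename: str) -> int:
    """Return the training step encoded in a checkpoint file name."""
    match = CHECKPOINT_PATTERN.match(Path(filename).name)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {filename!r}")
    return int(match.group(1))


def latest_checkpoint(filenames):
    """Return the file name with the highest step count."""
    return max(filenames, key=checkpoint_step)


# The three checkpoints listed above, deliberately out of order.
checkpoints = ["26000_steps.pth", "13000_steps.pth", "39000_steps.pth"]

print(sorted(checkpoints, key=checkpoint_step))
print(latest_checkpoint(checkpoints))
```

Sorting by the parsed integer rather than by the raw string keeps the order correct if later checkpoints have a different number of digits (for example `9000_steps.pth` next to `13000_steps.pth`).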