End of training
610ccf6
-
attn_norm=layernorm_teacher_only, attn_projector=mlp, attn_weight=5, learning_rate=0.0002, per_device_train_batch_size=16, warmup_ratio=0
Training in progress, step 61875
-
attn_norm=layernorm_teacher_only, attn_projector=orthogonal, attn_weight=5, learning_rate=0.0002, per_device_train_batch_size=16, warmup_ratio=0
Training in progress, step 61875
-
attn_norm=layernorm_teacher_only_affine, attn_projector=mlp, attn_weight=5, learning_rate=0.0002, per_device_train_batch_size=16, warmup_ratio=0
End of training
-
attn_norm=layernorm_teacher_only_affine, attn_projector=orthogonal, attn_weight=5, learning_rate=0.0002, per_device_train_batch_size=16, warmup_ratio=0
End of training