pretrain_gemma_base_epoch_3 / generation_config.json
pretrain qwen_base 3 epoch. Loss 0.636
4aa100c verified
{
  "_from_model_config": true,
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "eos_token_id": 1,
  "max_length": 8192,
  "pad_token_id": 0,
  "transformers_version": "4.49.0"
}