Built with Axolotl

0df5ca9e-02a5-4973-a1ba-cb76e52979a4

This model is a fine-tuned version of HuggingFaceH4/tiny-random-LlamaForCausalLM on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 10.3384

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto code follows the list):

  • learning_rate: 0.000201
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (bitsandbytes 8-bit, adamw_bnb_8bit) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 500
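
The hyperparameters above map onto Hugging Face TrainingArguments roughly as follows. This is a minimal sketch for orientation, not the actual Axolotl config used for this run; the output_dir is a hypothetical placeholder.

```python
from transformers import TrainingArguments

# Minimal sketch assuming a standard Trainer setup; values are taken
# from the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="outputs",           # hypothetical placeholder path
    learning_rate=0.000201,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # 4 per device x 2 steps = total train batch size 8
    optim="adamw_bnb_8bit",         # bitsandbytes 8-bit AdamW
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=500,
)
```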

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.0002 | 1    | 10.3744         |
| 10.3462       | 0.0082 | 50   | 10.3596         |
| 10.3334       | 0.0164 | 100  | 10.3472         |
| 10.3313       | 0.0245 | 150  | 10.3442         |
| 10.3314       | 0.0327 | 200  | 10.3415         |
| 10.332        | 0.0409 | 250  | 10.3401         |
| 10.3325       | 0.0491 | 300  | 10.3391         |
| 10.3273       | 0.0572 | 350  | 10.3388         |
| 10.3253       | 0.0654 | 400  | 10.3385         |
| 10.3213       | 0.0736 | 450  | 10.3385         |
| 10.3227       | 0.0818 | 500  | 10.3384         |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1

Model tree for lesso01/0df5ca9e-02a5-4973-a1ba-cb76e52979a4

  • Adapter of HuggingFaceH4/tiny-random-LlamaForCausalLM
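
Since the framework versions list PEFT and the model tree marks this repo as an adapter of the base model, loading presumably follows the standard peft pattern. A minimal sketch, assuming the repo hosts a standard PEFT adapter checkpoint:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model named in the model tree, then attach this
# repo's adapter on top of it.
base = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/tiny-random-LlamaForCausalLM")
model = PeftModel.from_pretrained(base, "lesso01/0df5ca9e-02a5-4973-a1ba-cb76e52979a4")
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/tiny-random-LlamaForCausalLM")
```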