amharic_text_summarization

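This model is a fine-tuned version of google/mt5-small for Amharic text summarization. The training dataset is not specified. After the final epoch (step 1944), it achieves the following results on the evaluation set:

Loss: 2.1143
Rouge1: 14.4092
Rouge2: 7.9159
RougeL: 14.1994
RougeLsum: 14.1897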

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5.6e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 6
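
As a rough sketch, the hyperparameters above could be expressed as keyword arguments for `transformers.Seq2SeqTrainingArguments`. The mapping of names (for example `train_batch_size` to `per_device_train_batch_size`), the single-device assumption, and the `output_dir` value are assumptions made for illustration; the card does not show the original training script.

```python
# Hyperparameters copied from the list above, keyed by the parameter
# names that transformers.Seq2SeqTrainingArguments accepts.
TRAINING_KWARGS = {
    "learning_rate": 5.6e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 6,
}

# Assumption: one device, since 8 * 4 * 1 = 32 matches total_train_batch_size.
NUM_DEVICES = 1


def effective_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Number of examples that contribute to one optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir="mt5-small-amharic-summarization"):
    """Build the arguments object; output_dir is a hypothetical path."""
    # Imported lazily so the rest of the sketch runs without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


print("effective batch size:", effective_batch_size(TRAINING_KWARGS))
```

Calling `build_training_args()` requires `transformers` and a PyTorch backend to be installed; the resulting object would then be passed to a `Seq2SeqTrainer` together with the model and datasets.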

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	RougeL	RougeLsum
No log	1.0	324	2.3183	13.4527	7.2905	13.3087	13.3061
No log	2.0	648	2.1940	13.6905	7.4703	13.5381	13.5183
No log	3.0	972	2.1724	13.8811	7.5513	13.7229	13.7019
11.0153	4.0	1296	2.1444	14.1353	7.7502	13.9441	13.9035
11.0153	5.0	1620	2.1257	14.2967	7.8073	14.0971	14.085
11.0153	6.0	1944	2.1143	14.4092	7.9159	14.1994	14.1897
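
The step counts in the table (324 optimizer steps per epoch, 1944 in total) together with the linear scheduler determine the learning rate at each evaluation point. The sketch below reproduces that schedule under the assumption of zero warmup steps, since the card lists no warmup setting; `linear_lr` is a helper written for this illustration, not a function from the training code.

```python
# Values copied from the card.
BASE_LR = 5.6e-05
STEPS_PER_EPOCH = 324
NUM_EPOCHS = 6
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # matches the final Step column


def linear_lr(step, warmup_steps=0):
    """Learning rate with linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not state a warmup.
    """
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


# Learning rate at the end of each epoch, i.e. at each evaluation row.
for epoch in range(1, NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.3e}")
```

Under this assumption the learning rate has fallen to half its initial value by the end of epoch 3 and reaches zero at step 1944, which is consistent with the shrinking per-epoch gains in validation loss and ROUGE.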