---
license: mit
base_model: facebook/bart-large-xsum
tags:
- generated_from_trainer
metrics:
- rouge
- bleu
model-index:
- name: bart_samsum
  results: []
---
# bart_samsum

This model is a fine-tuned version of [facebook/bart-large-xsum](https://huggingface.co/facebook/bart-large-xsum) on the SAMSum dataset. It achieves the following results on the evaluation set:
- Loss: 1.4947
- Rouge1: 53.3294
- Rouge2: 28.6009
- Rougel: 44.2008
- Rougelsum: 49.2031
- Bleu: 0.0
- Meteor: 0.4887
- Gen Len: 30.1209
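
The checkpoint can be used with the standard `transformers` summarization pipeline. A minimal sketch follows; the model id `your-username/bart_samsum` is a placeholder for wherever this checkpoint is hosted, and the dialogue text is illustrative only:

```python
# Minimal usage sketch for dialogue summarization with the transformers pipeline.
# "your-username/bart_samsum" is a placeholder model id, not a confirmed Hub path.
from transformers import pipeline

summarizer = pipeline("summarization", model="your-username/bart_samsum")

dialogue = (
    "Amanda: I baked cookies. Do you want some?\n"
    "Jerry: Sure!\n"
    "Amanda: I'll bring you tomorrow :-)"
)

# Returns a list of dicts; "summary_text" holds the generated summary.
print(summarizer(dialogue, max_length=60, min_length=10, do_sample=False)[0]["summary_text"])
```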
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
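
Expressed as a `Seq2SeqTrainingArguments` configuration, these settings look roughly as follows. This is a hedged sketch: the output directory and generation-related flags are assumptions, not taken from the card, and the Adam betas/epsilon listed above are the library defaults so they are not set explicitly.

```python
# Hedged sketch: the reported hyperparameters expressed as Seq2SeqTrainingArguments.
# output_dir and predict_with_generate are illustrative assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart_samsum",          # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,     # yields the effective train batch size of 8
    num_train_epochs=5,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                         # "Native AMP" mixed-precision training
    predict_with_generate=True,        # assumed, needed for ROUGE/BLEU evaluation
)
```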
### Training results
| Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Bleu | Meteor | Gen Len |
|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:----:|:------:|:-------:|
| 1.3838        | 0.9997 | 1841 | 1.5631          | 52.3252 | 27.2646 | 42.5893 | 48.2397   | 0.0  | 0.4825 | 32.0415 |
| 1.0835        | 2.0    | 3683 | 1.4947          | 53.3294 | 28.6009 | 44.2008 | 49.2031   | 0.0  | 0.4887 | 30.1209 |
| 0.8345        | 2.9997 | 5524 | 1.5956          | 52.1812 | 27.1239 | 42.9864 | 47.6384   | 0.0  | 0.4774 | 30.5446 |
| 0.672         | 4.0    | 7366 | 1.6695          | 52.8148 | 27.4815 | 43.3732 | 48.4633   | 0.0  | 0.4836 | 31.0342 |
| 0.538         | 4.9986 | 9205 | 1.8055          | 52.0988 | 26.762  | 42.5505 | 47.3721   | 0.0  | 0.4738 | 29.8901 |
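
Scores like the ROUGE and METEOR columns above can be computed offline with the `evaluate` library. A minimal sketch on a single prediction/reference pair; the texts are illustrative, not drawn from the SAMSum evaluation set:

```python
# Minimal sketch of scoring generated summaries with the `evaluate` library.
# The prediction/reference pair below is illustrative only.
import evaluate

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")

predictions = ["Amanda baked cookies and will bring Jerry some tomorrow."]
references = ["Amanda baked cookies and will bring some to Jerry tomorrow."]

print(rouge.compute(predictions=predictions, references=references))
print(meteor.compute(predictions=predictions, references=references))
```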
### Framework versions
- Transformers 4.40.0
- Pytorch 2.2.1+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1