metadata
license: mit
base_model: facebook/bart-large-xsum
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: text_shortening_model_v34
    results: []

text_shortening_model_v34

This model is a fine-tuned version of facebook/bart-large-xsum on an unspecified dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the list):

  • Loss: 2.7697
  • Rouge1: 0.4731
  • Rouge2: 0.253
  • Rougel: 0.4166
  • Rougelsum: 0.416
  • Bert precision: 0.8697
  • Bert recall: 0.8697
  • Average word count: 8.7087
  • Max word count: 17
  • Min word count: 5
  • Average token count: 16.3093
  • % shortened texts with length > 12: 6.6066
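
The card reports only automatic metrics; for qualitative inspection, the fine-tuned checkpoint can be loaded like any other BART seq2seq model. The sketch below assumes the weights are published on the Hub under `ldos/text_shortening_model_v34` (the repo id is inferred from the card name, so substitute the actual path if it differs):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed Hub repo id, inferred from the card name; replace with the actual path.
model_id = "ldos/text_shortening_model_v34"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "This is an example sentence that is somewhat too long and should be shortened."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Beam search with a small generation budget, in line with the short outputs reported above.
output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```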

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
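
For reference, the listed values map onto `Seq2SeqTrainingArguments` roughly as below. This is a sketch only: the card does not include the training script, so the output directory, the per-device reading of the batch sizes, the per-epoch evaluation strategy, and `predict_with_generate` are assumptions; the Adam betas and epsilon match the library defaults.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="text_shortening_model_v34",  # assumed output directory
    learning_rate=3e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="epoch",             # assumed from the per-epoch results table
    predict_with_generate=True,              # assumed, needed to compute ROUGE during evaluation
)
```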

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bert precision | Bert recall | Average word count | Max word count | Min word count | Average token count | % shortened texts with length > 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2.4675 | 1.0 | 19 | 3.1777 | 0.4029 | 0.1769 | 0.3503 | 0.3498 | 0.8509 | 0.857 | 9.6577 | 17 | 5 | 15.4324 | 10.2102 |
| 1.1669 | 2.0 | 38 | 1.9224 | 0.4506 | 0.2396 | 0.4184 | 0.4181 | 0.864 | 0.8688 | 8.6306 | 15 | 5 | 14.2613 | 4.2042 |
| 0.9292 | 3.0 | 57 | 1.7461 | 0.4654 | 0.2556 | 0.4186 | 0.419 | 0.8654 | 0.8722 | 9.0751 | 17 | 5 | 14.9099 | 4.2042 |
| 0.7876 | 4.0 | 76 | 1.9057 | 0.4003 | 0.207 | 0.367 | 0.366 | 0.8539 | 0.8516 | 8.1021 | 13 | 5 | 16.2883 | 1.2012 |
| 0.5976 | 5.0 | 95 | 1.7603 | 0.4776 | 0.2636 | 0.4254 | 0.4248 | 0.8659 | 0.8754 | 9.1952 | 16 | 5 | 15.0961 | 6.006 |
| 0.469 | 6.0 | 114 | 2.1107 | 0.4675 | 0.2542 | 0.4077 | 0.4081 | 0.856 | 0.8776 | 11.1802 | 20 | 5 | 18.4505 | 31.5315 |
| 0.4291 | 7.0 | 133 | 1.7980 | 0.4701 | 0.2509 | 0.4202 | 0.4195 | 0.8647 | 0.8723 | 9.1832 | 15 | 5 | 14.7267 | 6.3063 |
| 0.3673 | 8.0 | 152 | 1.9170 | 0.4669 | 0.2574 | 0.4188 | 0.4187 | 0.8678 | 0.8698 | 8.6306 | 18 | 5 | 14.3093 | 3.9039 |
| 0.3432 | 9.0 | 171 | 2.0268 | 0.4804 | 0.2691 | 0.4254 | 0.4249 | 0.8682 | 0.8753 | 9.2402 | 18 | 5 | 14.6847 | 9.3093 |
| 0.3094 | 10.0 | 190 | 2.1107 | 0.4809 | 0.2724 | 0.4353 | 0.4337 | 0.8689 | 0.8739 | 9.2883 | 17 | 4 | 16.2162 | 9.009 |
| 0.4402 | 11.0 | 209 | 2.2507 | 0.4816 | 0.268 | 0.428 | 0.4278 | 0.8668 | 0.8743 | 9.4805 | 18 | 4 | 16.6126 | 10.8108 |
| 0.3691 | 12.0 | 228 | 2.1652 | 0.4784 | 0.2637 | 0.4286 | 0.4277 | 0.8683 | 0.8714 | 8.7988 | 15 | 5 | 14.5105 | 6.006 |
| 0.1853 | 13.0 | 247 | 2.3660 | 0.4705 | 0.259 | 0.4119 | 0.4115 | 0.8686 | 0.8695 | 8.7898 | 17 | 5 | 16.2432 | 6.6066 |
| 0.3186 | 14.0 | 266 | 2.3237 | 0.4817 | 0.27 | 0.4273 | 0.4271 | 0.8698 | 0.8738 | 8.973 | 17 | 5 | 16.5976 | 9.3093 |
| 0.1745 | 15.0 | 285 | 2.2675 | 0.4672 | 0.2577 | 0.4177 | 0.4165 | 0.8698 | 0.8694 | 8.6066 | 16 | 5 | 14.7117 | 3.9039 |
| 0.1304 | 16.0 | 304 | 2.5157 | 0.4726 | 0.253 | 0.418 | 0.4167 | 0.8691 | 0.8688 | 8.6517 | 17 | 4 | 15.8468 | 3.9039 |
| 0.1432 | 17.0 | 323 | 2.4798 | 0.4744 | 0.2614 | 0.4204 | 0.4196 | 0.869 | 0.8725 | 8.9189 | 17 | 5 | 15.5015 | 6.006 |
| 0.1116 | 18.0 | 342 | 2.5924 | 0.4772 | 0.2589 | 0.4222 | 0.4221 | 0.87 | 0.8717 | 8.7508 | 17 | 5 | 15.6096 | 6.9069 |
| 0.0921 | 19.0 | 361 | 2.6547 | 0.4733 | 0.2541 | 0.4205 | 0.4199 | 0.8694 | 0.8694 | 8.6787 | 16 | 5 | 15.4204 | 6.006 |
| 0.0679 | 20.0 | 380 | 2.7697 | 0.4731 | 0.253 | 0.4166 | 0.416 | 0.8697 | 0.8697 | 8.7087 | 17 | 5 | 16.3093 | 6.6066 |
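
The card does not include the evaluation code behind these columns. The sketch below shows one plausible way to reproduce them with the `evaluate` library; the `rouge` and `bertscore` metric ids are real, but the aggregation of the word-count and length-threshold columns is an assumption:

```python
import evaluate

# Hypothetical reconstruction of the metrics above; placeholder data, not the actual eval set.
rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

predictions = ["shortened text one", "shortened text two"]  # placeholder model outputs
references = ["reference text one", "reference text two"]   # placeholder targets

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions, references=references, lang="en")
word_counts = [len(p.split()) for p in predictions]

print(rouge_scores["rouge1"], rouge_scores["rouge2"], rouge_scores["rougeL"], rouge_scores["rougeLsum"])
print(sum(bert_scores["precision"]) / len(bert_scores["precision"]),
      sum(bert_scores["recall"]) / len(bert_scores["recall"]))
print(sum(word_counts) / len(word_counts), max(word_counts), min(word_counts))
print(100 * sum(c > 12 for c in word_counts) / len(word_counts))  # % shortened texts with length > 12
```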

Framework versions

  • Transformers 4.33.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3