---
license: apache-2.0
base_model: facebook/bart-large
tags:
  - generated_from_trainer
metrics:
  - rouge
  - precision
  - recall
  - f1
model-index:
  - name: LLM_Teached_Bart_From_Scratch
    results: []
---

# LLM_Teached_Bart_From_Scratch

This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows these results):

- Loss: 1.6053
- Rouge1: 0.4481
- Rouge2: 0.2283
- RougeL: 0.3861
- RougeLsum: 0.3863
- Gen Len: 19.9029
- Precision: 0.9159
- Recall: 0.8916
- F1: 0.9034
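
Since the model description below is still pending, here is a minimal inference sketch using the standard `transformers` summarization pipeline. The repo id `GlycerinLOL/LLM_Teached_Bart_From_Scratch` is an assumption inferred from this card's author and model name, and the input text is illustrative only:

```python
from transformers import pipeline

# Repo id inferred from this card; substitute the real Hub path if it differs.
summarizer = pipeline(
    "summarization",
    model="GlycerinLOL/LLM_Teached_Bart_From_Scratch",
)

article = (
    "The James Webb Space Telescope has returned infrared images of some of "
    "the oldest galaxies ever observed, giving astronomers a new window "
    "into the early universe."
)

# Gen Len above averages ~20 tokens, so expect short summaries.
print(summarizer(article, max_length=60, min_length=10)[0]["summary_text"])
```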

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 24
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 96
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 24
- mixed_precision_training: Native AMP
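
For reference, a minimal sketch of how these settings map onto `transformers` training arguments; the output directory and the generation flag are placeholders, not taken from the original run:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters above; effective train batch size is
# 24 (per device) x 4 (gradient accumulation) = 96. Adam betas/epsilon
# match the Trainer defaults, so they need no explicit arguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="llm_teached_bart",   # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,
    num_train_epochs=24,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                       # "Native AMP" mixed precision
    predict_with_generate=True,      # assumption: needed for ROUGE eval
)
```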

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len | Precision | Recall | F1     |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:---------:|:------:|:------:|
| 1.836         | 1.0   | 521   | 1.5560          | 0.4155 | 0.2028 | 0.3561 | 0.3559    | 19.9745 | 0.9105    | 0.8843 | 0.8971 |
| 1.5951        | 2.0   | 1042  | 1.5004          | 0.4333 | 0.2136 | 0.3695 | 0.3694    | 19.9353 | 0.9115    | 0.8886 | 0.8997 |
| 1.469         | 3.0   | 1563  | 1.4691          | 0.4355 | 0.2176 | 0.3729 | 0.3728    | 19.9385 | 0.912     | 0.8888 | 0.9001 |
| 1.373         | 4.0   | 2084  | 1.4658          | 0.4311 | 0.2164 | 0.3706 | 0.3704    | 19.9647 | 0.9137    | 0.8877 | 0.9003 |
| 1.2902        | 5.0   | 2605  | 1.4542          | 0.4368 | 0.2218 | 0.3762 | 0.376     | 19.9498 | 0.9136    | 0.8887 | 0.9008 |
| 1.222         | 6.0   | 3126  | 1.4584          | 0.4407 | 0.223  | 0.3802 | 0.3798    | 19.9425 | 0.914     | 0.8902 | 0.9018 |
| 1.1655        | 7.0   | 3647  | 1.4709          | 0.4404 | 0.2246 | 0.3806 | 0.3803    | 19.9327 | 0.9145    | 0.89   | 0.9019 |
| 1.11          | 8.0   | 4168  | 1.4724          | 0.4435 | 0.2269 | 0.383  | 0.3828    | 19.9084 | 0.9153    | 0.8906 | 0.9026 |
| 1.0629        | 9.0   | 4689  | 1.4853          | 0.4431 | 0.2273 | 0.3832 | 0.383     | 19.928  | 0.9155    | 0.8908 | 0.9028 |
| 1.023         | 10.0  | 5210  | 1.5033          | 0.4409 | 0.2247 | 0.3819 | 0.3818    | 19.944  | 0.9152    | 0.8897 | 0.9021 |
| 0.9862        | 11.0  | 5731  | 1.5074          | 0.4479 | 0.2278 | 0.3862 | 0.386     | 19.9124 | 0.9158    | 0.8916 | 0.9034 |
| 0.957         | 12.0  | 6252  | 1.5184          | 0.4461 | 0.2264 | 0.3846 | 0.3847    | 19.9033 | 0.9159    | 0.8909 | 0.903  |
| 0.9315        | 13.0  | 6773  | 1.5269          | 0.4473 | 0.2284 | 0.386  | 0.3858    | 19.9084 | 0.9156    | 0.8912 | 0.9031 |
| 0.9093        | 14.0  | 7294  | 1.5311          | 0.4453 | 0.2273 | 0.3846 | 0.3843    | 19.9135 | 0.9155    | 0.8909 | 0.9029 |
| 0.8927        | 15.0  | 7815  | 1.5351          | 0.4457 | 0.2267 | 0.3842 | 0.384     | 19.9065 | 0.9156    | 0.8909 | 0.9029 |
| 0.8773        | 16.0  | 8336  | 1.5440          | 0.4427 | 0.225  | 0.382  | 0.382     | 19.9425 | 0.9151    | 0.8905 | 0.9025 |
| 0.8806        | 17.0  | 8857  | 1.5510          | 0.4495 | 0.2279 | 0.3868 | 0.3869    | 19.8851 | 0.9159    | 0.8919 | 0.9036 |
| 0.8683        | 18.0  | 9378  | 1.5679          | 0.4473 | 0.2282 | 0.3856 | 0.3857    | 19.8829 | 0.9161    | 0.8921 | 0.9038 |
| 0.8413        | 19.0  | 9899  | 1.5745          | 0.4492 | 0.2282 | 0.3861 | 0.3864    | 19.9135 | 0.9159    | 0.8918 | 0.9035 |
| 0.8257        | 20.0  | 10420 | 1.5835          | 0.4471 | 0.2266 | 0.3852 | 0.3853    | 19.8996 | 0.9153    | 0.8915 | 0.9031 |
| 0.8097        | 21.0  | 10941 | 1.5957          | 0.4472 | 0.2271 | 0.3856 | 0.3856    | 19.9073 | 0.9156    | 0.8919 | 0.9034 |
| 0.7926        | 22.0  | 11462 | 1.5956          | 0.4479 | 0.2282 | 0.3855 | 0.3857    | 19.892  | 0.9159    | 0.8916 | 0.9034 |
| 0.7841        | 23.0  | 11983 | 1.5990          | 0.4444 | 0.2261 | 0.3833 | 0.3834    | 19.912  | 0.9155    | 0.8908 | 0.9028 |
| 0.7669        | 24.0  | 12504 | 1.6053          | 0.4481 | 0.2283 | 0.3861 | 0.3863    | 19.9029 | 0.9159    | 0.8916 | 0.9034 |
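
The card does not say which metric produced the Precision, Recall, and F1 columns; scores around 0.89-0.92 are typical of BERTScore, so the sketch below pairs ROUGE with BERTScore through the `evaluate` library as one plausible reading, not the confirmed evaluation recipe:

```python
import evaluate
import numpy as np

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")  # assumption: source of P/R/F1

def compute_metrics(predictions, references):
    """Score decoded summaries against references, mirroring the columns above."""
    results = rouge.compute(predictions=predictions, references=references)
    bs = bertscore.compute(predictions=predictions, references=references, lang="en")
    # BERTScore returns per-example lists; average them as the table does.
    results["precision"] = float(np.mean(bs["precision"]))
    results["recall"] = float(np.mean(bs["recall"]))
    results["f1"] = float(np.mean(bs["f1"]))
    return results

print(compute_metrics(
    ["a short model summary"],
    ["a short reference summary"],
))
```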

### Framework versions

- Transformers 4.36.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.5
- Tokenizers 0.15.0