metadata

license: apache-2.0
base_model: facebook/bart-large
tags:
  - generated_from_trainer
metrics:
  - rouge
  - precision
  - recall
  - f1
model-index:
  - name: LLM_Teached_Bart_From_Scratch
    results: []

LLM_Teached_Bart_From_Scratch

This model is a fine-tuned version of facebook/bart-large; the training dataset is not documented. After the final epoch (epoch 24, step 12504) it achieves the following results on the evaluation set:

Loss: 1.6053
Rouge1: 0.4481
Rouge2: 0.2283
Rougel: 0.3861
Rougelsum: 0.3863
Gen Len: 19.9029
Precision: 0.9159
Recall: 0.8916
F1: 0.9034

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 24
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 96
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 24
mixed_precision_training: Native AMP
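
The batch-size and step figures in this card can be cross-checked with a little arithmetic. The sketch below is a minimal illustration, assuming a single training device (the card does not state the device count) and taking the 521 optimizer steps per epoch from the training results table. The helper name `effective_batch_size` is hypothetical, and the dataset-size figure is only a rough estimate because the last batch of an epoch may be partial.

```python
# Values copied from the hyperparameter list and the results table.
train_batch_size = 24
gradient_accumulation_steps = 4
num_epochs = 24
steps_per_epoch = 521  # step count logged at epoch 1.0

NUM_DEVICES = 1  # assumption: the card does not state the device count


def effective_batch_size(batch_size, accumulation_steps, num_devices=1):
    """Number of examples that contribute to one optimizer step."""
    return batch_size * accumulation_steps * num_devices


total_train_batch_size = effective_batch_size(
    train_batch_size, gradient_accumulation_steps, NUM_DEVICES
)
total_steps = steps_per_epoch * num_epochs
# Rough estimate only: the final batch of each epoch may be partial.
approx_train_examples = steps_per_epoch * total_train_batch_size

print(total_train_batch_size)  # 96, matching total_train_batch_size above
print(total_steps)             # 12504, matching the final logged step
print(approx_train_examples)   # 50016, i.e. on the order of 50,000 examples
```

Under the single-device assumption the reported total_train_batch_size of 96 is simply 24 × 4, and 24 epochs of 521 steps give the final step count of 12504.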

Training results

Training Loss	Epoch	Step	F1	Gen Len	Validation Loss	Precision	Recall	Rouge1	Rouge2	Rougel	Rougelsum
1.836	1.0	521	0.8971	19.9745	1.5560	0.9105	0.8843	0.4155	0.2028	0.3561	0.3559
1.5951	2.0	1042	0.8997	19.9353	1.5004	0.9115	0.8886	0.4333	0.2136	0.3695	0.3694
1.469	3.0	1563	0.9001	19.9385	1.4691	0.912	0.8888	0.4355	0.2176	0.3729	0.3728
1.373	4.0	2084	0.9003	19.9647	1.4658	0.9137	0.8877	0.4311	0.2164	0.3706	0.3704
1.2902	5.0	2605	0.9008	19.9498	1.4542	0.9136	0.8887	0.4368	0.2218	0.3762	0.376
1.222	6.0	3126	0.9018	19.9425	1.4584	0.914	0.8902	0.4407	0.223	0.3802	0.3798
1.1655	7.0	3647	0.9019	19.9327	1.4709	0.9145	0.89	0.4404	0.2246	0.3806	0.3803
1.11	8.0	4168	0.9026	19.9084	1.4724	0.9153	0.8906	0.4435	0.2269	0.383	0.3828
1.0629	9.0	4689	0.9028	19.928	1.4853	0.9155	0.8908	0.4431	0.2273	0.3832	0.383
1.023	10.0	5210	0.9021	19.944	1.5033	0.9152	0.8897	0.4409	0.2247	0.3819	0.3818
0.9862	11.0	5731	0.9034	19.9124	1.5074	0.9158	0.8916	0.4479	0.2278	0.3862	0.386
0.957	12.0	6252	0.903	19.9033	1.5184	0.9159	0.8909	0.4461	0.2264	0.3846	0.3847
0.9315	13.0	6773	0.9031	19.9084	1.5269	0.9156	0.8912	0.4473	0.2284	0.386	0.3858
0.9093	14.0	7294	0.9029	19.9135	1.5311	0.9155	0.8909	0.4453	0.2273	0.3846	0.3843
0.8927	15.0	7815	0.9029	19.9065	1.5351	0.9156	0.8909	0.4457	0.2267	0.3842	0.384
0.8773	16.0	8336	0.9025	19.9425	1.5440	0.9151	0.8905	0.4427	0.225	0.382	0.382
0.8806	17.0	8857	0.9036	19.8851	1.5510	0.9159	0.8919	0.4495	0.2279	0.3868	0.3869
0.8683	18.0	9378	0.9038	19.8829	1.5679	0.9161	0.8921	0.4473	0.2282	0.3856	0.3857
0.8413	19.0	9899	0.9035	19.9135	1.5745	0.9159	0.8918	0.4492	0.2282	0.3861	0.3864
0.8257	20.0	10420	0.9031	19.8996	1.5835	0.9153	0.8915	0.4471	0.2266	0.3852	0.3853
0.8097	21.0	10941	0.9034	19.9073	1.5957	0.9156	0.8919	0.4472	0.2271	0.3856	0.3856
0.7926	22.0	11462	0.9034	19.892	1.5956	0.9159	0.8916	0.4479	0.2282	0.3855	0.3857
0.7841	23.0	11983	0.9028	19.912	1.5990	0.9155	0.8908	0.4444	0.2261	0.3833	0.3834
0.7669	24.0	12504	0.9034	19.9029	1.6053	0.9159	0.8916	0.4481	0.2283	0.3861	0.3863
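
The per-epoch results can be scanned programmatically to see where each metric peaks. The sketch below is a minimal illustration: it copies the validation loss and Rouge1 values for each epoch from the table above, identified by metric rather than by column position, and reports the epoch with the lowest validation loss and the epoch with the highest Rouge1. The names `history`, `best_by_loss`, and `best_by_rouge1` are hypothetical.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
history = [
    (1, 1.5560, 0.4155), (2, 1.5004, 0.4333), (3, 1.4691, 0.4355),
    (4, 1.4658, 0.4311), (5, 1.4542, 0.4368), (6, 1.4584, 0.4407),
    (7, 1.4709, 0.4404), (8, 1.4724, 0.4435), (9, 1.4853, 0.4431),
    (10, 1.5033, 0.4409), (11, 1.5074, 0.4479), (12, 1.5184, 0.4461),
    (13, 1.5269, 0.4473), (14, 1.5311, 0.4453), (15, 1.5351, 0.4457),
    (16, 1.5440, 0.4427), (17, 1.5510, 0.4495), (18, 1.5679, 0.4473),
    (19, 1.5745, 0.4492), (20, 1.5835, 0.4471), (21, 1.5957, 0.4472),
    (22, 1.5956, 0.4479), (23, 1.5990, 0.4444), (24, 1.6053, 0.4481),
]

# Lowest validation loss and highest Rouge1 across all epochs.
best_by_loss = min(history, key=lambda row: row[1])
best_by_rouge1 = max(history, key=lambda row: row[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest Rouge1: epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
```

On these numbers the validation loss is lowest at epoch 5 and rises afterwards while the training loss keeps falling, whereas Rouge1 peaks at epoch 17. The reported final-epoch checkpoint is therefore not the best one by either criterion, although the Rouge1 gap to the peak is small (0.4481 versus 0.4495).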

Framework versions

Transformers 4.36.0
Pytorch 2.0.1+cu117
Datasets 2.14.5
Tokenizers 0.15.0