Venkatesh4342 committed on
Commit
211baba
1 Parent(s): ce354c1

Training complete!

Files changed (3)
  1. README.md +92 -0
  2. generation_config.json +13 -0
  3. pytorch_model.bin +1 -1
README.md ADDED
@@ -0,0 +1,92 @@
---
license: mit
base_model: facebook/bart-large-xsum
tags:
- generated_from_trainer
datasets:
- samsum
metrics:
- rouge
model-index:
- name: bart-samsum
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: samsum
      type: samsum
      config: samsum
      split: validation
      args: samsum
    metrics:
    - name: Rouge1
      type: rouge
      value: 0.547
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# bart-samsum

This model is a fine-tuned version of [facebook/bart-large-xsum](https://huggingface.co/facebook/bart-large-xsum) on the samsum dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3852
- Rouge1: 0.547
- Rouge2: 0.2837
- Rougel: 0.4462
- Rougelsum: 0.4454
- Gen Len: 29.72
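The ROUGE scores above measure n-gram overlap between generated and reference summaries. As an illustration only, ROUGE-1 F1 can be sketched in plain Python; the metric actually used during training (via the `rouge` evaluation library) additionally handles stemming and tokenization details:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Toy ROUGE-1 F1: unigram overlap between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # shared unigrams, counted with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")  # 5/6
```

Here 5 of 6 unigrams match in each direction, giving precision = recall = F1 = 5/6 ≈ 0.833; the card's Rouge1 of 0.547 is this quantity averaged over the validation set.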
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
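The `total_train_batch_size` of 8 follows from gradient accumulation: an optimizer step is taken only after every `gradient_accumulation_steps` micro-batches, so the effective batch size is the product of the two settings (single-GPU training is assumed here):

```python
# Effective batch size under gradient accumulation, using the values above.
train_batch_size = 4             # per-device micro-batch size
gradient_accumulation_steps = 2  # micro-batches accumulated per optimizer step
num_devices = 1                  # assumption: a single GPU

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 8, matching total_train_batch_size in the card
```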
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 1.5201 | 0.27 | 500 | 1.4589 | 0.5276 | 0.2694 | 0.4246 | 0.424 | 33.5067 |
| 1.3757 | 0.54 | 1000 | 1.5105 | 0.506 | 0.2566 | 0.415 | 0.4146 | 29.76 |
| 1.3496 | 0.81 | 1500 | 1.4039 | 0.5365 | 0.2759 | 0.4233 | 0.4221 | 29.8 |
| 1.094 | 1.09 | 2000 | 1.4119 | 0.5407 | 0.2827 | 0.4293 | 0.4288 | 29.84 |
| 1.1488 | 1.36 | 2500 | 1.3680 | 0.5275 | 0.2637 | 0.423 | 0.4224 | 26.92 |
| 1.1222 | 1.63 | 3000 | 1.2875 | 0.5369 | 0.2844 | 0.4473 | 0.4463 | 29.2267 |
| 1.1092 | 1.9 | 3500 | 1.3968 | 0.533 | 0.2818 | 0.4354 | 0.4363 | 30.0667 |
| 0.8509 | 2.17 | 4000 | 1.3682 | 0.5306 | 0.2874 | 0.4327 | 0.4331 | 29.1467 |
| 0.9565 | 2.44 | 4500 | 1.3450 | 0.5466 | 0.2782 | 0.4419 | 0.4409 | 29.2133 |
| 0.8496 | 2.72 | 5000 | 1.3768 | 0.5366 | 0.2807 | 0.4359 | 0.4351 | 30.7733 |
| 0.8397 | 2.99 | 5500 | 1.3852 | 0.547 | 0.2837 | 0.4462 | 0.4454 | 29.72 |
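With `lr_scheduler_type: linear` and 500 warmup steps, the learning rate ramps from 0 up to 2e-05 over the first 500 optimizer steps, then decays linearly toward 0. A sketch of that schedule (the total step count of 5514 is an estimate, since the table above ends near step 5500 at roughly epoch 3):

```python
def linear_schedule_lr(step, base_lr=2e-05, warmup_steps=500, total_steps=5514):
    """Linear warmup followed by linear decay to zero.

    total_steps is an assumption inferred from the training-results table,
    which ends near step 5500 at about epoch 3.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps          # warmup ramp
    remaining = (total_steps - step) / (total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)              # linear decay

lr_at_250 = linear_schedule_lr(250)  # halfway through warmup -> 1e-05
```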
### Framework versions

- Transformers 4.33.1
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3
generation_config.json ADDED
@@ -0,0 +1,13 @@
{
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_eos_token_id": 2,
  "max_length": 62,
  "min_length": 11,
  "no_repeat_ngram_size": 3,
  "num_beams": 6,
  "pad_token_id": 1,
  "transformers_version": "4.33.1"
}
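These decoding settings can be mirrored as a plain Python dict to sanity-check their internal consistency (a sketch only; the field names follow the JSON above):

```python
# Decoding settings copied from generation_config.json above.
generation_config = {
    "bos_token_id": 0,
    "decoder_start_token_id": 2,
    "early_stopping": True,
    "eos_token_id": 2,
    "forced_eos_token_id": 2,
    "max_length": 62,
    "min_length": 11,
    "no_repeat_ngram_size": 3,
    "num_beams": 6,
    "pad_token_id": 1,
}

# Summaries are constrained to 11-62 tokens, decoded with 6-way beam
# search, early stopping, and trigram-repetition blocking.
assert generation_config["min_length"] <= generation_config["max_length"]
assert generation_config["num_beams"] >= 1
assert generation_config["no_repeat_ngram_size"] > 0
```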
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c5d1fb21f4868664656626e9065eaf211fcc8bc2fbc8cdbc6de313c95b2c33f9
+oid sha256:f72c5d91c1084b65cd607073356e052606d3f9d08b91267d566b9cd24f59c779
 size 1625537293