404sau404 committed
Commit 7084848
1 Parent(s): c7c8d8e

End of training

Files changed (4):
1. README.md +13 -12
2. config.json +1 -1
3. model.safetensors +1 -1
4. training_args.bin +1 -1
README.md CHANGED
```diff
@@ -17,12 +17,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [facebook/bart-large-xsum](https://huggingface.co/facebook/bart-large-xsum) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4704
-- Rouge1: 54.8232
-- Rouge2: 30.1114
-- Rougel: 45.2666
-- Rougelsum: 50.7533
-- Gen Len: 30.3399
+- Loss: 2.6994
+- Rouge1: 54.5529
+- Rouge2: 30.0179
+- Rougel: 45.3837
+- Rougelsum: 50.4176
+- Gen Len: 28.967
 
 ## Model description
 
@@ -48,23 +48,24 @@ The following hyperparameters were used during training:
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
+- lr_scheduler_type: cosine
 - num_epochs: 4
 - mixed_precision_training: Native AMP
+- label_smoothing_factor: 0.1
 
 ### Training results
 
 | Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
 |:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
-| 1.3807        | 0.9997 | 1841 | 1.5203          | 52.4158 | 27.5034 | 42.8274 | 48.0361   | 31.4664 |
-| 1.077         | 2.0    | 3683 | 1.5038          | 53.5277 | 28.5946 | 44.2315 | 49.5696   | 30.768  |
-| 0.831         | 2.9997 | 5524 | 1.5362          | 52.9008 | 27.7041 | 43.5637 | 48.3921   | 29.9243 |
-| 0.6919        | 3.9989 | 7364 | 1.6272          | 52.8716 | 27.9183 | 43.8019 | 48.6547   | 30.2002 |
+| 2.7327        | 0.9997 | 1841 | 2.7677          | 52.2923 | 27.6237 | 43.1558 | 48.08     | 30.4005 |
+| 2.4597        | 2.0    | 3683 | 2.7286          | 53.4085 | 28.7235 | 44.5737 | 49.3042   | 29.3004 |
+| 2.2042        | 2.9997 | 5524 | 2.7436          | 53.6036 | 28.857  | 44.7337 | 49.2789   | 28.4188 |
+| 2.1096        | 3.9989 | 7364 | 2.7886          | 53.0547 | 28.3597 | 44.0648 | 48.804    | 29.5165 |
 
 
 ### Framework versions
 
 - Transformers 4.42.4
-- Pytorch 2.3.1+cu121
+- Pytorch 2.4.0+cu121
 - Datasets 2.21.0
 - Tokenizers 0.19.1
```
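The commit's hyperparameter change swaps the learning-rate schedule from linear to cosine decay. The shape of that schedule can be sketched in plain Python; this mirrors the form of `transformers`' `get_cosine_schedule_with_warmup` but is an illustrative sketch, not the library's code, and the warmup length is an assumption (the commit does not show one):

```python
import math

def cosine_lr(step, total_steps, base_lr, warmup_steps=0):
    """Cosine decay from base_lr to 0, with optional linear warmup.

    A sketch of the 'cosine' lr_scheduler_type: the rate ramps up
    linearly over warmup_steps, then follows half a cosine wave
    down to zero at total_steps.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Compared with a linear schedule, the cosine curve holds the rate near `base_lr` longer early on and decays more gently near the end of training.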
config.json CHANGED
```diff
@@ -18,7 +18,7 @@
   "decoder_layerdrop": 0.0,
   "decoder_layers": 12,
   "decoder_start_token_id": 2,
-  "dropout": 0.1,
+  "dropout": 0.3,
   "early_stopping": true,
   "encoder_attention_heads": 16,
   "encoder_ffn_dim": 4096,
```
model.safetensors CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a996079c7ea898ab67590499a7df4e3063639d63680d6767449e01343d481964
+oid sha256:1373397ebd3433e2102a0623450b03c0dab0bca21a2d5932be3da57c3998a39f
 size 1625422896
```
training_args.bin CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d31a2b64498f4d64053ce41274b1367fc759775c8d0085c7d91c9f6f60f391d6
+oid sha256:ac2e3051452b3e645f50ca82e57da79685dfaaae5f9e819072eab4874d3a0f17
 size 5240
```
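The two binary diffs above are Git LFS pointer files, not the weights themselves: only the `oid` changes while the sizes stay identical. A spec-v1 pointer is just three `key value` lines, so a small parser is enough to read one (a sketch against that format, not an official LFS client API):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file (spec v1) into a dict.

    A pointer is 'key value' lines: version, oid (algo:digest), size.
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    if not fields.get("version", "").startswith("https://git-lfs.github.com/spec/v1"):
        raise ValueError("not a v1 LFS pointer")
    algorithm, _, digest = fields["oid"].partition(":")
    return {"algorithm": algorithm, "oid": digest, "size": int(fields["size"])}
```

Applied to the `training_args.bin` pointer above, this yields the sha256 digest and the 5240-byte size, confirming the commit replaced the stored object without changing its length.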