Ah7med committed commit 4de5bf4 (verified) · 1 parent: 0843b7f

Training complete

README.md ADDED
@@ -0,0 +1,87 @@
+ ---
+ library_name: transformers
+ license: apache-2.0
+ base_model: google/mt5-small
+ tags:
+ - summarization
+ - generated_from_trainer
+ datasets:
+ - samsum
+ metrics:
+ - rouge
+ model-index:
+ - name: mt5-small-finetuned
+ results:
+ - task:
+ name: Sequence-to-sequence Language Modeling
+ type: text2text-generation
+ dataset:
+ name: samsum
+ type: samsum
+ config: samsum
+ split: validation
+ args: samsum
+ metrics:
+ - name: Rouge1
+ type: rouge
+ value: 0.4303256962227823
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # mt5-small-finetuned
+
+ This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the samsum dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.7974
+ - Rouge1: 0.4303
+ - Rouge2: 0.2038
+ - Rougel: 0.3736
+ - Rougelsum: 0.3734
+
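The Rouge1 score above measures unigram overlap between generated and reference summaries. As a rough illustration only (the card's metric is the `rouge` package used by the Trainer, which additionally applies tokenization and normalization rules), a minimal ROUGE-1 F1 can be sketched as:

```python
from collections import Counter

def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap between two texts."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge1_f("the cat sat", "the cat ran")` scores 2 matching unigrams out of 3 on each side, giving an F1 of 2/3; the 0.4303 reported above is this kind of score averaged over the validation set (with the real metric's extra normalization).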
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5.6e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
+ - lr_scheduler_type: linear
+ - num_epochs: 8
+
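The per-epoch step counts logged during training follow directly from the batch size above. Assuming the samsum train split has 14,732 dialogues (an assumption about the split size; the card itself does not state it), each epoch is ceil(14732 / 8) = 1842 optimizer steps:

```python
import math

train_examples = 14_732   # assumed samsum train split size (not stated in the card)
train_batch_size = 8      # from the hyperparameters above
num_epochs = 8

# Last batch is smaller than 8, hence the ceiling division.
steps_per_epoch = math.ceil(train_examples / train_batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch)  # 1842, the step count logged after epoch 1
print(total_steps)      # 14736, the final step in the results below
```

This matches the Step column in the training results: 1842, 3684, ..., 14736.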
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
+ |:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|
+ | 2.1585 | 1.0 | 1842 | 1.9205 | 0.4074 | 0.1838 | 0.3517 | 0.3518 |
+ | 2.1545 | 2.0 | 3684 | 1.8882 | 0.4120 | 0.1914 | 0.3592 | 0.3588 |
+ | 2.0888 | 3.0 | 5526 | 1.8290 | 0.4196 | 0.1939 | 0.3603 | 0.3601 |
+ | 2.0272 | 4.0 | 7368 | 1.8269 | 0.4215 | 0.1975 | 0.3637 | 0.3635 |
+ | 1.9871 | 5.0 | 9210 | 1.8224 | 0.4231 | 0.1943 | 0.3634 | 0.3633 |
+ | 1.9535 | 6.0 | 11052 | 1.8055 | 0.4285 | 0.2030 | 0.3715 | 0.3715 |
+ | 1.9322 | 7.0 | 12894 | 1.7954 | 0.4270 | 0.2018 | 0.3698 | 0.3697 |
+ | 1.9181 | 8.0 | 14736 | 1.7974 | 0.4303 | 0.2038 | 0.3736 | 0.3734 |
+
+
+ ### Framework versions
+
+ - Transformers 4.47.0
+ - Pytorch 2.5.1+cu121
+ - Datasets 3.2.0
+ - Tokenizers 0.21.0
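With `lr_scheduler_type: linear` and no warmup steps reported, the learning rate presumably decays linearly from 5.6e-05 to zero over the 14,736 total steps (zero warmup is an assumption; the card does not list a warmup setting). A sketch of that schedule:

```python
def linear_lr(step: int, base_lr: float = 5.6e-05, total_steps: int = 14_736) -> float:
    """Linearly decay the learning rate from base_lr at step 0 to 0 at total_steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

print(linear_lr(0))       # 5.6e-05 at the start of training
print(linear_lr(7368))    # 2.8e-05, half the base rate at the halfway point
print(linear_lr(14736))   # 0.0 at the final step
```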
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+ "decoder_start_token_id": 0,
+ "eos_token_id": 1,
+ "pad_token_id": 0,
+ "transformers_version": "4.47.0"
+ }
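The config above pins the special-token ids used at generation time: in T5-family models such as mT5, the pad token (id 0) doubles as the decoder start token, and id 1 is end-of-sequence. A quick sanity check, parsing the same JSON:

```python
import json

# Same content as the generation_config.json added in this commit.
generation_config = json.loads("""
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.47.0"
}
""")

# mT5 starts decoding from the pad token, so the two ids must match.
assert generation_config["decoder_start_token_id"] == generation_config["pad_token_id"]
print(generation_config["eos_token_id"])  # 1
```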
runs/Feb12_23-49-16_3582313c7fa4/events.out.tfevents.1739404182.3582313c7fa4.31.3 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:11baff5faf4a68325cde11017cc3800b1a42974393afa7f9650fd04abf00688e
- size 10340
+ oid sha256:464658fe73c0083b93411f103cee38bfad9b049355f9d1b78977d649b0527a16
+ size 11168
runs/Feb12_23-49-16_3582313c7fa4/events.out.tfevents.1739408239.3582313c7fa4.31.4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6374b5d81084499909ba40c938e1389e3da64e232788e0cbba6fc5087bda6beb
+ size 562