emilstabil/DanSumT5-baseV_38821

Tags: Text2Text Generation, Transformers, PyTorch, mt5, Generated from Trainer, Inference Endpoints

emilstabil committed on Nov 19, 2023

Commit f04aa61 (1 parent: 28a4bcd)

End of training

Files changed (2)

README.md +34 -29
pytorch_model.bin +1 -1

README.md CHANGED

@@ -6,23 +6,23 @@ tags:
 metrics:
 - rouge
 model-index:
-- name: DanSumT5-base-finetuned-test_6887
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# DanSumT5-base-finetuned-test_6887
 This model is a fine-tuned version of [Danish-summarisation/DanSumT5-base](https://huggingface.co/Danish-summarisation/DanSumT5-base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.5277
-- Rouge1: 31.3188
-- Rouge2: 7.8236
-- Rougel: 17.8296
-- Rougelsum: 28.6162
-- Gen Len: 127.0
 ## Model description
@@ -42,34 +42,39 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 3
-- eval_batch_size: 3
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 12
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 15
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
-|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
-| No log        | 0.99  | 66   | 2.7401          | 29.9221 | 6.3861 | 16.5877 | 27.0611   | 125.09  |
-| No log        | 1.99  | 133  | 2.6815          | 30.6874 | 6.9413 | 16.9609 | 27.8851   | 125.98  |
-| No log        | 3.0   | 200  | 2.6440          | 31.2045 | 7.4012 | 17.7421 | 28.3497   | 126.63  |
-| No log        | 4.0   | 267  | 2.6199          | 31.3329 | 7.4574 | 17.8549 | 28.643    | 126.98  |
-| No log        | 4.99  | 333  | 2.5984          | 31.5184 | 7.7763 | 17.9153 | 29.0627   | 127.0   |
-| No log        | 5.99  | 400  | 2.5822          | 31.8839 | 7.9755 | 18.0572 | 29.2282   | 126.65  |
-| No log        | 7.0   | 467  | 2.5677          | 31.5939 | 7.9515 | 17.865  | 29.2019   | 126.45  |
-| 2.8684        | 8.0   | 534  | 2.5587          | 31.4931 | 7.6042 | 17.6853 | 28.8366   | 126.79  |
-| 2.8684        | 8.99  | 600  | 2.5496          | 31.105  | 7.6714 | 17.5128 | 28.5242   | 126.78  |
-| 2.8684        | 9.99  | 667  | 2.5423          | 31.6087 | 8.0358 | 17.9956 | 28.9514   | 126.78  |
-| 2.8684        | 11.0  | 734  | 2.5364          | 31.411  | 7.9534 | 17.895  | 28.7595   | 127.0   |
-| 2.8684        | 12.0  | 801  | 2.5326          | 31.4648 | 7.9777 | 17.9589 | 28.8168   | 127.0   |
-| 2.8684        | 12.99 | 867  | 2.5296          | 31.374  | 7.8341 | 17.8341 | 28.8146   | 127.0   |
-| 2.8684        | 13.99 | 934  | 2.5278          | 31.2822 | 7.7789 | 17.7983 | 28.5903   | 127.0   |
-| 2.8684        | 14.83 | 990  | 2.5277          | 31.3188 | 7.8236 | 17.8296 | 28.6162   | 127.0   |
 ### Framework versions

 metrics:
 - rouge
 model-index:
+- name: DanSumT5-baseV_38821
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# DanSumT5-baseV_38821
 This model is a fine-tuned version of [Danish-summarisation/DanSumT5-base](https://huggingface.co/Danish-summarisation/DanSumT5-base) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.2026
+- Rouge1: 34.9358
+- Rouge2: 11.6813
+- Rougel: 21.4935
+- Rougelsum: 27.4979
+- Gen Len: 126.3262
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
+- train_batch_size: 2
+- eval_batch_size: 2
 - seed: 42
 - gradient_accumulation_steps: 4
+- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 20
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
+|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
+| No log        | 1.0   | 232  | 2.4684          | 33.3966 | 9.9982  | 19.6472 | 27.3865   | 126.8712 |
+| No log        | 2.0   | 465  | 2.3905          | 34.2228 | 10.5192 | 20.3584 | 27.4209   | 126.8712 |
+| 2.8064        | 3.0   | 697  | 2.3486          | 34.5949 | 11.0682 | 20.8844 | 27.3403   | 126.6738 |
+| 2.8064        | 4.0   | 930  | 2.3193          | 34.6865 | 11.0996 | 20.9574 | 27.337    | 126.2318 |
+| 2.5767        | 5.0   | 1162 | 2.2963          | 34.3101 | 11.0183 | 20.8461 | 27.155    | 126.721  |
+| 2.5767        | 6.0   | 1395 | 2.2774          | 34.9299 | 11.5927 | 21.3549 | 27.7805   | 126.4249 |
+| 2.483         | 7.0   | 1627 | 2.2646          | 34.4741 | 11.1383 | 21.2722 | 27.3822   | 126.3004 |
+| 2.483         | 8.0   | 1860 | 2.2521          | 34.9384 | 11.2651 | 21.3153 | 27.5792   | 126.9828 |
+| 2.4134        | 9.0   | 2092 | 2.2410          | 34.9546 | 11.424  | 21.1427 | 27.6608   | 126.7854 |
+| 2.4134        | 10.0  | 2325 | 2.2326          | 34.7566 | 11.5721 | 21.4418 | 27.5167   | 126.7425 |
+| 2.3576        | 11.0  | 2557 | 2.2263          | 34.5968 | 11.623  | 21.2384 | 27.365    | 126.4506 |
+| 2.3576        | 12.0  | 2790 | 2.2194          | 34.7363 | 11.5612 | 21.47   | 27.6572   | 126.5665 |
+| 2.3288        | 13.0  | 3022 | 2.2142          | 34.971  | 11.7203 | 21.49   | 27.7418   | 126.5665 |
+| 2.3288        | 14.0  | 3255 | 2.2114          | 34.761  | 11.6621 | 21.3963 | 27.568    | 126.6266 |
+| 2.3288        | 15.0  | 3487 | 2.2064          | 34.9197 | 11.5475 | 21.4017 | 27.6388   | 126.3305 |
+| 2.2951        | 16.0  | 3720 | 2.2067          | 34.8124 | 11.615  | 21.5177 | 27.605    | 126.3605 |
+| 2.2951        | 17.0  | 3952 | 2.2042          | 34.7608 | 11.4738 | 21.3464 | 27.379    | 126.4034 |
+| 2.2832        | 18.0  | 4185 | 2.2032          | 34.7593 | 11.6239 | 21.4029 | 27.4669   | 126.2489 |
+| 2.2832        | 19.0  | 4417 | 2.2029          | 34.8386 | 11.5919 | 21.4719 | 27.5147   | 126.2318 |
+| 2.2571        | 19.96 | 4640 | 2.2026          | 34.9358 | 11.6813 | 21.4935 | 27.4979   | 126.3262 |
 ### Framework versions
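
The hyperparameter lists in this diff report `total_train_batch_size` next to `train_batch_size` and `gradient_accumulation_steps`. The sketch below shows how those values relate, assuming a single training device (which fits both 3 x 4 = 12 and 2 x 4 = 8). The helper name `effective_batch_size` and the training-set size estimate are illustrative assumptions, not taken from the model card.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values from the removed (-) and added (+) sides of the README diff.
old_total = effective_batch_size(per_device_batch_size=3, gradient_accumulation_steps=4)
new_total = effective_batch_size(per_device_batch_size=2, gradient_accumulation_steps=4)

# Rough estimate only: the last row of the new results table reports
# step 4640 at epoch 19.96, so steps per epoch is about 4640 / 19.96.
# Multiplying by the effective batch size approximates the number of
# training examples (assumption: one optimizer step per logged step).
steps_per_epoch = 4640 / 19.96
approx_train_examples = round(steps_per_epoch * new_total)

print(old_total, new_total, approx_train_examples)
```

With more than one device, the total would also scale with `num_devices`, which is why that parameter is kept explicit here.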

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6966dfd32e10b7b7ff74248f30d567803eac7ea4a89b1c2835c6c6c3bf48cbeb
 size 2329703026

 version https://git-lfs.github.com/spec/v1
+oid sha256:0a2e0948183a0d455972c242ee4d5c559a016ca4442cc1504ff32043d371c508
 size 2329703026
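
The `pytorch_model.bin` entry above is a Git LFS pointer: only the `oid sha256:` line changes, while `size` stays the same. A downloaded weights file could be checked against such a pointer roughly as follows. This is a minimal sketch using only the Python standard library; the function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of Git LFS or the Hugging Face tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of an LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    algorithm, _, expected_digest = pointer["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm!r}")

    file_path = Path(path)
    # Cheap check first: a size mismatch rules the file out without hashing.
    if file_path.stat().st_size != int(pointer["size"]):
        return False

    digest = hashlib.sha256()
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest


# Pointer text copied from the added (+) side of the diff.
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:0a2e0948183a0d455972c242ee4d5c559a016ca4442cc1504ff32043d371c508\n"
    "size 2329703026\n"
)
```

Calling `matches_pointer("pytorch_model.bin", new_pointer)` on a local copy would then indicate whether that copy corresponds to this commit rather than its parent; the local path is an assumption.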