yadheedhya committed
Commit 08a65c9 · 1 Parent(s): fc7624f
Update README.md

README.md CHANGED
---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
[…]
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# base

![model image](https://s3.amazonaws.com/moonup/production/uploads/1666363435475-62441d1d9fdefb55a0b7d12c.png)

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the cnn_dailymail 3.0.0 dataset.
It achieves the following results on the evaluation set:
- Loss: 1.4232
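The card itself does not include a usage snippet, so here is a minimal sketch of how a summarization fine-tune like this one is typically loaded and run with the `transformers` library. The repo id `yadheedhya/base` and the `"summarize: "` input prefix are assumptions (guessed from the card title and the usual T5 convention), not details confirmed by the card:

```python
# Minimal usage sketch (not from the card): load the fine-tuned checkpoint
# and generate a summary for one article.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_ID = "yadheedhya/base"  # assumption: replace with this repo's actual id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

article = "..."  # a news article, e.g. from cnn_dailymail
# "summarize: " is the usual T5-style task prefix; whether this fine-tune
# used it is an assumption.
inputs = tokenizer("summarize: " + article, return_tensors="pt",
                   truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```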

## Model description

- **Model type:** Language model
- **Language(s) (NLP):** English, Spanish, Japanese, Persian, Hindi, French, Chinese, Bengali, Gujarati, German, Telugu, Italian, Arabic, Polish, Tamil, Marathi, Malayalam, Oriya, Panjabi, Portuguese, Urdu, Galician, Hebrew, Korean, Catalan, Thai, Dutch, Indonesian, Vietnamese, Bulgarian, Filipino, Central Khmer, Lao, Turkish, Russian, Croatian, Swedish, Yoruba, Kurdish, Burmese, Malay, Czech, Finnish, Somali, Tagalog, Swahili, Sinhala, Kannada, Zhuang, Igbo, Xhosa, Romanian, Haitian, Estonian, Slovak, Lithuanian, Greek, Nepali, Assamese, Norwegian
- **License:** Apache 2.0
- **Related Models:** [All FLAN-T5 Checkpoints](https://huggingface.co/models?search=flan-t5)
- **Original Checkpoints:** [All Original FLAN-T5 Checkpoints](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints)
- **Resources for more information:**
  - [Research paper](https://arxiv.org/pdf/2210.11416.pdf)
  - [GitHub Repo](https://github.com/google-research/t5x)
  - [Hugging Face FLAN-T5 Docs (Similar to T5)](https://huggingface.co/docs/transformers/model_doc/t5)

## Intended uses & limitations

The information in this section is copied from the model's [official model card](https://arxiv.org/pdf/2210.11416.pdf):

> Language models, including Flan-T5, can potentially be used for language generation in a harmful way, according to Rae et al. (2021). Flan-T5 should not be used directly in any application, without a prior assessment of safety and fairness concerns specific to the application.

## Training and evaluation data

Results on the evaluation set:

- Loss: 1.4232
- Rouge1: 42.1388
- Rouge2: 19.7696
- Rougel: 30.1512
- Rougelsum: 39.3222
- Gen Len: 71.8562
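For context, scores like the ones above are typically produced with the Hugging Face `evaluate` library's `rouge` metric. A short sketch of that computation (the toy predictions and references are illustrative, not from this evaluation):

```python
# Sketch of how ROUGE scores like those above are usually computed with the
# `evaluate` library; the example strings are placeholders.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the cat sat on the mat"]       # model-generated summaries
references = ["a cat was sitting on the mat"]  # gold summaries

scores = rouge.compute(predictions=predictions, references=references)
# Keys include rouge1, rouge2, rougeL, rougeLsum, each in [0, 1];
# the card reports them scaled by 100 (e.g. Rouge1: 42.1388).
print({k: round(v * 100, 4) for k, v in scores.items()})
```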

## Training procedure

An example notebook showing this fine-tuning procedure for flan-T5, including pushing the model to the Hub:
[https://github.com/EveripediaNetwork/ai/blob/main/notebooks/Fine-Tuning-Flan-T5_1.ipynb](https://github.com/EveripediaNetwork/ai/blob/main/notebooks/Fine-Tuning-Flan-T5_1.ipynb)
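For readers who do not want to open the notebook, here is a condensed sketch of the usual `Seq2SeqTrainer` fine-tuning loop for this setup. The hyperparameter values are placeholders, not the ones used for this model (see the hyperparameters section below), and the `"summarize: "` prefix is an assumption:

```python
# Condensed sketch of a typical flan-t5 summarization fine-tune on
# cnn_dailymail with Seq2SeqTrainer; hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
dataset = load_dataset("cnn_dailymail", "3.0.0")

def preprocess(batch):
    # Tokenize article/highlights pairs; highlights become the labels.
    inputs = tokenizer(["summarize: " + a for a in batch["article"]],
                       max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["highlights"],
                       max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="base",
    learning_rate=5e-5,              # placeholder value
    per_device_train_batch_size=8,   # placeholder value
    num_train_epochs=1,              # placeholder value
    predict_with_generate=True,
    push_to_hub=True,                # uploads checkpoints to the Hub
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
trainer.push_to_hub()
```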

### Training hyperparameters