psxjp5/mt5-small_25

	---
	license: apache-2.0
	base_model: google/mt5-small
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	- bleu
	model-index:
	- name: mt5-small_test
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mt5-small_test

	This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unspecified dataset.
	It achieves the following results on the evaluation set (these match the last logged row of the training results, step 1225):
	- Loss: 0.7284
	- Rouge1: 43.3718
	- Rouge2: 37.5973
	- Rougel: 42.0502
	- Rougelsum: 42.0648
	- Bleu: 32.8345
	- Gen Len: 12.6063
	- Meteor: 0.3949
	- True negatives: 70.2115
	- False negatives: 11.206
	- Cosine Sim: 0.7485
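
To try the checkpoint whose scores are listed above, a minimal inference sketch might look like the following. It assumes the repository ID `psxjp5/mt5-small_25` from the page header, the standard `transformers` auto classes, and a hypothetical `generate` helper; the expected input format is not documented in this card, so the example text passed to it is a placeholder.

```python
# Repository ID taken from the page header; adjust if the model lives elsewhere.
MODEL_ID = "psxjp5/mt5-small_25"


def generate(text, max_new_tokens=32):
    """Run one input through the fine-tuned model and return the decoded output.

    Hypothetical helper, not part of the original card. The import is done
    lazily so that defining the function does not require transformers.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors, truncating inputs that exceed the model limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate("Replace this with an input in the fine-tuning format."))
```

The `max_new_tokens=32` default is a guess that leaves headroom over the reported average generation length of about 12.6 tokens.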

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.001
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 9
	- gradient_accumulation_steps: 8
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 20
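
The values above could map onto `transformers.Seq2SeqTrainingArguments` roughly as sketched below. This is a reconstruction, not the author's script: the use of the seq2seq variant, `predict_with_generate=True`, the output directory name, a single training device, and zero warmup steps are all assumptions. The helper names `total_train_batch_size`, `linear_lr`, and `build_training_args` are hypothetical.

```python
# Values copied from the hyperparameter list above. The dict keys are the
# corresponding argument names of transformers.Seq2SeqTrainingArguments.
HPARAMS = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 9,
    "gradient_accumulation_steps": 8,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
}


def total_train_batch_size(hp, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, base_lr=HPARAMS["learning_rate"], warmup_steps=0):
    """Learning rate under a linear schedule: ramp up in warmup, then decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


def build_training_args(output_dir="mt5-small_test"):
    """Build the arguments object; requires transformers to be installed."""
    from transformers import Seq2SeqTrainingArguments

    # predict_with_generate is assumed because the card reports generation
    # metrics (ROUGE, BLEU, Gen Len); the card itself does not state it.
    return Seq2SeqTrainingArguments(
        output_dir=output_dir, predict_with_generate=True, **HPARAMS
    )
```

With one device, `total_train_batch_size(HPARAMS)` gives 16 × 8 = 128, which agrees with the `total_train_batch_size` entry in the list. The total number of optimizer steps is not stated; the results table logs about 175 steps per epoch, so 20 epochs would come to roughly 3,500 steps, which is only an estimate to pass as `total_steps`.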

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Bleu    | Gen Len | Meteor | True negatives | False negatives | Cosine Sim |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|:-------:|:------:|:--------------:|:---------------:|:----------:|
	| 3.1455        | 1.0   | 175  | 0.9832          | 18.7269 | 15.517  | 18.22   | 18.223    | 7.0634  | 7.6229  | 0.1626 | 74.6828        | 57.1687         | 0.3949     |
	| 1.1623        | 1.99  | 350  | 0.8542          | 38.7603 | 32.7237 | 37.3447 | 37.3752   | 27.4323 | 12.5135 | 0.3487 | 60.0           | 15.942          | 0.6992     |
	| 0.9431        | 2.99  | 525  | 0.8017          | 41.5759 | 35.6108 | 40.2536 | 40.2695   | 30.7994 | 12.8117 | 0.3755 | 61.2689        | 12.3447         | 0.7304     |
	| 0.8119        | 3.98  | 700  | 0.7787          | 43.5881 | 37.4245 | 42.1096 | 42.1248   | 32.9646 | 13.2176 | 0.3947 | 59.1541        | 9.5238          | 0.7582     |
	| 0.7235        | 4.98  | 875  | 0.7477          | 43.4069 | 37.2246 | 41.8444 | 41.8616   | 32.9345 | 13.116  | 0.3946 | 63.0816        | 9.8085          | 0.7561     |
	| 0.6493        | 5.97  | 1050 | 0.7266          | 40.4506 | 35.0072 | 39.1206 | 39.1181   | 29.0601 | 11.748  | 0.3687 | 75.5287        | 17.2101         | 0.7071     |
	| 0.5871        | 6.97  | 1225 | 0.7284          | 43.3718 | 37.5973 | 42.0502 | 42.0648   | 32.8345 | 12.6063 | 0.3949 | 70.2115        | 11.206          | 0.7485     |
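
The logged rows do not agree on a single best step, which a short script can make explicit. The sketch below copies the Step, Validation Loss, and Rouge1 columns from the table above and picks the best row under each criterion. The helper names are hypothetical, and the card does not say which criterion, if any, was used to select the published weights.

```python
# (step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (175, 0.9832, 18.7269),
    (350, 0.8542, 38.7603),
    (525, 0.8017, 41.5759),
    (700, 0.7787, 43.5881),
    (875, 0.7477, 43.4069),
    (1050, 0.7266, 40.4506),
    (1225, 0.7284, 43.3718),
]


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def best_by_rouge1(rows):
    """Return the row with the highest Rouge1 score."""
    return max(rows, key=lambda row: row[2])


if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss(ROWS))
    print("highest Rouge1:", best_by_rouge1(ROWS))
```

Validation loss is lowest at step 1050 (0.7266), while Rouge1 peaks at step 700 (43.5881); the headline numbers at the top of the card correspond to step 1225, which is best under neither measure.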


	### Framework versions

	- Transformers 4.31.0
	- PyTorch 2.0.1+cu118
	- Datasets 2.13.1
	- Tokenizers 0.13.3
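
To check whether a local environment matches the versions listed above, something like the following could be used. It relies only on the standard library's `importlib.metadata`; the PyPI distribution names (for example `torch` for PyTorch) and the helper name `version_mismatches` are assumptions for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.31.0",
    "torch": "2.0.1+cu118",
    "datasets": "2.13.1",
    "tokenizers": "0.13.3",
}


def version_mismatches(expected=EXPECTED):
    """Map each package whose installed version differs to (wanted, installed).

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in version_mismatches().items():
        print(f"{package}: card lists {wanted}, found {installed}")
```

An exact match is stricter than usually necessary; newer releases may load the checkpoint without problems, but results could differ slightly from those reported here.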