	---
	license: apache-2.0
	base_model: t5-small
	tags:
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: genz_model1
	  results: []
	---


	# genz_model1

	This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unspecified dataset (the auto-generated card recorded the dataset name as `None`).
	It achieves the following results on the evaluation set:
	- Loss: 1.2337
	- Bleu: 37.5629
	- Gen Len: 15.215

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 50
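To make the `linear` scheduler entry concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists no warmup setting) and takes the total of 5350 optimizer steps from the training results table. The function name `linear_lr` is illustrative and not part of any library.

```python
def linear_lr(step: int,
              total_steps: int = 5350,   # final step in the results table
              base_lr: float = 2e-5,     # learning_rate from the card
              warmup_steps: int = 0) -> float:  # assumption: no warmup
    """Learning rate after `step` updates under linear warmup then linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for step in (0, 2675, 5350):
        print(step, linear_lr(step))
```

Under these assumptions the rate is halved by the end of epoch 25 and reaches zero at the final step, which fits the very small validation-loss changes in the last epochs.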

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
	| No log | 1.0 | 107 | 2.0122 | 27.3045 | 15.4416 |
	| No log | 2.0 | 214 | 1.8166 | 32.1348 | 15.285 |
	| No log | 3.0 | 321 | 1.7273 | 32.6473 | 15.4603 |
	| No log | 4.0 | 428 | 1.6669 | 32.8528 | 15.514 |
	| 1.9696 | 5.0 | 535 | 1.6214 | 33.6367 | 15.507 |
	| 1.9696 | 6.0 | 642 | 1.5815 | 33.5927 | 15.4743 |
	| 1.9696 | 7.0 | 749 | 1.5481 | 34.0762 | 15.5 |
	| 1.9696 | 8.0 | 856 | 1.5236 | 34.3891 | 15.4416 |
	| 1.9696 | 9.0 | 963 | 1.4948 | 34.0203 | 15.4673 |
	| 1.56 | 10.0 | 1070 | 1.4733 | 33.9927 | 15.4416 |
	| 1.56 | 11.0 | 1177 | 1.4559 | 34.468 | 15.3972 |
	| 1.56 | 12.0 | 1284 | 1.4334 | 34.3625 | 15.3785 |
	| 1.56 | 13.0 | 1391 | 1.4167 | 34.721 | 15.3388 |
	| 1.56 | 14.0 | 1498 | 1.4017 | 34.7409 | 15.4136 |
	| 1.4159 | 15.0 | 1605 | 1.3886 | 34.7995 | 15.3738 |
	| 1.4159 | 16.0 | 1712 | 1.3733 | 34.7944 | 15.3879 |
	| 1.4159 | 17.0 | 1819 | 1.3627 | 35.0969 | 15.4089 |
	| 1.4159 | 18.0 | 1926 | 1.3517 | 35.157 | 15.3505 |
	| 1.3203 | 19.0 | 2033 | 1.3452 | 34.9134 | 15.2126 |
	| 1.3203 | 20.0 | 2140 | 1.3325 | 35.5535 | 15.3084 |
	| 1.3203 | 21.0 | 2247 | 1.3268 | 35.9899 | 15.2056 |
	| 1.3203 | 22.0 | 2354 | 1.3163 | 36.1116 | 15.243 |
	| 1.3203 | 23.0 | 2461 | 1.3115 | 36.2296 | 15.1752 |
	| 1.2505 | 24.0 | 2568 | 1.3038 | 36.5635 | 15.2056 |
	| 1.2505 | 25.0 | 2675 | 1.2996 | 36.7848 | 15.2243 |
	| 1.2505 | 26.0 | 2782 | 1.2914 | 36.3015 | 15.2336 |
	| 1.2505 | 27.0 | 2889 | 1.2856 | 36.73 | 15.2664 |
	| 1.2505 | 28.0 | 2996 | 1.2810 | 36.8486 | 15.2897 |
	| 1.1949 | 29.0 | 3103 | 1.2780 | 37.1042 | 15.243 |
	| 1.1949 | 30.0 | 3210 | 1.2729 | 37.1394 | 15.2617 |
	| 1.1949 | 31.0 | 3317 | 1.2673 | 36.9584 | 15.2967 |
	| 1.1949 | 32.0 | 3424 | 1.2637 | 37.4488 | 15.2547 |
	| 1.156 | 33.0 | 3531 | 1.2607 | 37.3112 | 15.278 |
	| 1.156 | 34.0 | 3638 | 1.2573 | 37.5048 | 15.2313 |
	| 1.156 | 35.0 | 3745 | 1.2532 | 37.4771 | 15.2967 |
	| 1.156 | 36.0 | 3852 | 1.2512 | 37.4967 | 15.3014 |
	| 1.156 | 37.0 | 3959 | 1.2494 | 37.5326 | 15.236 |
	| 1.1272 | 38.0 | 4066 | 1.2470 | 37.5807 | 15.2266 |
	| 1.1272 | 39.0 | 4173 | 1.2455 | 37.5478 | 15.229 |
	| 1.1272 | 40.0 | 4280 | 1.2435 | 37.7117 | 15.236 |
	| 1.1272 | 41.0 | 4387 | 1.2402 | 37.3874 | 15.2547 |
	| 1.1272 | 42.0 | 4494 | 1.2389 | 37.584 | 15.243 |
	| 1.11 | 43.0 | 4601 | 1.2377 | 37.5384 | 15.2336 |
	| 1.11 | 44.0 | 4708 | 1.2364 | 37.5339 | 15.2453 |
	| 1.11 | 45.0 | 4815 | 1.2362 | 37.5626 | 15.229 |
	| 1.11 | 46.0 | 4922 | 1.2355 | 37.518 | 15.222 |
	| 1.0999 | 47.0 | 5029 | 1.2343 | 37.5847 | 15.243 |
	| 1.0999 | 48.0 | 5136 | 1.2339 | 37.5871 | 15.2313 |
	| 1.0999 | 49.0 | 5243 | 1.2338 | 37.5592 | 15.236 |
	| 1.0999 | 50.0 | 5350 | 1.2337 | 37.5629 | 15.215 |
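Two things can be read off the table with a little arithmetic; the sketch below shows both. First, 107 steps per epoch at a batch size of 16 bounds the training set size, assuming a single device, no gradient accumulation, and a kept final partial batch (none of which the card states). Second, the epoch with the best BLEU is not the same as the epoch with the lowest validation loss. The `ROWS` list is a hand-picked subset of table rows, and the helper names are illustrative.

```python
STEPS_PER_EPOCH = 107    # step count at epoch 1.0 in the table
TRAIN_BATCH_SIZE = 16    # train_batch_size from the hyperparameters


def train_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple[int, int]:
    """Smallest and largest N such that ceil(N / batch_size) == steps_per_epoch."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


# (epoch, validation loss, BLEU) copied from a few rows of the table above.
ROWS = [
    (1, 2.0122, 27.3045),
    (25, 1.2996, 36.7848),
    (40, 1.2435, 37.7117),
    (48, 1.2339, 37.5871),
    (50, 1.2337, 37.5629),
]

best_by_bleu = max(ROWS, key=lambda row: row[2])   # highest BLEU
best_by_loss = min(ROWS, key=lambda row: row[1])   # lowest validation loss

if __name__ == "__main__":
    print("training examples (estimated range):",
          train_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE))
    print("best BLEU at epoch", best_by_bleu[0])
    print("lowest validation loss at epoch", best_by_loss[0])
```

Epoch 40 has the highest BLEU in the full table (37.7117), while the reported final checkpoint at epoch 50 has the lowest validation loss (1.2337) and a slightly lower BLEU (37.5629). Whether that gap matters depends on which metric the model was selected on, which the card does not say.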


	### Framework versions

	- Transformers 4.31.0
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.3
	- Tokenizers 0.13.3
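When trying to reproduce the run, it can help to check the local environment against the versions listed above. The following is a minimal sketch using only the standard library; the helper name `version_mismatches` is hypothetical, and the comparison is an exact string match, so a PyTorch build with a different CUDA suffix than `+cu118` is reported as a mismatch.

```python
from importlib import metadata

# Versions from the "Framework versions" list, keyed by PyPI package name.
PINNED = {
    "transformers": "4.31.0",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.3",
    "tokenizers": "0.13.3",
}


def version_mismatches(pinned, lookup=metadata.version):
    """Return {package: (wanted, found)} for packages that differ or are missing.

    `lookup` maps a package name to its installed version string and may be
    swapped out for testing; `found` is None when the package is not installed.
    """
    mismatches = {}
    for package, wanted in pinned.items():
        try:
            found = lookup(package)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in version_mismatches(PINNED).items():
        print(f"{package}: card lists {wanted}, installed {found}")
```

An empty result means the environment matches the card exactly; a mismatch does not necessarily mean the model will fail to load.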
