	---
	license: apache-2.0
	base_model: google/mt5-base
	tags:
	- generated_from_keras_callback
	model-index:
	- name: pakawadeep/mt5-base-finetuned-ctfl-augmented_2
	  results: []
	---

	<!-- This model card has been generated automatically according to the information Keras had access to. You should
	probably proofread and complete it, then remove this comment. -->

	# pakawadeep/mt5-base-finetuned-ctfl-augmented_2

	This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on an unknown dataset.
	At the final epoch (28), it reports the following training and validation metrics:
	- Train Loss: 0.4142
	- Validation Loss: 0.7725
	- Train Rouge1: 8.6516
	- Train Rouge2: 0.8911
	- Train Rougel: 8.6634
	- Train Rougelsum: 8.8579
	- Train Gen Len: 11.9307
	- Epoch: 28
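
The checkpoint can presumably be loaded like any other TensorFlow seq2seq model on the Hub. The following is a minimal sketch, not a documented usage recipe: it assumes the repository contains TensorFlow weights and tokenizer files, that `sentencepiece` is installed for the mT5 tokenizer, and that the input needs no task prefix. The helper name `generate` and the `max_new_tokens=32` default are illustrative choices, not values taken from this card.

```python
MODEL_ID = "pakawadeep/mt5-base-finetuned-ctfl-augmented_2"


def generate(text: str, max_new_tokens: int = 32) -> str:
    """Generate output text for one input string (hypothetical helper)."""
    # Imported lazily so this module can be imported without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    # Assumption: the Hub repository ships both tokenizer files and TF weights.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="tf")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because the training data and task are not documented here, the expected input format is unknown; treat any output as exploratory.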

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
	- training_precision: float32
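
The optimizer entry above is a Python dict literal, so it can be parsed and passed on as keyword arguments. The sketch below is one possible way to do that, assuming the `AdamWeightDecay` class exported by `transformers` is the optimizer named in the card. The helper names `optimizer_kwargs` and `build_optimizer` are hypothetical, and dropping the `decay` entry (which is `0.0` here) is an assumption made so the arguments do not depend on which Keras optimizer base class is in use.

```python
import ast

# Optimizer entry copied verbatim from the hyperparameter list above.
OPTIMIZER_LINE = (
    "{'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, "
    "'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, "
    "'weight_decay_rate': 0.01}"
)


def optimizer_kwargs(line: str = OPTIMIZER_LINE) -> dict:
    """Parse the dict literal and return keyword arguments for the optimizer."""
    config = ast.literal_eval(line)
    # Assumption: `decay` is 0.0 (no learning-rate decay), so it is safe to drop.
    config.pop("decay", None)
    return config


def build_optimizer():
    """Instantiate the optimizer (requires transformers with TensorFlow)."""
    # Imported lazily so the parsing helper works without TensorFlow installed.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**optimizer_kwargs())
```

Calling `optimizer_kwargs()` alone is enough to inspect the settings; `build_optimizer()` is only needed when re-running training.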

	### Training results

	| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
	|:----------:|:---------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:-----:|
	| 6.2987 | 3.0653 | 5.1273 | 0.9901 | 5.0743 | 5.2334 | 7.9208 | 0 |
	| 2.9686 | 2.1339 | 6.0644 | 1.3201 | 6.1056 | 6.1056 | 9.7277 | 1 |
	| 1.9868 | 1.6122 | 6.4356 | 1.6832 | 6.5535 | 6.5064 | 11.2970 | 2 |
	| 1.4819 | 1.2447 | 8.2744 | 2.1782 | 8.3663 | 8.4017 | 11.7921 | 3 |
	| 1.5597 | 1.4425 | 7.9208 | 2.3762 | 7.9915 | 8.0387 | 11.7376 | 4 |
	| 1.2413 | 1.2182 | 8.8048 | 2.1782 | 8.8826 | 8.9109 | 11.8713 | 5 |
	| 1.2091 | 1.2376 | 7.7793 | 1.3861 | 7.9208 | 7.9915 | 11.8861 | 6 |
	| 1.0808 | 1.1154 | 8.2744 | 1.3861 | 8.4512 | 8.5219 | 11.9455 | 7 |
	| 0.9719 | 1.0578 | 7.9915 | 1.1881 | 8.1683 | 8.2037 | 11.9604 | 8 |
	| 1.0497 | 1.0547 | 8.4394 | 1.3861 | 8.4925 | 8.6103 | 11.9158 | 9 |
	| 0.9426 | 1.0468 | 8.2744 | 1.3861 | 8.4512 | 8.5219 | 11.9455 | 10 |
	| 0.8904 | 0.9902 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.9257 | 11 |
	| 0.8371 | 0.9637 | 7.7970 | 0.8911 | 7.9562 | 7.9915 | 11.9505 | 12 |
	| 0.8025 | 0.9304 | 7.9562 | 0.8911 | 8.0151 | 8.1094 | 11.9109 | 13 |
	| 0.7650 | 0.9143 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.9257 | 14 |
	| 0.7276 | 0.8825 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.9059 | 15 |
	| 0.6877 | 0.8607 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.9257 | 16 |
	| 0.6566 | 0.8303 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.9257 | 17 |
	| 0.6205 | 0.8124 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.9307 | 18 |
	| 0.5878 | 0.7924 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.9257 | 19 |
	| 0.5535 | 0.7724 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.8911 | 20 |
	| 0.5243 | 0.7751 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.9109 | 21 |
	| 0.5444 | 0.8057 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.8911 | 22 |
	| 0.5281 | 0.7875 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.8663 | 23 |
	| 0.4990 | 0.7810 | 8.1565 | 0.8911 | 8.2567 | 8.3687 | 11.8762 | 24 |
	| 0.4762 | 0.7858 | 8.6516 | 0.8911 | 8.6634 | 8.8579 | 11.8812 | 25 |
	| 0.4538 | 0.7793 | 8.6516 | 0.8911 | 8.6634 | 8.8579 | 11.9158 | 26 |
	| 0.4330 | 0.7813 | 8.6516 | 0.8911 | 8.6634 | 8.8579 | 11.8762 | 27 |
	| 0.4142 | 0.7725 | 8.6516 | 0.8911 | 8.6634 | 8.8579 | 11.9307 | 28 |
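
To see where validation loss bottoms out in the table above, the column can be scanned for its minimum. This is a small sketch over values copied from the table; the helper names `best_epoch` and `epochs_since_improvement` are hypothetical and not part of the training script.

```python
# Validation loss per epoch, copied from the table above (list index = epoch).
VAL_LOSS = [
    3.0653, 2.1339, 1.6122, 1.2447, 1.4425, 1.2182, 1.2376, 1.1154,
    1.0578, 1.0547, 1.0468, 0.9902, 0.9637, 0.9304, 0.9143, 0.8825,
    0.8607, 0.8303, 0.8124, 0.7924, 0.7724, 0.7751, 0.8057, 0.7875,
    0.7810, 0.7858, 0.7793, 0.7813, 0.7725,
]


def best_epoch(losses):
    """Return (epoch, loss) for the lowest loss; ties go to the earliest epoch."""
    epoch = min(range(len(losses)), key=losses.__getitem__)
    return epoch, losses[epoch]


def epochs_since_improvement(losses):
    """Count the epochs trained after the best validation loss was reached."""
    epoch, _ = best_epoch(losses)
    return len(losses) - 1 - epoch


if __name__ == "__main__":
    epoch, loss = best_epoch(VAL_LOSS)
    print(f"best validation loss {loss} at epoch {epoch}")
    print(f"epochs since improvement: {epochs_since_improvement(VAL_LOSS)}")
```

On these values the minimum is 0.7724 at epoch 20, marginally below the final epoch's 0.7725, while training loss falls from 0.5535 to 0.4142 over the same eight epochs.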


	### Framework versions

	- Transformers 4.41.2
	- TensorFlow 2.15.0
	- Datasets 2.20.0
	- Tokenizers 0.19.1
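
When reproducing the run, it may help to compare the local environment against the versions listed above. The sketch below uses only the standard library; the distribution names are assumed to match the usual PyPI package names, and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed above; the keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.41.2",
    "tensorflow": "2.15.0",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (wanted, installed or None, versions match)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: wanted {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment differs from the one recorded in this card.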