pszemraj/tFINE-850m-24x24-v0.5-instruct-L1

 ---
 library_name: transformers
+license: apache-2.0
+base_model: pszemraj/tFINE-850m-24x24-v0.4-flan_aug
+tags:
+- generated_from_trainer
+metrics:
+- rouge
+model-index:
+- name: tFINE-850m-24x24-v0.4-flan_aug-infinity-instruct-7m-T2T_en-1024-v5
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# tFINE-850m-24x24-v0.4-flan_aug-infinity-instruct-7m-T2T_en-1024-v5
+This model is a fine-tuned version of [pszemraj/tFINE-850m-24x24-v0.4-flan_aug](https://huggingface.co/pszemraj/tFINE-850m-24x24-v0.4-flan_aug) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.1526
+- Rouge1: 40.1804
+- Rouge2: 23.1008
+- Rougel: 32.3484
+- Rougelsum: 38.2103
+- Gen Len: 422.225
+- Num Input Tokens Seen: 421585440
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 776444
+- gradient_accumulation_steps: 32
+- total_train_batch_size: 128
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant_with_warmup
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 1.0
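
The batch-size entries in the list above are related by simple arithmetic. A minimal sketch, assuming a single training device (the card does not state the device count; `num_devices` is a hypothetical name introduced here):

```python
# Values copied from the hyperparameter list above.
train_batch_size = 4
gradient_accumulation_steps = 32
num_devices = 1  # assumption: not stated in the card

# Examples consumed per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 128
```

With more than one device, the reported total of 128 would require a smaller per-device batch or fewer accumulation steps than listed, so the single-device reading is the one consistent with these numbers.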
+### Training results
+| Training Loss | Epoch  | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len | Input Tokens Seen |
+|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|:-----------------:|
+| 1.8808        | 0.0807 | 1000  | 1.7883          | 24.1946 | 12.2099 | 20.4185 | 22.251    | 636.465 | 35147692          |
+| 1.6545        | 0.1613 | 2000  | 1.5985          | 28.9492 | 15.3233 | 23.871  | 26.9919   | 577.04  | 70510224          |
+| 1.5522        | 0.2420 | 3000  | 1.4907          | 30.4033 | 16.1354 | 24.7244 | 28.5037   | 537.77  | 105707144         |
+| 1.5059        | 0.3227 | 4000  | 1.4204          | 34.0294 | 19.2608 | 27.9322 | 32.3166   | 522.495 | 140722844         |
+| 1.4346        | 0.4034 | 5000  | 1.3636          | 34.4104 | 19.4149 | 28.1022 | 32.7299   | 494.68  | 175639924         |
+| 1.3912        | 0.4840 | 6000  | 1.3159          | 36.5059 | 21.2447 | 30.116  | 34.7303   | 469.885 | 210409328         |
+| 1.3148        | 0.5647 | 7000  | 1.2807          | 37.0123 | 21.3666 | 30.11   | 35.0891   | 458.28  | 245601908         |
+| 1.2859        | 0.6454 | 8000  | 1.2492          | 37.05   | 21.0468 | 29.7988 | 35.1882   | 452.495 | 280866724         |
+| 1.298         | 0.7260 | 9000  | 1.2211          | 36.6966 | 20.8189 | 29.7115 | 34.7528   | 464.37  | 316042068         |
+| 1.2834        | 0.8067 | 10000 | 1.1979          | 37.7181 | 20.9926 | 30.3857 | 35.8681   | 446.26  | 351056548         |
+| 1.2577        | 0.8874 | 11000 | 1.1752          | 39.3539 | 23.0123 | 31.9005 | 37.4941   | 424.445 | 386471860         |
+| 1.193         | 0.9680 | 12000 | 1.1526          | 40.1804 | 23.1008 | 32.3484 | 38.2103   | 422.225 | 421585440         |
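
The last row of the table can be used for a rough throughput check. This is only a sketch: the epoch value is rounded to four decimals, and how the Trainer counts input tokens (for example, whether padding is included) is not stated in the card, so the results are approximate.

```python
# Final row of the results table above.
step = 12000
epoch_fraction = 0.9680
input_tokens_seen = 421585440
total_train_batch_size = 128  # from the hyperparameter list

# Average input tokens consumed per optimizer step and per example.
tokens_per_step = input_tokens_seen / step
tokens_per_example = tokens_per_step / total_train_batch_size

# Estimated number of optimizer steps in one full epoch.
steps_per_epoch = step / epoch_fraction

print(f"{tokens_per_step:.0f} input tokens per optimizer step")
print(f"{tokens_per_example:.1f} input tokens per example on average")
print(f"about {steps_per_epoch:.0f} optimizer steps in a full epoch")
```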
+### Framework versions
+- Transformers 4.45.1
+- Pytorch 2.4.1+cu124
+- Datasets 3.0.1
+- Tokenizers 0.20.0

generation_config.json ADDED

@@ -0,0 +1,6 @@

+{
+  "decoder_start_token_id": 3,
+  "eos_token_id": 2,
+  "pad_token_id": 3,
+  "transformers_version": "4.45.1"
+}
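
The added config can be sanity-checked with the standard library alone. The sketch below copies the JSON from this commit and checks two properties: decoding starts from the padding id, and the end-of-sequence id is distinct from it. Starting the decoder from the pad token is the usual arrangement for T5-style encoder-decoder models; that this checkpoint follows the convention for the same reason is an assumption, not something the commit states.

```python
import json

# Contents of generation_config.json as added in this commit.
raw = """
{
  "decoder_start_token_id": 3,
  "eos_token_id": 2,
  "pad_token_id": 3,
  "transformers_version": "4.45.1"
}
"""
config = json.loads(raw)

# Decoding starts from the same id that is used for padding.
starts_from_pad = config["decoder_start_token_id"] == config["pad_token_id"]

# The end-of-sequence id differs from the pad id, so the two are not confused.
eos_is_distinct = config["eos_token_id"] != config["pad_token_id"]

print(starts_from_pad, eos_is_distinct)
```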