QLoRA applied #2

Files changed:
- README.md (+11 -4)
- adapter_config.json (+5 -5)
- adapter_model.bin (+2 -2)
- tokenizer.json (+3 -3)
- tokenizer_config.json (+5 -6)
- training_args.bin (+1 -1)
README.md
CHANGED
@@ -2,6 +2,8 @@
 base_model: ybelkada/falcon-7b-sharded-bf16
 tags:
 - generated_from_trainer
+metrics:
+- f1
 model-index:
 - name: falcon-7b-sharded-2
   results: []
@@ -13,6 +15,9 @@ should probably proofread and complete it, then remove this comment. -->
 # falcon-7b-sharded-2
 
 This model is a fine-tuned version of [ybelkada/falcon-7b-sharded-bf16](https://huggingface.co/ybelkada/falcon-7b-sharded-bf16) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: nan
+- F1: 0.0337
 
 ## Model description
 
@@ -32,11 +37,9 @@
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size:
+- train_batch_size: 2
 - eval_batch_size: 1
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
 - lr_scheduler_warmup_ratio: 0.03
@@ -44,6 +47,10 @@ The following hyperparameters were used during training:
 
 ### Training results
 
+| Training Loss | Epoch | Step | Validation Loss | F1     |
+|:-------------:|:-----:|:----:|:---------------:|:------:|
+| 7.6119        | 1.0   | 442  | nan             | 0.0337 |
+| 6.8711        | 1.13  | 500  | nan             | 0.0337 |
 
 
 ### Framework versions
@@ -51,4 +58,4 @@ The following hyperparameters were used during training:
 - Transformers 4.34.0
 - Pytorch 2.0.1+cu118
 - Datasets 2.14.5
-- Tokenizers 0.14.
+- Tokenizers 0.14.1
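For reference, the updated hyperparameter list maps onto the standard `transformers` Trainer setup roughly as follows. This is a sketch, not the commit's actual training script, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Minimal sketch reconstructing the card's hyperparameters; the "Adam with
# betas=(0.9,0.999) and epsilon=1e-08" line describes the Trainer's defaults.
training_args = TrainingArguments(
    output_dir="falcon-7b-sharded-2",  # placeholder
    learning_rate=2e-4,                # learning_rate: 0.0002
    per_device_train_batch_size=2,     # train_batch_size: 2
    per_device_eval_batch_size=1,      # eval_batch_size: 1
    seed=42,                           # seed: 42
    lr_scheduler_type="constant",      # lr_scheduler_type: constant
    warmup_ratio=0.03,                 # lr_scheduler_warmup_ratio: 0.03
)
```

Dropping the gradient_accumulation_steps and total_train_batch_size lines suggests the new run used no gradient accumulation, so the effective batch size equals train_batch_size.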
adapter_config.json
CHANGED
@@ -3,8 +3,8 @@
   "base_model_name_or_path": "ybelkada/falcon-7b-sharded-bf16",
   "fan_in_fan_out": false,
   "feedforward_modules": [
-    "
-    "
+    "dense_4h_to_h",
+    "dense_h_to_4h"
   ],
   "inference_mode": true,
   "init_ia3_weights": true,
@@ -14,8 +14,8 @@
   "target_modules": [
     "query_key_value",
     "dense",
-    "
-    "
+    "dense_4h_to_h",
+    "dense_h_to_4h"
   ],
-  "task_type": "
+  "task_type": "QUESTION_ANSWERING"
 }
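Despite the commit title, the fields in this file (`feedforward_modules`, `init_ia3_weights`) are those of peft's IA³ adapter rather than LoRA. A hedged sketch of how an equivalent config could be regenerated, with module names taken from the diff:

```python
from peft import IA3Config

# Sketch: rebuilds the updated adapter_config.json via peft's IA3Config.
# peft requires feedforward_modules to be a subset of target_modules, which
# holds for the module names below (Falcon's attention/MLP projections).
ia3_config = IA3Config(
    task_type="QUESTION_ANSWERING",
    target_modules=["query_key_value", "dense", "dense_4h_to_h", "dense_h_to_4h"],
    feedforward_modules=["dense_4h_to_h", "dense_h_to_4h"],
)
ia3_config.save_pretrained("adapter-config-out")  # writes adapter_config.json
```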
adapter_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:20c2c7dd065f99eab629a983caa759bb943b48e1e0572a30aee8f5332a9a10bd
+size 4170325
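The LFS pointer now targets a roughly 4 MB blob, i.e. adapter weights only; using them requires the full base model underneath. A sketch, with the adapter repo path as a placeholder:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Sketch: apply the ~4 MB adapter on top of the sharded bf16 base model.
# trust_remote_code may be needed if the base repo ships custom Falcon code.
base = AutoModelForCausalLM.from_pretrained(
    "ybelkada/falcon-7b-sharded-bf16", trust_remote_code=True
)
model = PeftModel.from_pretrained(base, "path/to/falcon-7b-sharded-2")  # placeholder
```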
tokenizer.json
CHANGED
@@ -2,9 +2,9 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length":
-    "strategy": "
-    "stride":
+    "max_length": 384,
+    "strategy": "OnlySecond",
+    "stride": 128
   },
   "padding": null,
   "added_tokens": [
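The new truncation block is the standard extractive-QA windowing setup: `OnlySecond` truncates only the second sequence of a (question, context) pair, and `stride: 128` makes overflowing windows overlap. In `transformers` terms the equivalent call looks roughly like this (the question/context strings are hypothetical):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("path/to/falcon-7b-sharded-2")  # placeholder

# "only_second": keep the question intact, truncate/window only the context;
# windows of up to 384 tokens overlap by 128 tokens.
enc = tok(
    "Who published the sharded base model?",       # hypothetical question
    "The bf16 shards were uploaded by ybelkada.",  # hypothetical context
    truncation="only_second",
    max_length=384,
    stride=128,
    return_overflowing_tokens=True,
)
```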
tokenizer_config.json
CHANGED
@@ -113,11 +113,10 @@
   ],
   "clean_up_tokenization_spaces": true,
   "eos_token": "<|endoftext|>",
-  "
+  "model_input_names": [
+    "input_ids",
+    "attention_mask"
+  ],
   "model_max_length": 2048,
-  "
-  "stride": 0,
-  "tokenizer_class": "PreTrainedTokenizerFast",
-  "truncation_side": "right",
-  "truncation_strategy": "longest_first"
+  "tokenizer_class": "PreTrainedTokenizerFast"
 }
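Limiting `model_input_names` to `input_ids` and `attention_mask` fixes which tensors the tokenizer returns by default (no `token_type_ids`). A quick check, again with a placeholder repo path:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("path/to/falcon-7b-sharded-2")  # placeholder
print(tok("hello world").keys())  # expect: input_ids, attention_mask
```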
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:3029d2fc1e20eadaf5b42a829fedb51d99fc527bbdcf3e8bd5112f5bb38a3e62
 size 4091
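`training_args.bin` is the pickled `TrainingArguments` object that the `Trainer` saves next to its checkpoints; its contents can be inspected by unpickling it:

```python
import torch

# The file is written with torch.save(trainer.args, "training_args.bin");
# unpickling requires transformers to be importable. On torch >= 2.6,
# pass weights_only=False explicitly.
args = torch.load("training_args.bin")
print(args.learning_rate, args.per_device_train_batch_size)
```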