Remove [BOS] from descriptions

README.md CHANGED

@@ -16,8 +16,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [gpt2-large](https://huggingface.co/gpt2-large) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.0773
-- Accuracy: 0.8482
 ## Model description
@@ -49,16 +49,16 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step  | Validation Loss | Accuracy |
 |:-------------:|:-----:|:-----:|:---------------:|:--------:|
-| 2.4891        | 0.19  | 1000  | 2.4467          | 0.8446   |
-| 2.7019        | 0.37  | 2000  | 2.3208          | 0.8456   |
-| 2.5278        | 0.56  | 3000  | 2.2470          | 0.8464   |
-| 2.0687        | 0.74  | 4000  | 2.1953          | 0.8468   |
-| 2.1738        | 0.93  | 5000  | 2.1543          | 0.8472   |
-| 1.8554        | 1.12  | 6000  | 2.1500          | 0.8475   |
-| 1.9276        | 1.3   | 7000  | 2.1223          | 0.8477   |
-| 1.7988        | 1.49  | 8000  | 2.1120          | 0.8479   |
-| 2.0632        | 1.67  | 9000  | 2.0973          | 0.8480   |
-| 1.9586        | 1.86  | 10000 | 2.0826          | 0.8481   |
 ### Framework versions

 This model is a fine-tuned version of [gpt2-large](https://huggingface.co/gpt2-large) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.0808
+- Accuracy: 0.8556
 ## Model description
 | Training Loss | Epoch | Step  | Validation Loss | Accuracy |
 |:-------------:|:-----:|:-----:|:---------------:|:--------:|
+| 2.4827        | 0.19  | 1000  | 2.4565          | 0.8520   |
+| 2.6468        | 0.37  | 2000  | 2.3303          | 0.8530   |
+| 2.5106        | 0.56  | 3000  | 2.2487          | 0.8537   |
+| 2.0732        | 0.74  | 4000  | 2.2020          | 0.8541   |
+| 2.159         | 0.93  | 5000  | 2.1594          | 0.8545   |
+| 1.856         | 1.12  | 6000  | 2.1518          | 0.8548   |
+| 1.9138        | 1.3   | 7000  | 2.1261          | 0.8551   |
+| 1.8055        | 1.49  | 8000  | 2.1126          | 0.8552   |
+| 2.0385        | 1.67  | 9000  | 2.1008          | 0.8554   |
+| 1.9648        | 1.86  | 10000 | 2.0858          | 0.8555   |
 ### Framework versions

all_results.json CHANGED

@@ -1,15 +1,15 @@
 {
     "epoch": 2.0,
-    "eval_accuracy": 0.8481798046914326,
-    "eval_loss": 2.0772647857666016,
-    "eval_runtime": 129.398,
     "eval_samples": 10750,
-    "eval_samples_per_second": 83.077,
-    "eval_steps_per_second": 10.387,
-    "perplexity": 7.982604892014763,
-    "train_loss": 2.1184611846914603,
-    "train_runtime": 4663.4223,
     "train_samples": 43003,
-    "train_samples_per_second": 18.443,
-    "train_steps_per_second": 2.306
 }

 {
     "epoch": 2.0,
+    "eval_accuracy": 0.8555773331979283,
+    "eval_loss": 2.0807912349700928,
+    "eval_runtime": 125.0813,
     "eval_samples": 10750,
+    "eval_samples_per_second": 85.944,
+    "eval_steps_per_second": 10.745,
+    "perplexity": 8.010804836289337,
+    "train_loss": 2.1236886782571673,
+    "train_runtime": 4504.4872,
     "train_samples": 43003,
+    "train_samples_per_second": 19.093,
+    "train_steps_per_second": 2.387
 }

eval_results.json CHANGED

@@ -1,10 +1,10 @@
 {
     "epoch": 2.0,
-    "eval_accuracy": 0.8481798046914326,
-    "eval_loss": 2.0772647857666016,
-    "eval_runtime": 129.398,
     "eval_samples": 10750,
-    "eval_samples_per_second": 83.077,
-    "eval_steps_per_second": 10.387,
-    "perplexity": 7.982604892014763
 }

 {
     "epoch": 2.0,
+    "eval_accuracy": 0.8555773331979283,
+    "eval_loss": 2.0807912349700928,
+    "eval_runtime": 125.0813,
     "eval_samples": 10750,
+    "eval_samples_per_second": 85.944,
+    "eval_steps_per_second": 10.745,
+    "perplexity": 8.010804836289337
 }
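
The `perplexity` field in both versions appears to be the exponential of `eval_loss`, so the small rise in loss (2.0773 to 2.0808) accounts for the rise in perplexity (7.98 to 8.01). The sketch below recomputes it from the values in the diff above; `perplexity_from_loss` is a hypothetical helper, and the exp-of-loss relationship is an assumption that happens to match the reported numbers.

```python
import math

# Values copied from the eval_results.json diff above.
runs = {
    "before": {"eval_loss": 2.0772647857666016, "perplexity": 7.982604892014763},
    "after": {"eval_loss": 2.0807912349700928, "perplexity": 8.010804836289337},
}


def perplexity_from_loss(eval_loss: float) -> float:
    """Assumed relationship: perplexity = exp(mean cross-entropy loss in nats)."""
    return math.exp(eval_loss)


for name, run in runs.items():
    recomputed = perplexity_from_loss(run["eval_loss"])
    print(f"{name}: reported={run['perplexity']:.6f} recomputed={recomputed:.6f}")
```

Note that accuracy improved (0.8482 to 0.8556) while loss got slightly worse, so the two metrics should be read separately when comparing the runs.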

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4c68c6a4031f800eda2b9d1b40fe6faa1bd0d8c017a068f2c3c79fc7f83d4eca
 size 3134045245

 version https://git-lfs.github.com/spec/v1
+oid sha256:5a854f911fcf67887abe92522ba2b32cb2eb42925c185792b4a089aaae7a38ae
 size 3134045245

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 2.0,
-    "train_loss": 2.1184611846914603,
-    "train_runtime": 4663.4223,
     "train_samples": 43003,
-    "train_samples_per_second": 18.443,
-    "train_steps_per_second": 2.306
 }

 {
     "epoch": 2.0,
+    "train_loss": 2.1236886782571673,
+    "train_runtime": 4504.4872,
     "train_samples": 43003,
+    "train_samples_per_second": 19.093,
+    "train_steps_per_second": 2.387
 }
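
As a sanity check on the runtime figures, the reported samples-per-second values look consistent with total samples processed divided by wall-clock runtime. This is a minimal sketch under that assumption (training counts `train_samples` once per epoch, evaluation is a single pass); `samples_per_second` is a hypothetical helper, not part of the training script.

```python
def samples_per_second(num_samples: int, passes: float, runtime_s: float) -> float:
    """Assumed definition: samples seen across all passes / runtime in seconds."""
    return num_samples * passes / runtime_s


# Values copied from the train_results.json and eval_results.json diffs above.
train_before = samples_per_second(43003, 2.0, 4663.4223)  # reported 18.443
train_after = samples_per_second(43003, 2.0, 4504.4872)   # reported 19.093
eval_before = samples_per_second(10750, 1.0, 129.398)     # reported 83.077
eval_after = samples_per_second(10750, 1.0, 125.0813)     # reported 85.944

print(f"train: {train_before:.3f} -> {train_after:.3f} samples/s")
print(f"eval:  {eval_before:.3f} -> {eval_after:.3f} samples/s")
```

The sample counts are identical across the two runs, so the throughput difference comes entirely from the shorter runtimes.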

trainer_state.json CHANGED

The diff for this file is too large to render. See raw diff

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2354ff3ec9e59058b3289218560a957a3cbc5faa14c40080e2daf94d9e8c0c3f
 size 3451

 version https://git-lfs.github.com/spec/v1
+oid sha256:0a4f0f455cfe353b38c6958e9dd682a9cb66e8e0afbc23961e3ce8be69bb0ab7
 size 3451