thombrysmith/judge_JuDe

@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.4527
 ## Model description
@@ -35,25 +35,38 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 8
-- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 3.0
 ### Training results
 | Training Loss | Epoch | Step  | Validation Loss |
 |:-------------:|:-----:|:-----:|:---------------:|
-| 2.6296        | 1.0   | 16501 | 2.5378          |
-| 2.5295        | 2.0   | 33002 | 2.4721          |
-| 2.4953        | 3.0   | 49503 | 2.4527          |
 ### Framework versions
 - Transformers 4.34.1
-- Pytorch 2.0.1+cu118
-- Datasets 2.14.6
 - Tokenizers 0.14.1

 This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.4628
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 4
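To make the `lr_scheduler_type: linear` setting concrete, here is a minimal sketch of how the learning rate would decay over this run. It assumes zero warmup steps (the card lists none) and a decay to zero at the final step; the total of 32912 steps is taken from the last row of the training results table. The function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step: int,
              total_steps: int = 32912,   # final step in the results table
              base_lr: float = 2e-05,     # learning_rate from the card
              warmup_steps: int = 0) -> float:  # assumption: no warmup
    """Learning rate under a linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at each epoch boundary (8228 steps per epoch in the table).
for epoch in range(5):
    print(epoch, linear_lr(epoch * 8228))
```

Under these assumptions the rate is half its initial value at the end of epoch 2 and reaches zero at step 32912.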
 ### Training results
 | Training Loss | Epoch | Step  | Validation Loss |
 |:-------------:|:-----:|:-----:|:---------------:|
+| 2.9881        | 0.25  | 2057  | 2.7543          |
+| 2.803         | 0.5   | 4114  | 2.6652          |
+| 2.7298        | 0.75  | 6171  | 2.6124          |
+| 2.687         | 1.0   | 8228  | 2.5772          |
+| 2.6374        | 1.25  | 10285 | 2.5535          |
+| 2.6161        | 1.5   | 12342 | 2.5332          |
+| 2.598         | 1.75  | 14399 | 2.5171          |
+| 2.5773        | 2.0   | 16456 | 2.5050          |
+| 2.5578        | 2.25  | 18513 | 2.4943          |
+| 2.5468        | 2.5   | 20570 | 2.4868          |
+| 2.5385        | 2.75  | 22627 | 2.4783          |
+| 2.5322        | 3.0   | 24684 | 2.4712          |
+| 2.5182        | 3.25  | 26741 | 2.4697          |
+| 2.5188        | 3.5   | 28798 | 2.4657          |
+| 2.513         | 3.75  | 30855 | 2.4630          |
+| 2.5123        | 4.0   | 32912 | 2.4628          |
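The validation losses can be easier to compare as perplexities. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model fine-tuning but is not stated in the card. It converts the final loss from each side of the diff (2.4527 before, 2.4628 after).

```python
import math


def perplexity(loss: float) -> float:
    """Perplexity for a mean per-token cross-entropy loss given in nats."""
    return math.exp(loss)


old_ppl = perplexity(2.4527)  # final validation loss before this commit
new_ppl = perplexity(2.4628)  # final validation loss after this commit
print(f"old: {old_ppl:.2f}  new: {new_ppl:.2f}")
```

Because the two runs differ in batch size, epoch count, and framework versions, the small gap between them should not be attributed to any single change.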
 ### Framework versions
 - Transformers 4.34.1
+- Pytorch 1.12.1+cu113
+- Datasets 2.8.0
 - Tokenizers 0.14.1

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b7f7eaa0f1cb9e0b49af8e98305932eaa8ec64a3f6fb352f19630c69cc59dc48
-size 327674773

 version https://git-lfs.github.com/spec/v1
+oid sha256:957be5179bae17161922d368beb149dc5e8ce6185b3e8e8f01ba72740420e4ed
+size 327673729
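The pointer above records only a hash and a byte count, so a downloaded `pytorch_model.bin` can be checked against it locally. The following is a minimal sketch using only the Python standard library; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the parser assumes the simple `key value` line layout shown in this diff.

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so a ~300 MB checkpoint is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:957be5179bae17161922d368beb149dc5e8ce6185b3e8e8f01ba72740420e4ed
size 327673729
"""
pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

A call such as `matches_pointer("pytorch_model.bin", pointer)` would then confirm that a local copy corresponds to the new revision; the file path here is illustrative.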

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ffa3efa4a3ea05d37a385efcc21b4223741370e32e37732d42a827b83384135d
-size 4027

 version https://git-lfs.github.com/spec/v1
+oid sha256:71ce6e17bf669e999658e1f754699bdf891652eed22ed70d9c9811a3561f7546
+size 4015