---
license: apache-2.0
tags:
- generated_from_trainer
base_model: distilgpt2
model-index:
- name: StatementOfWork_Generator_Omega2
  results: []
---

# StatementOfWork_Generator_Omega2

This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9436

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 50
- eval_batch_size: 50
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 15   | 0.9674          |
| No log        | 2.0   | 30   | 0.9673          |
| No log        | 3.0   | 45   | 0.9633          |
| No log        | 4.0   | 60   | 0.9629          |
| No log        | 5.0   | 75   | 0.9633          |
| No log        | 6.0   | 90   | 0.9634          |
| No log        | 7.0   | 105  | 0.9635          |
| No log        | 8.0   | 120  | 0.9603          |
| No log        | 9.0   | 135  | 0.9550          |
| No log        | 10.0  | 150  | 0.9583          |
| No log        | 11.0  | 165  | 0.9574          |
| No log        | 12.0  | 180  | 0.9544          |
| No log        | 13.0  | 195  | 0.9540          |
| No log        | 14.0  | 210  | 0.9575          |
| No log        | 15.0  | 225  | 0.9530          |
| No log        | 16.0  | 240  | 0.9519          |
| No log        | 17.0  | 255  | 0.9514          |
| No log        | 18.0  | 270  | 0.9534          |
| No log        | 19.0  | 285  | 0.9498          |
| No log        | 20.0  | 300  | 0.9554          |
| No log        | 21.0  | 315  | 0.9474          |
| No log        | 22.0  | 330  | 0.9539          |
| No log        | 23.0  | 345  | 0.9470          |
| No log        | 24.0  | 360  | 0.9491          |
| No log        | 25.0  | 375  | 0.9478          |
| No log        | 26.0  | 390  | 0.9454          |
| No log        | 27.0  | 405  | 0.9472          |
| No log        | 28.0  | 420  | 0.9481          |
| No log        | 29.0  | 435  | 0.9467          |
| No log        | 30.0  | 450  | 0.9473          |
| No log        | 31.0  | 465  | 0.9478          |
| No log        | 32.0  | 480  | 0.9439          |
| No log        | 33.0  | 495  | 0.9453          |
| 0.2954        | 34.0  | 510  | 0.9446          |
| 0.2954        | 35.0  | 525  | 0.9453          |
| 0.2954        | 36.0  | 540  | 0.9452          |
| 0.2954        | 37.0  | 555  | 0.9442          |
| 0.2954        | 38.0  | 570  | 0.9459          |
| 0.2954        | 39.0  | 585  | 0.9442          |
| 0.2954        | 40.0  | 600  | 0.9443          |
| 0.2954        | 41.0  | 615  | 0.9445          |
| 0.2954        | 42.0  | 630  | 0.9442          |
| 0.2954        | 43.0  | 645  | 0.9441          |
| 0.2954        | 44.0  | 660  | 0.9453          |
| 0.2954        | 45.0  | 675  | 0.9447          |
| 0.2954        | 46.0  | 690  | 0.9441          |
| 0.2954        | 47.0  | 705  | 0.9438          |
| 0.2954        | 48.0  | 720  | 0.9438          |
| 0.2954        | 49.0  | 735  | 0.9437          |
| 0.2954        | 50.0  | 750  | 0.9436          |

### Framework versions

- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
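
The hyperparameters and step counts reported above support a few quick sanity checks. The following is a minimal sketch, not the training code: the helper names (`linear_lr`, `dataset_size_bounds`, `perplexity`) are hypothetical, and it assumes zero warmup steps, a single device, no gradient accumulation, a kept final partial batch, and that the reported loss is a mean per-token cross-entropy in nats. None of these assumptions is stated in the card.

```python
import math

# Values copied from the card.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 50
STEPS_PER_EPOCH = 15      # step count after epoch 1.0
TOTAL_STEPS = 750         # step count after epoch 50.0
FINAL_EVAL_LOSS = 0.9436


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS):
    """Learning rate under a linear decay to zero.

    ASSUMPTION: no warmup steps (the card lists none).
    """
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / total_steps)


def dataset_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest training-set size consistent with the step count.

    ASSUMPTION: one device, no gradient accumulation, last partial batch kept,
    so steps_per_epoch == ceil(num_examples / batch_size).
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


def perplexity(loss):
    """Perplexity implied by a mean cross-entropy loss.

    ASSUMPTION: the loss is averaged per token and measured in nats.
    """
    return math.exp(loss)


if __name__ == "__main__":
    for step in (0, 375, 750):
        print(f"step {step:>3}: lr = {linear_lr(step):.2e}")
    print("training examples between", dataset_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE))
    print(f"implied eval perplexity: {perplexity(FINAL_EVAL_LOSS):.3f}")
```

Under these assumptions, 15 steps per epoch at a batch size of 50 would correspond to a training set of 701 to 750 examples, and the learning rate would reach zero at step 750. If training used warmup, multiple devices, or gradient accumulation, these figures would not hold.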