pythia-160m-deduped-aid
Model description
This model is a finetune of EleutherAI/pythia-160m-deduped (named pythia-125m-deduped at the time of training) on the text_adventures.txt dataset originally intended for AI Dungeon 2. Expect very poor performance, as is typical for a model this small. Generations may also be offensive because of the training data.
This model was trained for testing purposes as the successor to Merry/AID-Neo-125M and was intended for use with KoboldAI. It was tested with a temperature of 0.5 and a repetition penalty of 1.05.
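For use outside KoboldAI, the tested sampling settings can be passed to the Transformers `generate` method. The following is a minimal sketch, not the author's own script: the repository id in `MODEL_ID` is a hypothetical placeholder (this card does not state the full repo path), and `max_new_tokens` is an assumed value chosen only for illustration.

```python
# Hypothetical repo id: replace with the actual namespace/name of this model.
MODEL_ID = "your-namespace/pythia-160m-deduped-aid"

# Sampling settings. Temperature and repetition penalty are the values
# reported as tested in this card; max_new_tokens is an assumption.
SAMPLING_KWARGS = {
    "do_sample": True,           # temperature only has an effect when sampling
    "temperature": 0.5,
    "repetition_penalty": 1.05,
    "max_new_tokens": 60,        # assumed, not from the card
}


def generate(prompt: str, model_id: str = MODEL_ID) -> str:
    """Load the model and continue `prompt` with the tested settings."""
    # Imported here so the settings above can be reused without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        pad_token_id=tokenizer.eos_token_id,  # avoids the missing-pad-token warning
        **SAMPLING_KWARGS,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("You are a knight standing at the gates of a ruined castle.")` would download the weights on first use and return the prompt followed by the sampled continuation.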
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
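To illustrate what the linear scheduler does to the learning rate listed above, here is a small standalone sketch. It assumes no warmup (none is listed in this card) and uses a hypothetical total step count, since the card reports neither the dataset size nor the number of optimizer steps.

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 5e-05, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under a linear warmup-then-decay schedule.

    base_lr matches the card; warmup_steps=0 is an assumption.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * (step / max(1, warmup_steps))
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Hypothetical run length, for illustration only.
TOTAL_STEPS = 1000
for step in (0, 250, 500, 750, 1000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at 5e-05, is halved at the midpoint of training, and reaches zero at the final step.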
Training results
Framework versions
- Transformers 4.26.0.dev0
- Pytorch 1.13.1+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2