Here I provide a completely untrained, from-scratch GPT-2 model: the 124M parameter version.
All of its weights have been randomized and then saved, wiping out all previous training.
It was then trained for 50 epochs on the original book "Peter Pan", just so I could get the model and tokenizer files to upload to Hugging Face.
As an interesting side note, it is surprisingly almost coherent if you test it with the example text in the widget to the right and press "Compute".
What is this and how is it different? This is different from simply downloading a fresh 'gpt2', because all of the pre-training has been wiped out (except for the 50 epochs mentioned above).
WHY?! This lets you train the model from scratch, so its full capacity goes toward your specific use case instead of the original pre-training (see the training sketch after the usage example below).
You can see more examples on the original GPT-2 model card page at https://huggingface.co/gpt2
Example usage:
Requirements: pip install transformers torch
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Substitute 'your_model_name' with the name of your model
model_name_or_path = 'your_model_name'

# Load the tokenizer
tokenizer = GPT2Tokenizer.from_pretrained(model_name_or_path)

# Load the model
model = GPT2LMHeadModel.from_pretrained(model_name_or_path)

# Model input
input_text = "Hello, how are you?"

# Encode input text
input_ids = tokenizer.encode(input_text, return_tensors='pt')

# Generate output (do_sample=True so the temperature setting actually takes effect)
output = model.generate(input_ids, max_length=50, num_return_sequences=1,
                        do_sample=True, temperature=0.7,
                        pad_token_id=tokenizer.eos_token_id)

# Decode output
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
```
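Training from scratch: the sketch below shows one possible way to train this blank model on your own text with the Hugging Face `Trainer`. It is a minimal example, not the exact script used for the 50 "Peter Pan" epochs. It additionally assumes the `datasets` library (`pip install datasets`), and `your_corpus.txt` is a placeholder for your own plain-text file; adjust epochs, batch size, and block length to suit your data and hardware.

```python
# Minimal from-scratch training sketch (assumptions: a plain-text file
# 'your_corpus.txt' and the 'datasets' library; hyperparameters are examples only).
from datasets import load_dataset
from transformers import (
    GPT2LMHeadModel,
    GPT2Tokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name_or_path = 'your_model_name'  # substitute your model name
tokenizer = GPT2Tokenizer.from_pretrained(model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained(model_name_or_path)

# Load a plain-text corpus and tokenize it
dataset = load_dataset('text', data_files={'train': 'your_corpus.txt'})

def tokenize(batch):
    return tokenizer(batch['text'], truncation=True, max_length=512)

tokenized = dataset['train'].map(tokenize, batched=True, remove_columns=['text'])
tokenized = tokenized.filter(lambda ex: len(ex['input_ids']) > 0)  # drop empty lines

# mlm=False -> causal language modeling, the GPT-2 training objective
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir='gpt2-from-scratch',
    num_train_epochs=3,
    per_device_train_batch_size=4,
    save_steps=500,
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    data_collator=collator,
)

trainer.train()
trainer.save_model('gpt2-from-scratch')
tokenizer.save_pretrained('gpt2-from-scratch')
```

After training, the saved directory can be loaded with `from_pretrained('gpt2-from-scratch')` exactly as in the usage example above.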
License: Apache 2.0. The Apache 2.0 license allows developers to alter, copy, or update the source code of existing software, and to distribute any copies or modifications they make.
COMMERCIAL USE: YES
PERSONAL USE: YES
EDUCATIONAL USE: YES
Enjoy!