GPT-2

GPT-2 is a transformer-based language model pretrained with a causal language modeling (CLM) objective: it learns to predict each token from the tokens that precede it. This model was trained on the Vietnamese Wikilingua dataset and is used to generate Vietnamese text.
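To make the CLM objective concrete, here is a minimal sketch in plain Python of the quantity such a model is trained to minimize: the average negative log-likelihood of each token given the tokens before it. The function name `causal_lm_loss`, the toy token ids, and the toy probabilities are all assumptions made up for illustration; they are not taken from this model or from the transformers library.

```python
import math

def causal_lm_loss(token_ids, next_token_probs):
    """Average negative log-likelihood of each token given its prefix.

    token_ids: list of int token ids (a hypothetical toy sequence).
    next_token_probs: next_token_probs[t] maps a token id to the probability
        a (hypothetical) model assigns to it as the token at position t + 1.
    """
    # Position t predicts token t + 1, so the targets are the inputs shifted by one.
    targets = token_ids[1:]
    losses = [-math.log(next_token_probs[t][target])
              for t, target in enumerate(targets)]
    return sum(losses) / len(losses)

# Toy data: a 3-token sequence and made-up next-token distributions.
toy_ids = [5, 2, 7]
toy_probs = [{2: 0.5, 7: 0.5}, {7: 0.25, 5: 0.75}]
loss = causal_lm_loss(toy_ids, toy_probs)
print(loss)
```

The lower this value, the more probability the model places on the tokens that actually follow. A real training loop computes the same quantity from the model's logits over the whole vocabulary.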

How to use the model

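from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = GPT2Tokenizer.from_pretrained('minhtoan/vietnamese-gpt2-finetune')
model = GPT2LMHeadModel.from_pretrained('minhtoan/vietnamese-gpt2-finetune')

# Vietnamese prompt: "Not all healthy ingredients are expensive."
text = "Không phải tất cả các nguyên liệu lành mạnh đều đắt đỏ."
input_ids = tokenizer.encode(text, return_tensors='pt')
max_length = 100  # total length in tokens, including the prompt

# Sample three continuations. Setting min_length equal to max_length
# forces every sequence to be exactly max_length tokens long.
sample_outputs = model.generate(
    input_ids,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token, so reuse EOS
    do_sample=True,
    max_length=max_length,
    min_length=max_length,
    num_return_sequences=3,
)

for i, sample_output in enumerate(sample_outputs):
    decoded = tokenizer.decode(sample_output.tolist(), skip_special_tokens=True)
    print(f">> Generated text {i + 1}\n\n{decoded}")
    print('\n---')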
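The `do_sample=True` argument in the call to `model.generate` means each next token is drawn at random from the model's predicted distribution instead of always taking the most likely token, which is why the three returned sequences differ. The following is a rough sketch of that idea in plain Python, assuming a made-up three-token vocabulary; `softmax`, `sample_next_token`, and `toy_logits` are illustrative names, and this is not the transformers implementation.

```python
import math
import random

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(logits, rng):
    """Draw one token id at random, weighted by softmax(logits)."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

rng = random.Random(0)          # fixed seed so repeated runs draw the same tokens
toy_logits = [2.0, 1.0, 0.1]    # hypothetical scores for a 3-token vocabulary
draws = [sample_next_token(toy_logits, rng) for _ in range(1000)]
counts = [draws.count(i) for i in range(3)]
print(counts)
```

Tokens with higher scores are drawn more often, but lower-scoring tokens still appear occasionally, which gives sampled text its variety. Greedy decoding would instead pick index 0 every time.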

Author

Phan Minh Toan
