
Overview

This is a small GPT2 model trained on the MC4 Sinhala dataset. Since Sinhala is a low-resource language, only a handful of models have been trained for it, so this model is a useful starting point for fine-tuning on downstream tasks.

Model Specification

The model chosen for training is GPT2 with the following specifications:

  1. vocab_size=50257
  2. n_embd=768
  3. n_head=12
  4. n_layer=12
  5. n_positions=1024
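These values match the default GPT2 "small" architecture in the `transformers` library. As a sketch (not the author's training script), an equivalent untrained model can be instantiated from this configuration like so:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Build a config with the specification listed above.
config = GPT2Config(
    vocab_size=50257,
    n_embd=768,
    n_head=12,
    n_layer=12,
    n_positions=1024,
)

# Randomly initialised model with this architecture (no pretrained weights).
model = GPT2LMHeadModel(config)
print(f"{model.num_parameters():,} parameters")
```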

How to Use

You can use this model directly with a pipeline for causal language modeling:

```python
from transformers import pipeline

# Load the text-generation pipeline with this model from the Hugging Face Hub.
generator = pipeline('text-generation', model='keshan/sinhala-gpt2')

# Generate five continuations of the Sinhala prompt "මම" ("I").
generator("මම", max_length=50, num_return_sequences=5)
```
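If you need more control than the pipeline offers (e.g. over sampling), you can load the tokenizer and model directly. This is a minimal sketch using the standard `transformers` auto classes, not an official usage recipe:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Download the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained('keshan/sinhala-gpt2')
model = AutoModelForCausalLM.from_pretrained('keshan/sinhala-gpt2')

# Encode the Sinhala prompt "මම" ("I") and sample five continuations.
inputs = tokenizer('මම', return_tensors='pt')
outputs = model.generate(
    **inputs,
    max_length=50,
    do_sample=True,
    num_return_sequences=5,
    pad_token_id=tokenizer.eos_token_id,
)
for sequence in outputs:
    print(tokenizer.decode(sequence, skip_special_tokens=True))
```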
