
tiny-lm-chat

This repository provides a tiny 16M-parameter language model for debugging and testing purposes. It was created by fine-tuning sbintuitions/tiny-lm on the oasst1 datasets in Japanese and English.

How to use

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the chat-tuned model and its tokenizer, then wrap them in a text-generation pipeline.
model = AutoModelForCausalLM.from_pretrained("sbintuitions/tiny-lm-chat", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("sbintuitions/tiny-lm-chat", use_fast=False)
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Format the user message with the chat template and generate a response.
prompt = tokenizer.apply_chat_template([{"role": "user", "content": "Hello!"}], add_generation_prompt=True, tokenize=False)
print(generator(prompt, max_length=30, do_sample=True, top_k=100))
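
The pipeline call wraps tokenization and generation. If you prefer to call generate directly, a minimal sketch of the equivalent steps, reusing model, tokenizer, and prompt from the snippet above (the generation settings simply mirror that example and are illustrative):

# Tokenize the templated prompt and generate with the same sampling settings as above.
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=30, do_sample=True, top_k=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))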

Model architecture

A 4-layer, 512-hidden-size transformer-based language model.
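
These values can be checked programmatically from the model configuration; a minimal sketch (the attribute names num_hidden_layers and hidden_size are assumptions for a typical decoder-only config and may differ by architecture):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("sbintuitions/tiny-lm-chat")
# Attribute names below are assumed; inspect `config` to confirm for this architecture.
print(config.num_hidden_layers)  # expected: 4
print(config.hidden_size)        # expected: 512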

Training

The model was first pre-trained on English and Japanese Wikipedia for 25B tokens with a traditional language modelling objective, and then fine-tuned on the oasst1 datasets in Japanese and English for 15 epochs.
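
As an illustration of how an oasst1-style exchange can be rendered into training text with this tokenizer's chat template, a minimal sketch (the example conversation is hypothetical and the fine-tuning code itself is not part of this repository):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sbintuitions/tiny-lm-chat", use_fast=False)

# Hypothetical oasst1-style exchange; the actual training data comes from the oasst1 datasets.
conversation = [
    {"role": "user", "content": "What is the capital of Japan?"},
    {"role": "assistant", "content": "The capital of Japan is Tokyo."},
]

# Render the conversation with the chat template and tokenize it as one training example.
text = tokenizer.apply_chat_template(conversation, tokenize=False)
input_ids = tokenizer(text).input_ids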

License

MIT License
