metadata

license: apache-2.0
datasets:
  - allenai/dolma

OLMo-Bitnet-1B

OLMo-Bitnet-1B is a 1B parameter model trained using the method described in The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits. The result of this is that all of the parameter weights take only the values -1, 0, or 1.

It was trained on a 60B subset of the Dolma dataset, so it is merely a research proof-of-concept to test out the methodolgy.

A separate training run was run with the exact same hyperparameters, but using standard fp16 weights. The comparison can be found in this wandb report.

Sample inference code

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/OLMo-Bitnet-1B")
model = AutoModelForCausalLM.from_pretrained("NousResearch/OLMo-Bitnet-1B",
    torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")

streamer = TextStreamer(tokenizer)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, pad_token_id=tokenizer.eos_token_id,
    temperature=0.8, repetition_penalty=1.1, do_sample=True,streamer=streamer)
pipe("The capitol of Paris is",  max_new_tokens=256)

Training was performed using OLMo.