[Model Information]
This is a fine-tuned version of the Apple/OpenELM model series, created in the hope of testing the limitations of the OpenELM architecture. It may also have had something to do with Apple's instruct models not providing the instruction format.
This language model was trained on an estimated total of 61k lines of data, without a system prompt.
[Model Usage]
The tokenizer is included in this repo, so you can use this model like any other model.
The model can currently handle up to 2048 tokens (the maximum number of position embeddings). This is the default limit imposed by Apple during the original training process.
I did not use any moderation filtering during training, so this model might generate unwanted surprises.
Please be aware that this model was trained on a tiny fraction of the data that other models are trained on. As of 2024/5/12, you may consider it a research model (this might change if I feel like continuing to improve it).
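The 2048-token limit above covers the prompt and the generated text together, so a long prompt leaves less room for the answer. The following is a minimal sketch of that arithmetic, not part of the model's own code: the helper name `clamp_new_tokens` and the constant `CONTEXT_LIMIT` are hypothetical and introduced here only for illustration.

```python
# Hypothetical helper: keeps prompt + generated tokens within the context limit.
CONTEXT_LIMIT = 2048  # limit stated in this model card


def clamp_new_tokens(prompt_tokens: int,
                     requested_new_tokens: int,
                     context_limit: int = CONTEXT_LIMIT) -> int:
    """Return how many new tokens can be generated without exceeding the limit."""
    if prompt_tokens >= context_limit:
        # The prompt alone already fills (or overflows) the context window.
        raise ValueError("prompt is too long for the context limit")
    remaining = context_limit - prompt_tokens
    return min(requested_new_tokens, remaining)
```

For example, the number of prompt tokens can be read from `inputs['input_ids'].shape[-1]` after tokenizing, and the result passed to `generate` as `max_new_tokens`.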
[How to utilize the model to its full capacity]
- First, install the basic dependencies by running this command:
pip install -U transformers torch torchvision torchaudio accelerate
- Second, run the following code, and remember to replace the example question (What can you do.) with your own.
from accelerate import Accelerator
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Accelerator picks the available device (GPU if present, otherwise CPU).
accelerator = Accelerator()

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path="VINUK/OpenELM_Instruct_272M_V1.0", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path="VINUK/OpenELM_Instruct_272M_V1.0", trust_remote_code=True, torch_dtype=torch.bfloat16)
model = accelerator.prepare_model(model=model, evaluation_mode=True)

with torch.no_grad():
    # The prompt wraps the user question between the [|=U=|] and [|=M=|] markers.
    inputs = tokenizer(text="[|=U=|]\nWhat can you do.\n[|=M=|]\n", return_tensors='pt').to(accelerator.device)
    response = model.generate(
        inputs=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_new_tokens=1024,
        min_new_tokens=10,
        do_sample=True,
        top_p=0.95,
        top_k=50,
        temperature=0.6,
        repetition_penalty=1.0,
        use_cache=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens, skipping the prompt.
    decoded = tokenizer.decode(response[:, inputs['input_ids'].shape[-1]:][0], skip_special_tokens=True)
    # Turn any literal "\n" sequences in the output into real line breaks.
    print(decoded.replace('\\n', '\n'))
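The prompt in the code above follows a fixed single-turn layout: the `[|=U=|]` marker, the question, then the `[|=M=|]` marker, each followed by a newline. A small helper can keep that layout in one place when swapping in your own question. This is only a sketch: the names `build_prompt`, `USER_TAG` and `MODEL_TAG` are hypothetical, and whitespace trimming of the question is an assumption rather than something the model requires.

```python
# Hypothetical helper that reproduces the single-turn prompt layout shown above.
USER_TAG = "[|=U=|]"   # marker that precedes the user question
MODEL_TAG = "[|=M=|]"  # marker that precedes the model's answer


def build_prompt(question: str) -> str:
    """Wrap a question in the user/model markers used by the example code."""
    # Assumption: surrounding whitespace in the question is not meaningful.
    return f"{USER_TAG}\n{question.strip()}\n{MODEL_TAG}\n"
```

The returned string can be passed as the `text` argument of the tokenizer call in place of the hard-coded prompt.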