---
widget:
  - messages:
      - role: system
        content: >-
          You are a career counselor. The user will provide you with an
          individual looking for guidance in their professional life, and your
          task is to assist them in determining what careers they are most
          suited for based on their skills, interests, and experience. You
          should also conduct research into the various options available,
          explain the job market trends in different industries, and advise on
          which qualifications would be beneficial for pursuing particular
          fields.
      - role: user
        content: Hey friend!
      - role: assistant
        content: Hi! How may I help you?
      - role: user
        content: >-
          I am interested in developing a career in software engineering. What
          would you recommend me to do?
  - messages:
      - role: system
        content: You are a knowledgeable assistant. Help the user as much as you can.
      - role: user
        content: How to become smarter?
  - messages:
      - role: system
        content: You are a helpful assistant who provides concise responses.
      - role: user
        content: Hi!
      - role: assistant
        content: Hello there! How may I help you?
      - role: user
        content: I need to cook a simple dinner. What ingredients should I prepare?
  - messages:
      - role: system
        content: >-
          You are a very creative assistant. User will give you a task, which
          you should complete with all your knowledge.
      - role: user
        content: >-
          Write the story of an RPG game about a group of survivors in a
          post-apocalyptic world.
inference:
  parameters:
    max_new_tokens: 256
    temperature: 0.6
    top_p: 0.95
    top_k: 50
    repetition_penalty: 1.2
base_model:
  - TinyLlama/TinyLlama-1.1B-Chat-v1.0
license: apache-2.0
language:
  - en
pipeline_tag: text-generation
datasets:
  - Locutusque/Hercules-v3.0
  - Locutusque/hyperion-v2.0
  - argilla/OpenHermes2.5-dpo-binarized-alpha
---
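
The `widget` examples in the metadata above use the standard chat `messages` format (a list of `role`/`content` dicts). As a minimal sketch, one of those multi-turn conversations can be rendered into a prompt with the tokenizer's chat template; the repo id is the one used in the Usage section below, and the generation step is the same as shown there.

```python
from transformers import AutoTokenizer

model = "frankenmerger/MiniLlama-1.8b-Chat-v0.1"  # repo id from the Usage section below
tokenizer = AutoTokenizer.from_pretrained(model)

# Third widget conversation: a system prompt plus an ongoing user/assistant exchange.
messages = [
    {"role": "system", "content": "You are a helpful assistant who provides concise responses."},
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "Hello there! How may I help you?"},
    {"role": "user", "content": "I need to cook a simple dinner. What ingredients should I prepare?"},
]

# Render the conversation with the model's chat template and append the assistant header,
# then generate from `prompt` as shown in the Usage section below.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```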

## 💻 Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "frankenmerger/MiniLlama-1.8b-Chat-v0.1"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build the prompt with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in float16 and place it on the available device(s).
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a response.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
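
The hosted inference widget uses the sampling settings listed under `inference.parameters` in the metadata above (max_new_tokens 256, temperature 0.6, top_p 0.95, top_k 50, repetition_penalty 1.2), which differ slightly from the call in the snippet. A minimal sketch of reusing those exact settings, assuming the `pipeline` and `prompt` objects from the snippet above:

```python
# Reuse the sampling settings recommended in the model card metadata.
outputs = pipeline(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
    top_k=50,
    repetition_penalty=1.2,
)
print(outputs[0]["generated_text"])
```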