language:
  - en
library_name: transformers
pipeline_tag: question-answering
tags:
  - Finetuning

Model Card for vishanoberoi/Llama-2-7b-chat-hf-finedtuned-to-GGUF

This model is a fine-tuned version of Llama-2-7b-chat, trained on company-specific question-answer data. It is intended for conversational AI applications that need efficient inference without sacrificing output quality.

Model Details

The model was fine-tuned with QLoRA using the PEFT library. After fine-tuning, the adapters were merged into the base model, and the merged model was then quantized and converted to GGUF.

Model Sources

Uses

This model is intended for direct use in conversational AI, particularly for generating responses grounded in company-specific data. It is suited to customer service bots, FAQ bots, and other applications that require accurate, contextually relevant answers.

Usage notebook

https://colab.research.google.com/drive/1885wYoXeRjVjJzHqL9YXJr5ZjUQOSI-w?authuser=4#scrollTo=TZIoajzYYkrg

Example with ctransformers:

from ctransformers import AutoModelForCausalLM

# Load the quantized GGUF model from the Hugging Face Hub
llm = AutoModelForCausalLM.from_pretrained(
    "vishanoberoi/Llama-2-7b-chat-hf-finedtuned-to-GGUF",
    model_file="finetuned.gguf",
    model_type="llama",
    gpu_layers=50,        # layers to offload to the GPU; use 0 for CPU-only
    max_new_tokens=2000,  # upper bound on generated tokens
    temperature=0.2,
    top_k=40,
    top_p=0.6,
    context_length=6000,
)

system_prompt = '''<<SYS>>
You are a useful bot
<</SYS>>

'''

user_prompt = "Tell me about your company"

# Combine system prompt with user prompt
full_prompt = f"{system_prompt}\n[INST]{user_prompt}[/INST]"

# Generate the response
response = llm(full_prompt)

# Print the response
print(response)
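
The example above places the system block before the [INST] tag. The base Llama 2 chat models were trained with a slightly different layout, in which the <<SYS>> block sits inside the first [INST] turn. This card does not state which layout was used during fine-tuning, so the following is only a sketch of the base-model template to try if responses seem to ignore the system prompt. The helper name build_llama2_prompt is hypothetical and not part of ctransformers.

```python
# Delimiters used by the base Llama 2 chat prompt template.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"


def build_llama2_prompt(user_prompt: str, system_prompt: str = "") -> str:
    """Build a single-turn prompt in the base Llama 2 chat layout.

    Assumption: the fine-tuned model still responds to the base template.
    The <s> beginning-of-sequence token is not added here.
    """
    user_prompt = user_prompt.strip()
    system_prompt = system_prompt.strip()
    if system_prompt:
        # The system block is embedded at the start of the first user turn.
        content = f"{B_SYS}{system_prompt}{E_SYS}{user_prompt}"
    else:
        content = user_prompt
    return f"{B_INST} {content} {E_INST}"


if __name__ == "__main__":
    print(build_llama2_prompt("Tell me about your company", "You are a useful bot"))
```

The returned string can be passed to llm(...) in place of full_prompt. Comparing the outputs of both layouts on a few known questions is a simple way to see which one the fine-tuned model follows better.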