Model Card for Finetuned OPT-350M Chatbot Model

Model Details

Model Description

This is a chat fine-tuned version of facebook/opt-350m, designed to produce chatbot-style responses. The goal of this tuning was to convert a base model into a chat model using instruction fine-tuning.

  • Developed by: Sartaj
  • Finetuned from model: facebook/opt-350m
  • Language(s): English
  • License: apache-2.0
  • Framework: Hugging Face Transformers

Model Sources

Uses

The model can be used to generate basic code, and it can be fine-tuned further to refine code generation.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "sartajbhuvaji/facebook-opt-350m-chat"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

def generate_response(question):
    input_prompt = f"### Question: {question}\n ### Answer:"
    inputs = tokenizer(input_prompt, return_tensors="pt").to(device)

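    # Generate output with beam search. `temperature` is omitted because it
    # only takes effect when sampling is enabled (do_sample=True).
    outputs = model.generate(
        **inputs,  # passes input_ids together with the attention_mask
        max_length=500,  # upper bound on prompt tokens plus generated tokens
        num_beams=5,
        eos_token_id=tokenizer.eos_token_id,
        early_stopping=True,
    )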

    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return generated_text

question = "Write a Python program to add two numbers."
response = generate_response(question)
print(response)

'''
### Question: Write a Python program to add two numbers.
 ### Answer: def add_two_numbers(a, b):
    return a + b 
'''
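
As the sample output shows, the decoded text still contains the prompt, because `generate` returns the input tokens followed by the completion. A minimal sketch of stripping the prompt, assuming the model keeps to the `### Answer:` marker from the training template; `extract_answer` and `ANSWER_MARKER` are hypothetical names, not part of the model repository.

```python
ANSWER_MARKER = "### Answer:"  # marker taken from the prompt template above


def extract_answer(generated_text: str) -> str:
    """Return the text after the first answer marker, or the whole text if the marker is absent."""
    _, marker, answer = generated_text.partition(ANSWER_MARKER)
    return answer.strip() if marker else generated_text.strip()


# Sample text copied from the example output above
sample = (
    "### Question: Write a Python program to add two numbers.\n"
    " ### Answer: def add_two_numbers(a, b):\n"
    "    return a + b "
)
print(extract_answer(sample))
```

In the usage snippet, this could be applied to the return value of `generate_response` before printing.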

Downstream Use

  • Code Generation
  • Fine Tuning

Training Details

Training Data

Training Procedure

  • Full Model Finetune
  • Epochs: 3

Preprocessing

  • Preprocessed the data to follow the template: ### Question: {question}\n ### Answer: {answer} {tokenizer.eos_token}

def formatting_prompts_func(example):
    output_texts = []
    for i in range(len(example['instruction'])):
        text = f"### Question: {example['instruction'][i]}\n ### Answer: {example['output'][i]} {tokenizer.eos_token}"
        output_texts.append(text)
    return output_texts
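
To make the template concrete, here is a small self-contained sketch that applies the same formatting to a toy batch without loading a tokenizer. The `EOS_TOKEN` value `"</s>"` is an assumption standing in for `tokenizer.eos_token`, and `format_batch` and `toy_batch` are hypothetical names used only for illustration.

```python
EOS_TOKEN = "</s>"  # assumption: stand-in for tokenizer.eos_token


def format_batch(batch, eos_token=EOS_TOKEN):
    """Build one training string per (instruction, output) pair using the card's template."""
    return [
        f"### Question: {question}\n ### Answer: {answer} {eos_token}"
        for question, answer in zip(batch["instruction"], batch["output"])
    ]


# Toy batch in the column-oriented layout the formatting function expects
toy_batch = {
    "instruction": ["Write a Python program to add two numbers."],
    "output": ["def add_two_numbers(a, b):\n    return a + b"],
}
texts = format_batch(toy_batch)
print(texts[0])
```

Ending each example with the end-of-sequence token is what lets generation stop after the answer at inference time.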

Training Loss

[Figure: training loss curve]

Trainer

  • global_step: 7509
  • training_loss: 0.9127310856885068
  • train_runtime: 2485.7984
  • train_samples_per_second: 24.164
  • train_steps_per_second: 3.021
  • total_flos: 2.939309944327373e+16
  • train_loss: 0.9127310856885068
  • epoch: 3.0
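
The throughput figures above can be cross-checked with a little arithmetic. The sketch below derives samples per optimizer step, steps per epoch, and approximate samples per epoch from the reported values. Reading the first quantity as the effective batch size is an inference, since the card does not state the batch size, and the runtime is assumed to be in seconds.

```python
# Values copied from the Trainer summary above
global_step = 7509
train_runtime_s = 2485.7984  # assumed to be seconds
samples_per_second = 24.164
steps_per_second = 3.021
epochs = 3.0

# Samples processed per optimizer step (suggests an effective batch size of about 8)
samples_per_step = samples_per_second / steps_per_second

# Optimizer steps in one pass over the data
steps_per_epoch = global_step / epochs

# Approximate number of training examples seen per epoch
samples_per_epoch = samples_per_second * train_runtime_s / epochs

runtime_minutes = train_runtime_s / 60

print(f"samples per step:  {samples_per_step:.2f}")
print(f"steps per epoch:   {steps_per_epoch:.0f}")
print(f"samples per epoch: {samples_per_epoch:.0f}")
print(f"runtime (minutes): {runtime_minutes:.1f}")
```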

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: NVIDIA A100 40 GB
  • Hours used: ~10
  • Cloud Provider: jetstream2
  • Compute Region: USA
  • Carbon Emitted: 2.24 kg
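
For readers who want to reproduce this kind of estimate, the usual approach is to multiply energy use by the carbon intensity of the grid. The sketch below shows that arithmetic and also derives the per-hour rate implied by the reported figures. The power draw and grid intensity in the example call are illustrative placeholders, not values from this card, and `estimate_emissions_kg` is a hypothetical helper rather than the calculator's actual implementation.

```python
def estimate_emissions_kg(power_kw: float, hours: float, intensity_kg_per_kwh: float) -> float:
    """Emissions = energy used (kWh) x grid carbon intensity (kg CO2eq per kWh)."""
    return power_kw * hours * intensity_kg_per_kwh


# Figures reported above
reported_kg = 2.24
reported_hours = 10

# Emission rate implied by the reported figures
kg_per_hour = reported_kg / reported_hours
print(f"implied rate: {kg_per_hour:.3f} kg per GPU-hour")

# Illustrative placeholders only: 0.4 kW draw and 0.5 kg/kWh intensity are assumptions
example_kg = estimate_emissions_kg(power_kw=0.4, hours=10, intensity_kg_per_kwh=0.5)
print(f"example estimate: {example_kg:.2f} kg")
```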
