Jeff man112
metadata
license: apache-2.0
base_model: TinyLlama/TinyLlama-1.1B-Chat-v0.6
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: TinyLlama-v2ray
    results: []
datasets:
  - TheBossLevel123/v2ray
library_name: transformers
widget:
  - text: |-
      <|im_start|>user
      Who are you?<|im_end|>
      <|im_start|>assistant
    example_title: First Example
  - text: |-
      <|im_start|>user
      how much do you goon?<|im_end|>
      <|im_start|>assistant
    example_title: Second Example

TinyLlama-v2ray

This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v0.6 on the TheBossLevel123/v2ray dataset.

Model description

The prompt format is as follows:

<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant

The model is intended to mimic the behavior of v2ray, so results will most likely be nonsensical or gibberish.

Example Usage

import re

import torch
from transformers import AutoTokenizer, pipeline

model_id = "TheBossLevel123/TinyLlama-v2ray"

# Load the tokenizer explicitly and hand it to the pipeline.
# device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipe = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

def formatted_prompt(prompt: str) -> str:
    # Wrap the user message in the prompt format described above.
    return f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant"

def extract_text(text: str) -> str:
    # Capture everything between "v2ray\n" and the next <|im_end|> marker.
    pattern = r'v2ray\n(.*?)(?=<\|im_end\|>)'
    match = re.search(pattern, text, re.DOTALL)
    if match:
        return f"Output: {match.group(1)}"
    return "No match found"

prompt = 'what are your thoughts on ccp'
# Sampling is enabled, so the output differs from run to run.
outputs = pipe(formatted_prompt(prompt), max_new_tokens=50, do_sample=True, temperature=0.9)
if outputs and "generated_text" in outputs[0]:
    text = extract_text(outputs[0]["generated_text"])
    print(f"Prompt: {prompt}")
    print("")
    print(text)
else:
    print("No output or unexpected structure")

# Example output:
# Prompt: what are your thoughts on ccp
#
# Output: <Re: insaneness> you are a ccp
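The extract_text helper in the example above only returns text when the generated string contains both a "v2ray\n" marker and a later <|im_end|> marker. The sketch below runs the same regex on two hand-written strings to show both outcomes. The strings are hypothetical and made up for illustration, since the model's raw output layout is not shown here; the second one imitates a generation cut off by max_new_tokens before <|im_end|> was produced.

```python
import re

# Same pattern as in the usage example.
PATTERN = r'v2ray\n(.*?)(?=<\|im_end\|>)'

def extract_text(text: str) -> str:
    match = re.search(PATTERN, text, re.DOTALL)
    if match:
        return f"Output: {match.group(1)}"
    return "No match found"

# Hypothetical raw strings (assumed layout, not real model output).
complete = (
    "<|im_start|>user\nWho are you?<|im_end|>\n"
    "<|im_start|>v2ray\nhello there<|im_end|>"
)
truncated = (
    "<|im_start|>user\nWho are you?<|im_end|>\n"
    "<|im_start|>v2ray\nhello th"
)

print(extract_text(complete))   # reply is terminated by <|im_end|>
print(extract_text(truncated))  # no closing marker after "v2ray\n"
```

If "No match found" shows up often in practice, raising max_new_tokens or printing the raw generated_text is a reasonable first check.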

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.002
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • training_steps: 1000
  • mixed_precision_training: Native AMP
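As a rough sanity check on the values above, the following sketch recomputes the total batch size and traces a plain cosine learning-rate curve. It assumes a single training device and no warmup, neither of which is stated in the list, and the cosine_lr function is a hypothetical illustration rather than the scheduler code actually used.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 0.002
train_batch_size = 1
gradient_accumulation_steps = 32
training_steps = 1000

# Assumption: one device (not stated in the list).
num_devices = 1

# Each optimizer step accumulates gradients over this many sequences.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
# Upper bound on sequences processed over the whole run.
sequences_seen = total_train_batch_size * training_steps

def cosine_lr(step: int, base_lr: float = learning_rate,
              total_steps: int = training_steps) -> float:
    """Cosine decay from base_lr to 0, assuming no warmup phase."""
    progress = step / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(total_train_batch_size, sequences_seen)
for step in (0, 250, 500, 750, 1000):
    print(step, cosine_lr(step))
```

Under these assumptions the computed total batch size agrees with the listed total_train_batch_size of 32, and the learning rate falls to half its initial value at the midpoint of training.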

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.16.0
  • Tokenizers 0.15.0