---
language:
- tr
library_name: transformers
datasets:
- merve/turkish_instructions
pipeline_tag: text-generation
---
# Model Card for ardaorcun/finetuned_cosmos2603

This model is a fine-tuned version of YTU's Cosmos GPT-2 language model. The fine-tuning code is available here: Fine Tuning Cosmos by LoRA and QLoRA.
## Training Details

The model was fine-tuned using LoRA and QLoRA. The training parameters are listed below.
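With QLoRA, the base model is loaded in 4-bit precision before the LoRA adapters are attached. Below is a minimal sketch of such a load, assuming the `ytu-ce-cosmos/turkish-gpt2-large` base and standard `bitsandbytes` NF4 settings; the exact values used for this model are in the linked notebook.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed QLoRA-style 4-bit quantization config (NF4 + double quantization).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "ytu-ce-cosmos/turkish-gpt2-large",  # assumption: the Cosmos GPT-2 base used here
    quantization_config=bnb_config,
    device_map="auto",
)
```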
LoRA configs:
- r=16
- lora_alpha=32
- target_modules=["c_proj", "c_fc", "gate_proj", "c_attn"]
- lora_dropout=0.05
- bias="lora_only"
- fan_in_fan_out=True
- max_seq_length=512
- use_rslora=True
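In `peft`, this corresponds roughly to the following `LoraConfig` (a sketch; note that `max_seq_length` is not a `LoraConfig` field and is instead passed to the trainer, e.g. TRL's `SFTTrainer`):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["c_proj", "c_fc", "gate_proj", "c_attn"],
    lora_dropout=0.05,
    bias="lora_only",
    fan_in_fan_out=True,  # GPT-2 uses Conv1D layers, which store weights transposed
    use_rslora=True,      # rank-stabilized LoRA scaling
    task_type="CAUSAL_LM",
)
```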
Training parameters:
- train_epochs=5
- optim="paged_lion_8bit"
- learning_rate=2e-4
- warmup_ratio=0.03
- max_grad_norm=0.3
- lr_scheduler_type="linear"
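These map onto `transformers.TrainingArguments` roughly as follows (a sketch; `output_dir` is a placeholder, and the `paged_lion_8bit` optimizer requires `bitsandbytes`):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="finetuned_cosmos",  # placeholder output path
    num_train_epochs=5,
    optim="paged_lion_8bit",        # paged 8-bit Lion optimizer from bitsandbytes
    learning_rate=2e-4,
    warmup_ratio=0.03,
    max_grad_norm=0.3,
    lr_scheduler_type="linear",
)
```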
## Training Data

For training, I used Merve's Turkish Instructions dataset, available on the Hub as merve/turkish_instructions.
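The dataset can be loaded directly from the Hub. Printing the column names first is worth doing, since the template below indexes two of them with a leading space:

```python
from datasets import load_dataset

dataset = load_dataset("merve/turkish_instructions")
print(dataset["train"].column_names)
```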
Instruction template:

```python
def format_instruction(sample):
    # System prompt: "You are a helpful language model that loves to answer."
    # Note: " giriş" (input) and " çıktı" (output) are indexed with a leading
    # space, matching the raw column names of the dataset.
    return f"""Sen cevap vermeyi seven yardımcı bir dil modelisin.
### Input:
{sample["talimat"]}
### Context:
{sample[" giriş"]}
### Response:
{sample[" çıktı"]}
"""
```
## Generate Output

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "ardaorcun/finetuned_cosmos2603"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_8bit=True,
)

sampling_params = dict(do_sample=True, temperature=0.3, top_k=50, top_p=0.9)

# device_map="auto" above already places the model, so it is not passed
# to the pipeline again.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    return_full_text=True,
    repetition_penalty=1.1,
)
```
```python
DEFAULT_SYSTEM_PROMPT = "Sen cevap vermeyi seven yardımcı bir dil modelisin.\n"

def format_instruction(sample):
    # Note: unlike the training template, the keys here are written without
    # the leading space; adjust them to match your dataset's column names.
    return f"""{DEFAULT_SYSTEM_PROMPT}
### Input:
{sample["talimat"]}
### Context:
{sample["giriş"]}
### Response:
{sample["çıktı"]}"""
```
## Create Answer

```python
prompt = "your_prompt"
girdi = "your_entry"

instruction = f"""Sen cevap vermeyi seven yardımcı bir dil modelisin.\n### Input:\n{prompt}\n\n### Context:\n{girdi}\n\n### Response:"""

# Reuse the pipeline built above instead of constructing a second one,
# and pass the sampling parameters at call time.
result = pipe(instruction, **sampling_params)

# return_full_text=True, so slice off the prompt to keep only the completion.
print(result[0]["generated_text"][len(instruction):])
```