---
language:
- tr
library_name: transformers
datasets:
- merve/turkish_instructions
pipeline_tag: text-generation
---
|
|
|
# Model Card for ardaorcun/finetuned_cosmos2603
|
|
|
This model is a fine-tuned version of YTU's Cosmos GPT2 language model. The fine-tuning code is available here: <a href="https://github.com/Stealeristaken/Entry-Mid-Level-AI-Projects/blob/main/Fine%20Tuning%20Cosmos%20by%20LoRA%20and%20QLoRA.ipynb">Fine Tuning Cosmos by LoRA and QLoRA</a>
|
|
|
|
|
## Training Details |
|
|
|
The model was fine-tuned using LoRA and QLoRA techniques. Training parameters are defined below. |
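
For the QLoRA side, the base model is typically loaded in 4-bit before the adapters are attached. The snippet below is a rough sketch of that step: the quantization settings and the base checkpoint name are my assumptions for illustration, not values confirmed by this card (see the linked notebook for the exact setup).

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical QLoRA-style 4-bit loading; quantization settings and the
# checkpoint name below are assumptions, not taken from the notebook.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "ytu-ce-cosmos/turkish-gpt2-large",  # assumed Cosmos GPT2 base; see the linked notebook
    quantization_config=bnb_config,
    device_map="auto",
)
```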
|
|
|
### LoRA Configs:
|
|
|
- **r**=16 |
|
- **lora_alpha**=32 |
|
- **target_modules**=c_proj, c_fc, gate_proj, c_attn
|
- **lora_dropout**=0.05 |
|
- **bias**="lora_only" |
|
- **fan_in_fan_out**=True |
|
- **max_seq_length**=512 |
|
- **use_rslora**=True |
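
These values map onto peft's `LoraConfig` roughly as sketched below; `task_type` is an assumption, and `max_seq_length` is not a `LoraConfig` field (it is passed to the trainer, e.g. trl's `SFTTrainer`).

```python
from peft import LoraConfig

# Sketch of the configuration listed above; task_type is assumed, and
# max_seq_length belongs to the trainer rather than LoraConfig.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["c_proj", "c_fc", "gate_proj", "c_attn"],
    lora_dropout=0.05,
    bias="lora_only",
    fan_in_fan_out=True,  # needed for GPT-2's Conv1D layers
    use_rslora=True,
    task_type="CAUSAL_LM",
)
```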
|
|
|
### Training Parameters:
|
- **train_epochs**=5 |
|
- **optim**="paged_lion_8bit" |
|
- **learning_rate**=2e-4 |
|
- **warmup_ratio**=0.03 |
|
- **max_grad_norm**=0.3 |
|
- **lr_scheduler_type**="linear" |
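
These correspond to transformers' `TrainingArguments` roughly as follows; `output_dir` is a placeholder I added, and any argument not listed above is left at its default.

```python
from transformers import TrainingArguments

# Sketch using the reported values; output_dir is a hypothetical placeholder.
training_args = TrainingArguments(
    output_dir="cosmos-lora-out",
    num_train_epochs=5,
    optim="paged_lion_8bit",
    learning_rate=2e-4,
    warmup_ratio=0.03,
    max_grad_norm=0.3,
    lr_scheduler_type="linear",
)
```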
|
|
|
### Training Data |
|
|
|
For training, I used Merve's Turkish Instructions dataset, available here: <a href="https://huggingface.co/datasets/merve/turkish_instructions">Merve's Turkish Instructions Dataset</a>
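
To inspect the data yourself, it loads with the standard `datasets` API (the `train` split is the usual default; verify on the dataset page):

```python
from datasets import load_dataset

dataset = load_dataset("merve/turkish_instructions", split="train")
print(dataset.column_names)  # instruction/input/output style columns, used by the template below
print(dataset[0])
```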
|
|
|
|
|
## Instruction Template:
|
|
|
```python
def format_instruction(sample):
    # Note the leading spaces in " giriş" and " çıktı": they mirror the
    # dataset's raw column names. The first line of the template translates to:
    # "You are a helpful language model that likes to answer."
    return f"""Sen cevap vermeyi seven yardımcı bir dil modelisin.
### Input:
{sample["talimat"]}

### Context:
{sample[" giriş"]}

### Response:
{sample[" çıktı"]}
"""
```
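
As a quick illustration, here is the template applied to a made-up record (the Turkish strings are my own example, not dataset content):

```python
# Hypothetical record; "Türkiye'nin başkenti neresidir?" = "What is the capital of Turkey?"
sample = {
    "talimat": "Türkiye'nin başkenti neresidir?",
    " giriş": "",
    " çıktı": "Türkiye'nin başkenti Ankara'dır.",
}
print(format_instruction(sample))
```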
|
## Generate Output: |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "ardaorcun/finetuned_cosmos2603"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             device_map='auto',
                                             load_in_8bit=True)

sampling_params = dict(do_sample=True, temperature=0.3, top_k=50, top_p=0.9)

pipe = pipeline("text-generation",
                model=model,
                tokenizer=tokenizer,
                device_map="auto",
                max_new_tokens=512,
                return_full_text=True,
                repetition_penalty=1.1,
                **sampling_params)  # forward the sampling settings so they are actually used

DEFAULT_SYSTEM_PROMPT = "Sen cevap vermeyi seven yardımcı bir dil modelisin.\n"

def format_instruction(sample):
    # At inference you build the sample dict yourself, so plain keys work here.
    return f"""{DEFAULT_SYSTEM_PROMPT}
### Input:
{sample["talimat"]}

### Context:
{sample["giriş"]}

### Response:
{sample["çıktı"]}"""
```
|
|
|
## Create an Answer:
|
|
|
```python
prompt = "your_prompt"
girdi = "your_entry"
instruction = f"""Sen cevap vermeyi seven yardımcı bir dil modelisin.\n### Input:\n{prompt}\n\n### Context:\n{girdi}\n\n### Response:"""
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_length=2048)
result = pipe(instruction)
print(result[0]['generated_text'][len(instruction):])  # strip the prompt, print only the answer
```
|
|
|
|
|
|
|
|