---
base_model: meta-llama/Meta-Llama-3-8B-Instruct
library_name: peft
---

# MISHANM/Kashmiri_text_generation_Llama3_8B_instruct

This model is a fine-tune of meta-llama/Meta-Llama-3-8B-Instruct for the Kashmiri language. It answers queries in Kashmiri and translates text between English and Kashmiri, and it aims to give accurate, context-aware responses.



## Model Details
1. Language: Kashmiri
2. Tasks: Question Answering, Translation (English to Kashmiri)
3. Base Model: meta-llama/Meta-Llama-3-8B-Instruct



## Training Details

The model was trained on approximately 49K instruction samples.

1. GPUs: 2 × AMD Instinct™ MI210 accelerators


## Inference with Hugging Face
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer
model_path = "MISHANM/Kashmiri_text_generation_Llama3_8B_instruct"

model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Function to generate text
def generate_text(prompt, max_new_tokens=1000, temperature=0.9):
    # Describe the conversation as a system message plus the user's prompt
    messages = [
        {
            "role": "system",
            "content": "You are a Kashmiri language expert and linguist, with same knowledge give response in Kashmiri language.",
        },
        {"role": "user", "content": prompt}
    ]

    # Build the prompt string by hand with the <|system|>, <|user|>, and <|assistant|> markers
    formatted_prompt = f"<|system|>{messages[0]['content']}<|user|>{messages[1]['content']}<|assistant|>"

    # Tokenize, move the inputs to the model's device, and sample a completion
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs, max_new_tokens=max_new_tokens, temperature=temperature, do_sample=True
    )
    # The decoded string contains the prompt followed by the generated reply
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
prompt = """Give a poem on LLM ."""
generated_text = generate_text(prompt)
print(generated_text)
```
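Because `generate_text` decodes the full output sequence, the returned string starts with the prompt itself. The sketch below is one possible way to build the same prompt string and keep only the model's reply. It is not part of the original card: the helper names `build_prompt` and `extract_reply` are hypothetical, and it assumes the `<|assistant|>` marker survives decoding as ordinary text, which may not hold for every tokenizer configuration.

```python
# Hypothetical helpers; the names and the marker-based split are assumptions.
SYSTEM_PROMPT = (
    "You are a Kashmiri language expert and linguist, "
    "with same knowledge give response in Kashmiri language."
)
ASSISTANT_MARKER = "<|assistant|>"


def build_prompt(user_message: str, system_message: str = SYSTEM_PROMPT) -> str:
    """Assemble the single-turn prompt string used in the inference example."""
    return f"<|system|>{system_message}<|user|>{user_message}{ASSISTANT_MARKER}"


def extract_reply(decoded: str) -> str:
    """Return the text after the last assistant marker, or the whole text if absent."""
    _, marker, reply = decoded.rpartition(ASSISTANT_MARKER)
    return reply.strip() if marker else decoded.strip()


prompt = build_prompt("Give a poem on LLM .")
# Stand-in for tokenizer.decode(...): the prompt followed by a placeholder reply.
fake_decoded = prompt + " (model reply would appear here)"
print(extract_reply(fake_decoded))
```

In practice, `extract_reply` would be applied to the string returned by `generate_text`. If the marker does not appear in the decoded text, the function falls back to returning the text unchanged.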

## Citation Information
```
@misc{MISHANM/Kashmiri_text_generation_Llama3_8B_instruct,
  author = {Mishan Maurya},
  title = {Introducing Fine Tuned LLM for Kashmiri Language},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
}
```


## Framework versions

- PEFT 0.12.0