	---
	license: apache-2.0
	language:
	- fa
	- en
	library_name: transformers
	pipeline_tag: text-generation
	datasets:
	- myrkur/persian-alpaca-deep-clean
	---

	# Shotor (Llama 3 8B Instruction Tuned on Farsi)

	<a href="https://ibb.co/PwCN3VF"><img src="https://i.ibb.co/0hJc8zm/shotor.png" alt="shotor" border="0"></a>


	Shotor is a Persian language model built on Llama 3 8B, a multilingual large language model (LLM). It was fine-tuned with supervised learning, using DoRA (Weight-Decomposed Low-Rank Adaptation) for parameter-efficient fine-tuning. Training used Persian data, in particular the [persian-alpaca-deep-clean](https://huggingface.co/datasets/myrkur/persian-alpaca-deep-clean) dataset.
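To make the DoRA idea concrete, here is a minimal pure-Python sketch of how a DoRA update is merged into a frozen weight matrix: the low-rank product `B A` changes the direction of each column, and a learned magnitude vector `m` sets each column's norm. This is an illustration only, not Shotor's training code. The function names `dora_merge` and `column_norms`, the 2x2 matrix, and the rank-1 adapter values are all assumptions made for the example; the model card does not state the rank or other hyperparameters that were used.

```python
import math


def dora_merge(w0, b, a, magnitude):
    """Merge a DoRA update: W' = m * (W0 + B A) / ||W0 + B A||, normalised per column."""
    rows, cols, rank = len(w0), len(w0[0]), len(a)
    # Direction: frozen weight plus the low-rank update B @ A.
    direction = [
        [w0[i][j] + sum(b[i][k] * a[k][j] for k in range(rank)) for j in range(cols)]
        for i in range(rows)
    ]
    # L2 norm of each column of the direction matrix.
    norms = [math.sqrt(sum(direction[i][j] ** 2 for i in range(rows))) for j in range(cols)]
    # Rescale every column to the learned magnitude m[j].
    return [
        [magnitude[j] * direction[i][j] / norms[j] for j in range(cols)]
        for i in range(rows)
    ]


def column_norms(w):
    """Return the L2 norm of each column of a nested-list matrix."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]


# Toy 2x2 frozen weight with a rank-1 adapter (all values are illustrative).
w0 = [[3.0, 0.0], [4.0, 2.0]]
a = [[1.0, 1.0]]              # shape (rank, cols)
magnitude = column_norms(w0)  # DoRA initialises m to the column norms of W0

# With B = 0 (the usual initialisation) the merged weight equals W0.
at_init = dora_merge(w0, [[0.0], [0.0]], a, magnitude)

# Once B is non-zero, column directions change while column norms stay at m.
updated = dora_merge(w0, [[0.5], [-0.5]], a, magnitude)
```

Only `B`, `A`, and `m` are trained, which is what keeps the method parameter-efficient. In practice this is usually handled by a library rather than written by hand; for example, PEFT's `LoraConfig` accepts `use_dora=True`.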

	## Usage

	The following Python snippet shows how to generate text with Shotor:

	```python
	import transformers
	import torch

	# Load the Shotor model
	model_id = "myrkur/shotor"
	pipeline = transformers.pipeline(
	    "text-generation",
	    model=model_id,
	    model_kwargs={"torch_dtype": torch.bfloat16},
	    device_map="auto",
	)

	# Define user messages
	messages = [
	    {"role": "user", "content": "علم بهتر است یا ثروت؟"},
	]

	# Apply chat template and generate text
	prompt = pipeline.tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)

	# Stop at either the regular EOS token or the end-of-turn token
	terminators = [
	    pipeline.tokenizer.eos_token_id,
	    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
	]

	outputs = pipeline(
	    prompt,
	    max_new_tokens=512,
	    eos_token_id=terminators,
	    do_sample=True,
	    temperature=0.5,
	    top_p=0.9,
	    repetition_penalty=1.1,
	)
	# The pipeline returns prompt + completion; print only the completion
	print(outputs[0]["generated_text"][len(prompt):])
	```
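To see what the chat-template step above produces, the sketch below rebuilds the prompt string by hand. It assumes Shotor keeps the standard Llama 3 Instruct layout; the authoritative template is the one shipped in the model's tokenizer configuration, so treat this as an approximation. `build_llama3_prompt` is a hypothetical helper written for this example, not part of `transformers`.

```python
def build_llama3_prompt(messages):
    """Hypothetical helper: format messages in the Llama 3 Instruct layout (assumed)."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    # Counterpart of add_generation_prompt=True: open the assistant turn.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


prompt = build_llama3_prompt([{"role": "user", "content": "علم بهتر است یا ثروت؟"}])

# The pipeline returns prompt + completion, so slicing by len(prompt)
# leaves only the newly generated text. The reply here is a placeholder.
full_text = prompt + "(model reply)"
completion = full_text[len(prompt):]
```

Because every turn ends with `<|eot_id|>`, the model emits that token when it finishes its reply, which is why the usage snippet passes it as a terminator alongside the regular EOS token.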

	## Contributions

	Contributions to Shotor are welcome! Whether you extend the model's capabilities, improve its results on specific tasks, or evaluate its performance, your contributions can help advance Persian natural language processing.

	## Contact
	For questions or further information, please contact:

	- Amir Masoud Ahmadi: [[email protected]](mailto:[email protected])
	- Sahar Mirzapour: [[email protected]](mailto:[email protected])