---
language:
- sw
- en
---
# PAWA: A Swahili SLM for Various Tasks
---
## Overview
**PAWA** is a Swahili-specialized language model designed for tasks that require nuanced understanding and interaction in Swahili and English. It is built with supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) for improved performance and consistency. Below are the model specifications, installation steps, usage examples, and intended applications.
---
### Model Details
- **Model Name**: Pawa-mini-V0.1
- **Model Type**: PAWA
- **Architecture**:
- 2B Parameter Gemma-2 Base Model
- Enhanced with Swahili SFT and DPO datasets.
- **Languages Supported**:
- Swahili
- English
- Custom tokenizer for multi-language flexibility.
- **Primary Use Cases**:
- Contextually rich Swahili-focused tasks.
- General assistance and chat-based interactions.
- **License**: Custom; contact the author for terms of use.
---
### Installation and Setup
Ensure the necessary libraries are installed and up to date (in a Colab or Jupyter notebook, prefix each command with `!`):
```bash
pip uninstall transformers -y && pip install --upgrade --no-cache-dir "git+https://github.com/huggingface/transformers.git"
pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install datasets
```
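A quick, optional sanity check that the environment is set up (the exact version strings will vary):
```python
# Verify that the freshly installed packages import correctly.
import transformers
import datasets
import unsloth  # importing is enough to confirm the install

print("transformers:", transformers.__version__)
print("datasets:", datasets.__version__)
```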
---
### Model Loading
You can load the model using the following code snippet:
```python
from unsloth import FastLanguageModel
import torch

model_name = "sartifyllc/Pawa-mini-V0.1"
max_seq_length = 2048   # maximum context length in tokens
dtype = None            # None = auto-detect (float16 on T4/V100, bfloat16 on Ampere+)
load_in_4bit = False    # set True to load a 4-bit quantized version

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)
```
---
### Chat Template Configuration
For a seamless conversational experience, configure the tokenizer with the appropriate chat template:
```python
from unsloth.chat_templates import get_chat_template
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
tokenizer = get_chat_template(
    tokenizer,
    chat_template="chatml",  # Supports templates like zephyr, chatml, mistral, etc.
    mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"},  # ShareGPT style
    map_eos_token=True,  # Maps <|im_end|> to </s>
)
```
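To inspect what the template produces, you can render a message as plain text before tokenizing it. This is a small illustrative check (the greeting "Habari!" is just an example prompt):
```python
# Render the chat template as text (no tokenization) to inspect the ChatML format.
preview = tokenizer.apply_chat_template(
    [{"from": "human", "value": "Habari!"}],  # "Habari!" = "Hello!"
    tokenize=False,
    add_generation_prompt=True,
)
print(preview)  # expect <|im_start|>user ... <|im_end|> followed by an assistant header
```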
---
### Usage Example
Generate a short story in Swahili:
```python
from transformers import TextStreamer

# Single-turn chat prompt: "Tengeneza hadithi fupi" = "Write a short story".
messages = [{"from": "human", "value": "Tengeneza hadithi fupi"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Stream generated tokens to stdout as they are produced.
text_streamer = TextStreamer(tokenizer)
_ = model.generate(input_ids=inputs, streamer=text_streamer, max_new_tokens=128, use_cache=True)
```
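If you prefer to capture the output as a string instead of streaming it, one option (a small sketch reusing the same `inputs` as above) is to decode only the newly generated tokens:
```python
# Alternative to streaming: generate, then decode only the new tokens.
outputs = model.generate(input_ids=inputs, max_new_tokens=128, use_cache=True)
generated_text = tokenizer.batch_decode(
    outputs[:, inputs.shape[1]:],  # drop the prompt tokens
    skip_special_tokens=True,
)[0]
print(generated_text)
```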
---
### Training and Fine-Tuning Details
- **Base Model**: Gemma-2-2B
- **Continued Pre-Training**: 3B Swahili tokens
- **Fine-Tuning**: Swahili SFT datasets for improved contextual understanding.
- **Optimization**: Direct Preference Optimization (DPO) for more consistent, preference-aligned responses.
---
### Intended Use Cases
- **General Assistance**:
  Structured, helpful answers to everyday questions in Swahili and English.
- **Interactive Q&A**:
  Multi-turn, chat-based interactions for general-purpose use.
- **RAG (Retrieval-Augmented Generation)**:
  Answering questions grounded in retrieved context (see the sketch below).
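As a rough illustration of the RAG use case above, retrieved context can simply be prepended to the user turn before applying the chat template. The prompt wording, placeholder passage, and question below are assumptions for illustration, not a documented prompt format:
```python
# Minimal RAG-style sketch: place retrieved context inside the user turn.
# `retrieved_passage` and `question` are illustrative placeholders.
retrieved_passage = "<retrieved Swahili or English passage goes here>"
question = "<user question goes here>"

rag_prompt = (
    "Tumia muktadha ufuatao kujibu swali.\n\n"   # "Use the following context to answer the question."
    f"Muktadha:\n{retrieved_passage}\n\n"        # "Context:"
    f"Swali:\n{question}"                        # "Question:"
)

messages = [{"from": "human", "value": rag_prompt}]
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to("cuda")
outputs = model.generate(input_ids=inputs, max_new_tokens=256, use_cache=True)
print(tokenizer.batch_decode(outputs[:, inputs.shape[1]:], skip_special_tokens=True)[0])
```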
---
### Limitations
- **Biases**:
The model may exhibit biases inherent in its fine-tuning datasets.
- **Generalization**:
May struggle with tasks outside the trained domain.
- **Hardware Requirements**:
  - Best performance requires a GPU with sufficient memory (e.g., Tesla V100 or T4).
  - Supports 4-bit quantization for reduced memory usage (see the sketch below).
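A minimal sketch of 4-bit loading, reusing the loading code from above with `load_in_4bit=True` (assumes `bitsandbytes` is available in the environment):
```python
from unsloth import FastLanguageModel

# Same loading call as above, but with 4-bit quantization to fit smaller GPUs.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="sartifyllc/Pawa-mini-V0.1",
    max_seq_length=2048,
    dtype=None,          # auto-detect
    load_in_4bit=True,   # quantized weights for reduced memory usage
)
```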
Feel free to reach out for further guidance or collaboration opportunities regarding PAWA!