---
base_model: google/gemma-2-2b-it
library_name: peft
tags:
- sentiment-analysis
- weighted-loss
- LoRA
- Korean
---
# Model Card for Fine-Tuned `gemma-2-2b-it` on Custom Korean Sentiment Dataset
## Model Summary
This model is a fine-tuned version of `google/gemma-2-2b-it`, trained to classify sentiment in Korean text into four categories: **무감정** (neutral), **슬픔** (sadness), **기쁨** (joy), and **분노** (anger). It uses **LoRA (Low-Rank Adaptation)** for parameter-efficient fine-tuning and **4-bit quantization (NF4)** via **BitsAndBytes** for memory efficiency. A custom weighted loss function was applied to handle class imbalance in the dataset.
The model is suitable for multi-class sentiment classification in Korean and is optimized for environments with limited computational resources due to the quantization.
## Model Details
### Developed By:
This model was fine-tuned by [Your Name or Organization] using Hugging Face's `peft` and `transformers` libraries with a custom Korean sentiment dataset.
### Model Type:
This is a transformer-based model for **multi-class sentiment classification** in the Korean language.
### Language:
- **Language(s)**: Korean
### License:
[Add relevant license here]
### Finetuned From:
- **Base Model**: `google/gemma-2-2b-it`
### Framework Versions:
- **Transformers**: 4.44.2
- **PEFT**: 0.12.0
- **Datasets**: 3.0.1
- **PyTorch**: 2.4.1+cu121
## Intended Uses & Limitations
### Intended Use:
This model is suitable for applications requiring multi-class sentiment classification in Korean, such as chatbots, social media monitoring, or customer feedback analysis.
### Out-of-Scope Use:
The model may not perform well on tasks requiring multilingual support, on sentiment taxonomies with classes other than the four listed above, or on text outside the domain of its Korean training data.
### Limitations:
- **Bias**: As the model is trained on a custom dataset, it may reflect specific biases inherent in that data.
- **Generalization**: Performance may degrade on data outside the scope of the training set, such as different text domains or different sentiment label schemes.
## Model Architecture
### Quantization:
The model uses **4-bit quantization** via **BitsAndBytes** for efficient memory usage, which enables it to run on lower-resource hardware.
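As a rough illustration, a quantization configuration along these lines can be used when loading the base model. This is a sketch, not the exact setup used here: the compute dtype and double-quantization flag are assumptions not stated in this card.
```python
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig

# Illustrative 4-bit NF4 quantization config; compute dtype and double
# quantization are assumptions, not values documented in this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Load the base model with a 4-class classification head
base_model = AutoModelForSequenceClassification.from_pretrained(
    "google/gemma-2-2b-it",
    num_labels=4,
    quantization_config=bnb_config,
)
```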
### LoRA Configuration:
LoRA (Low-Rank Adaptation) was applied to specific transformer layers, allowing for parameter-efficient fine-tuning. The target modules include:
- `down_proj`, `gate_proj`, `q_proj`, `o_proj`, `up_proj`, `v_proj`, `k_proj`
The LoRA hyperparameters (see the configuration sketch below) are:
- `r = 16`, `lora_alpha = 32`, `lora_dropout = 0.05`
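A minimal sketch of this configuration with `peft`, reusing `base_model` from the quantization sketch above; the `TaskType.SEQ_CLS` task type is an assumption based on the classification use case, not something stated explicitly in this card.
```python
from peft import LoraConfig, TaskType, get_peft_model

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "down_proj", "gate_proj", "q_proj", "o_proj",
        "up_proj", "v_proj", "k_proj",
    ],
    task_type=TaskType.SEQ_CLS,  # assumed: sequence classification head
)

# Wrap the quantized base model with the LoRA adapters
model = get_peft_model(base_model, lora_config)
```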
### Custom Weighted Loss:
A custom weighted loss function was implemented to handle class imbalance, using the following weights:
\[
\text{weights} = [0.2032, 0.2704, 0.2529, 0.2735]
\]
These weights correspond to the classes **무감정**, **슬픔**, **기쁨**, and **분노**, respectively.
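One common way to apply such weights is a `Trainer` subclass that overrides `compute_loss`. The sketch below is illustrative only; `WeightedLossTrainer` is a hypothetical name, not the exact implementation used for this model.
```python
import torch
import torch.nn as nn
from transformers import Trainer

CLASS_WEIGHTS = [0.2032, 0.2704, 0.2529, 0.2735]  # 무감정, 슬픔, 기쁨, 분노

class WeightedLossTrainer(Trainer):
    """Illustrative Trainer that applies class weights in the cross-entropy loss."""

    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        loss_fct = nn.CrossEntropyLoss(
            weight=torch.tensor(CLASS_WEIGHTS, device=logits.device)
        )
        loss = loss_fct(logits, labels)
        return (loss, outputs) if return_outputs else loss
```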
## Training Details
### Dataset:
The model was trained on a custom Korean sentiment analysis dataset. This dataset consists of text samples labeled with one of four sentiment classes: **무감정**, **슬픔**, **기쁨**, and **분노**.
- **Train Set Size**: Custom dataset
- **Test Set Size**: Custom dataset
- **Classes**: 4 (무감정, 슬픔, 기쁨, 분노)
### Preprocessing:
Data was tokenized using the `google/gemma-2-2b-it` tokenizer with a maximum sequence length of 128. The preprocessing steps included padding and truncation to ensure consistent input lengths.
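A sketch of this preprocessing step, assuming the dataset stores its input in a `"text"` column (an assumed column name, not one documented in this card):
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

def preprocess(batch):
    # "text" is an assumed column name in the custom dataset
    return tokenizer(
        batch["text"],
        padding="max_length",
        truncation=True,
        max_length=128,
    )

# tokenized_dataset = dataset.map(preprocess, batched=True)
```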
### Hyperparameters:
- **Learning Rate**: 2e-4
- **Batch Size (train)**: 8
- **Batch Size (eval)**: 8
- **Epochs**: 4
- **Optimizer**: AdamW (with 8-bit optimization)
- **Weight Decay**: 0.01
- **Gradient Accumulation Steps**: 2
- **Evaluation Steps**: 500
- **Logging Steps**: 500
- **Metric for Best Model**: F1 (weighted)
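The hyperparameters listed above map roughly onto the following `TrainingArguments`; the output directory and save settings are placeholders or assumptions, not values from the actual training run.
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-2-2b-it-korean-sentiment",  # hypothetical output path
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=4,
    weight_decay=0.01,
    gradient_accumulation_steps=2,
    optim="adamw_bnb_8bit",   # 8-bit AdamW via bitsandbytes
    eval_strategy="steps",
    eval_steps=500,
    logging_steps=500,
    save_steps=500,           # assumed to match the evaluation interval
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)
```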
## Evaluation
### Metrics:
The model was evaluated using the following metrics:
- **Accuracy**
- **F1 Score** (weighted)
- **Precision** (weighted)
- **Recall** (weighted)
Weighted averaging is used for F1, precision, and recall to account for the class imbalance described above.
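These metrics can be computed with a `compute_metrics` function such as the sketch below (using `scikit-learn`); the exact implementation used during training is not included in this card.
```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```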
### Code Example:
You can load the fine-tuned model and use it for inference on your own data as follows:
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned model and tokenizer ("your-model-directory" is a placeholder)
model = AutoModelForSequenceClassification.from_pretrained("your-model-directory")
tokenizer = AutoTokenizer.from_pretrained("your-model-directory")

# Tokenize the input text ("이 영화는 정말 슬퍼요." = "This movie is really sad.")
text = "이 영화는 정말 슬퍼요."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predicted_class = logits.argmax(-1).item()

# Map the predicted class id to its label
id2label = {0: "무감정", 1: "슬픔", 2: "기쁨", 3: "분노"}
print(f"Predicted sentiment: {id2label[predicted_class]}")
```