---
base_model: google/gemma-2-2b-it
library_name: peft
tags:
- sentiment-analysis
- weighted-loss
- LoRA
- Korean
---

# Model Card for Fine-Tuned `gemma-2-2b-it` on a Custom Korean Sentiment Dataset

## Model Summary

This model is a fine-tuned version of `google/gemma-2-2b-it`, trained to classify Korean text into four sentiment categories: **무감정** (neutral), **슬픔** (sadness), **기쁨** (joy), and **분노** (anger). It uses **LoRA (Low-Rank Adaptation)** for parameter-efficient fine-tuning and **4-bit NF4 quantization** via **BitsAndBytes** for memory efficiency. A custom weighted loss function was applied to handle class imbalance in the dataset.

The model is suited to multi-class sentiment classification in Korean and, thanks to quantization, can run in environments with limited computational resources.

## Model Details

### Developed By:

This model was fine-tuned by [Your Name or Organization] using Hugging Face's `peft` and `transformers` libraries on a custom Korean sentiment dataset.

### Model Type:

A transformer-based model for **multi-class sentiment classification** in Korean.

### Language:

- **Language(s)**: Korean

### License:

[Add relevant license here]

### Finetuned From:

- **Base Model**: `google/gemma-2-2b-it`

### Framework Versions:

- **Transformers**: 4.44.2
- **PEFT**: 0.12.0
- **Datasets**: 3.0.1
- **PyTorch**: 2.4.1+cu121

## Intended Uses & Limitations

### Intended Use:

The model is suitable for applications that require multi-class sentiment classification of Korean text, such as chatbots, social media monitoring, or customer feedback analysis.

### Out-of-Scope Use:

The model is not intended for multilingual input, for sentiment taxonomies other than the four classes above, or for text far outside the domain of its Korean training data.

### Limitations:

- **Bias**: Because the model is trained on a custom dataset, it may reflect biases present in that data.
- **Generalization**: Performance may degrade on data that differs from the training distribution, including other formulations of sentiment classification.

## Model Architecture

### Quantization:

The model uses **4-bit NF4 quantization** via **BitsAndBytes**, which reduces memory usage and allows it to run on lower-resource hardware.

### LoRA Configuration:

LoRA (Low-Rank Adaptation) was applied to specific transformer layers for parameter-efficient fine-tuning. The target modules are:

- `down_proj`, `gate_proj`, `q_proj`, `o_proj`, `up_proj`, `v_proj`, `k_proj`

The LoRA parameters are:

- `r = 16`, `lora_alpha = 32`, `lora_dropout = 0.05`

### Custom Weighted Loss:

A custom weighted loss function was used to handle class imbalance, with the following class weights:

\[
\text{weights} = [0.2032,\ 0.2704,\ 0.2529,\ 0.2735]
\]

These weights correspond to the classes **무감정**, **슬픔**, **기쁨**, and **분노**, respectively.

## Training Details

### Dataset:

The model was trained on a custom Korean sentiment-analysis dataset consisting of text samples labeled with one of four sentiment classes: **무감정**, **슬픔**, **기쁨**, and **분노**.

- **Train Set Size**: custom dataset (size not specified)
- **Test Set Size**: custom dataset (size not specified)
- **Classes**: 4 (무감정, 슬픔, 기쁨, 분노)

### Preprocessing:

Data was tokenized with the `google/gemma-2-2b-it` tokenizer using a maximum sequence length of 128. Preprocessing included padding and truncation to ensure consistent input lengths. A sketch of the full training setup is given below.
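For reference, the following is a minimal sketch (not the exact training script) that combines the settings described above: BitsAndBytes 4-bit NF4 quantization, the LoRA configuration, the class-weighted loss, and tokenization at a maximum length of 128. The compute dtype, the dataset column name `text`, and the `WeightedLossTrainer` class are illustrative assumptions; the numeric values are taken from this card.

```python
import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    BitsAndBytesConfig,
    Trainer,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization via BitsAndBytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: compute dtype is not stated in this card
)

# Base model with a 4-class classification head
model = AutoModelForSequenceClassification.from_pretrained(
    "google/gemma-2-2b-it",
    num_labels=4,
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)

# LoRA configuration from the section above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora_config)

# Tokenization: max length 128 with padding and truncation
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

def tokenize(batch):
    # "text" is an assumed column name for the custom dataset
    return tokenizer(batch["text"], padding="max_length", truncation=True, max_length=128)

# Class weights for 무감정, 슬픔, 기쁨, 분노
class_weights = torch.tensor([0.2032, 0.2704, 0.2529, 0.2735])

class WeightedLossTrainer(Trainer):
    """Trainer subclass that applies a class-weighted cross-entropy loss."""

    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        loss_fct = torch.nn.CrossEntropyLoss(weight=class_weights.to(logits.device))
        loss = loss_fct(logits, labels)
        return (loss, outputs) if return_outputs else loss
```

An instance of `WeightedLossTrainer` can then be constructed with `TrainingArguments` matching the hyperparameters listed in the next section.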
### Hyperparameters:

- **Learning Rate**: 2e-4
- **Batch Size (train)**: 8
- **Batch Size (eval)**: 8
- **Epochs**: 4
- **Optimizer**: AdamW (8-bit)
- **Weight Decay**: 0.01
- **Gradient Accumulation Steps**: 2
- **Evaluation Steps**: 500
- **Logging Steps**: 500
- **Metric for Best Model**: F1 (weighted)

## Evaluation

### Metrics:

The model was evaluated with the following metrics:

- **Accuracy**
- **F1 Score** (weighted)
- **Precision** (weighted)
- **Recall** (weighted)

Together, these metrics give a detailed view of the model's performance and help identify its strengths and areas for improvement.

### Code Example:

You can load the fine-tuned model and run inference on your own data as follows:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("your-model-directory")
tokenizer = AutoTokenizer.from_pretrained("your-model-directory")
model.eval()

# Tokenize input text
text = "이 영화는 정말 슬퍼요."  # "This movie is really sad."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predicted_class = logits.argmax(-1).item()

# Map prediction to label
id2label = {0: "무감정", 1: "슬픔", 2: "기쁨", 3: "분노"}
print(f"Predicted sentiment: {id2label[predicted_class]}")
```
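If the repository contains only the LoRA adapter weights rather than a merged model, the adapter must be attached to the quantized base model before inference. A minimal sketch, assuming `your-adapter-directory` is a placeholder for the adapter path:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Quantize the base model the same way it was trained
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")

base_model = AutoModelForSequenceClassification.from_pretrained(
    "google/gemma-2-2b-it",
    num_labels=4,
    quantization_config=bnb_config,
)

# Attach the LoRA adapter; the classification head is restored from the
# adapter checkpoint if it was saved via `modules_to_save`
model = PeftModel.from_pretrained(base_model, "your-adapter-directory")  # hypothetical path
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
model.eval()
```

From here, inference proceeds exactly as in the example above.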