|
--- |
|
license: cc-by-4.0 |
|
language: |
|
- en |
|
base_model: |
|
- google/gemma-2-2b |
|
datasets: |
|
- ktiyab/sustainable-fashion |
|
tags: |
|
- retail |
|
- sustainability |
|
- fashion |
|
- timeless |
|
- wardrobe |
|
- eco-conscious
|
- style |
|
- gemma |
|
library_name: adapter-transformers |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# Model Summary |
|
|
|
**Model Name**: `sustainable-fashion-gemma2-2b-202501080` |
|
**Model Description**: This model is a **specialized** variant of **Gemma-2 2B**, fine-tuned to consistently deliver **sustainable fashion advice**, practical **capsule wardrobe** guidance, and **timeless styling** recommendations. **LoRA** fine-tuning lets the model capture **nuanced, real-world** fashion insights without sacrificing inference performance.
|
|
|
### Core Criteria |
|
|
|
1. **Conciseness & Directness** |
|
- Offers clear, actionable fashion tips without unnecessary complexity. |
|
|
|
2. **Personalization** |
|
- Tailors advice to individual budgets, lifestyles, and style preferences. |
|
|
|
3. **Integration of Concepts** |
|
- Connects sustainability principles, budget constraints, and style guidelines into a unified approach. |
|
|
|
4. **Tone & Accessibility** |
|
- Maintains a friendly, approachable voice—ideal for newcomers and seasoned eco-conscious dressers alike. |
|
|
|
5. **Strategic Focus** |
|
- Emphasizes long-term wardrobe value, cost-benefit analyses, and ecological impact in every recommendation. |
|
|
|
6. **Practical Reality** |
|
- Balances high-quality investments with realistic budgeting, mixing accessible pieces with sustainable choices. |
|
|
|
### Architecture & Training Highlights |
|
- **Base Model**: Gemma-2 (2B parameters)
|
- **Fine-Tuning Method**: LoRA-based instruct tuning (`--task=instruct-lora`) |
|
- **Epochs**: 5 |
|
- **Learning Rate**: 0.0002, using a cosine scheduler |
|
- **Optimizer**: `paged_adamw_32bit` |
|
- **Attention Implementation**: `flash_attention_2`, improving speed and memory usage |
|
- **Precision**: 4-bit quantization for memory efficiency (`--precision_mode=4bit`) |
|
- **Training Data**: Synthetic Q&A pairs on sustainable and timeless fashion (see [Data Overview](#data-overview) and the Kaggle dataset: https://www.kaggle.com/datasets/tiyabk/sustainable-fashion/data); a minimal configuration sketch follows this list
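The full training pipeline is not published in this card; the snippet below is a minimal sketch of a comparable LoRA + 4-bit setup using `transformers`, `peft`, and `bitsandbytes`, wired to the hyperparameters listed above. The LoRA rank, target modules, and batch settings are illustrative assumptions, not the original configuration.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "google/gemma-2-2b"

# 4-bit (NF4) quantization keeps the 2B base model small in GPU memory during training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # as listed in the highlights above
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters; the rank and target modules here are assumptions for illustration
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Hyperparameters mirroring the highlights: 5 epochs, LR 2e-4, cosine schedule, paged AdamW
training_args = TrainingArguments(
    output_dir="sustainable-fashion-gemma2-2b",
    num_train_epochs=5,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="paged_adamw_32bit",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # assumed value; not specified in this card
    logging_steps=5,
    bf16=True,
)
# A Trainer (or TRL's SFTTrainer) would then be constructed with the tokenized Q&A dataset.
```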
|
|
|
### Primary Use |
|
The model excels at answering questions on sustainable and timeless fashion, including:
|
|
|
**Eco-friendly fabrics** |
|
- Fabric Recommendations: Users inquire about the best sustainable materials (e.g., organic cotton, linen, hemp, Tencel) for different climates or occasions. |
|
- Material Care Guidance: Advice on proper care (washing, drying, storing) to extend the life of garments made with eco-friendly fabrics. |
|
|
|
**Capsule wardrobe construction** |
|
- Core Wardrobe Planning: Helps users identify essential clothing items that suit their lifestyle (e.g., a few versatile tops, bottom pieces, layering pieces). |
|
- Minimalist Shopping Lists: Suggests how many items to include in each category (tops, pants, outerwear, etc.) to maintain both variety and wearability. |
|
- Seasonal Transitions: Guides on switching out certain pieces for seasonal changes while keeping the majority of items year-round. |
|
|
|
**Climate-specific styling** |
|
- Weather-Appropriate Outfits: Users input local climate details (hot and humid vs. cold and dry), and the model advises on layering, fabric weights, and silhouettes. |
|
- Travel Packing Assistance: For trips to different climates, the model helps curate a light but versatile wardrobe that can handle temperature and weather variations. |
|
- Regional Seasonality: Accounts for monsoon seasons, harsh winters, or desert climates when recommending outfit structures, color choices, and fabric blends. |
|
|
|
**Timeless color palettes and silhouettes** |
|
- Color Palette Selection: Helps users identify a cohesive range of neutrals, accent colors, and core hues that suit their skin tone and personal style.
|
- Long-Lasting Trends: Suggests classic cuts (e.g., A-line skirts, tailored trousers, button-up shirts) that transcend seasonal fads. |
|
- Personalization: Balances timelessness with individual flair (e.g., a signature color or pattern) without sacrificing longevity. |
|
|
|
### Evaluation |
|
The model was evaluated on an internal validation set using built-in metrics (e.g., loss). The model has demonstrated robust understanding and consistent instruction-following behavior for fashion-related queries. |
|
|
|
--- |
|
|
|
## Usage |
|
|
|
Below is a minimal code snippet showing how to load and use the model for inference. This example assumes the merged model is available in a storage bucket or local path. |
|
|
|
```bash |
|
pip install transformers accelerate huggingface_hub
|
``` |
|
|
|
```python |
|
import torch |
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
from huggingface_hub import notebook_login |
|
|
|
# If your model is private, uncomment: |
|
# notebook_login() |
|
|
|
# 1) Set up the device |
|
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") |
|
|
|
# 2) Load tokenizer & model from Hugging Face |
|
model_path = "YourUsername/my-awesome-finetuned-model" |
|
tokenizer = AutoTokenizer.from_pretrained(model_path) |
|
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16) |
|
model.to(device) |
|
|
|
# 3) Define your generation functions |
|
def ask_fashion_model_short_response(question): |
|
inputs = tokenizer(question, return_tensors="pt").to(device) |
|
with torch.no_grad(): |
|
outputs = model.generate( |
|
**inputs, |
|
max_new_tokens=200, |
|
do_sample=True, |
|
temperature=0.7, |
|
top_k=50, |
|
top_p=0.9, |
|
repetition_penalty=1.2, |
|
num_beams=1 |
|
) |
|
return tokenizer.decode(outputs[0], skip_special_tokens=True) |
|
|
|
def ask_fashion_model_long_response(question): |
|
inputs = tokenizer(question, return_tensors="pt").to(device) |
|
with torch.no_grad(): |
|
outputs = model.generate( |
|
**inputs, |
|
max_new_tokens=512, |
|
num_beams=2, |
|
length_penalty=1.8, |
|
temperature=1, |
|
top_k=50, |
|
top_p=1, |
|
do_sample=False |
|
) |
|
return tokenizer.decode(outputs[0], skip_special_tokens=True) |
|
|
|
# 4) Example usage |
|
query = "What are some timeless ways to incorporate color into my wardrobe if I usually stick to neutral shades?" |
|
response = ask_fashion_model_long_response(query) |
|
print(response) |
|
``` |
|
|
|
### Input / Output Shape |
|
- **Input**: A user prompt or query in natural language (e.g., “How do I choose the right colors for my skin tone?”). |
|
- **Output**: A single, text-based answer or instruction. |
|
|
|
### Known and Preventable Failures |
|
- **Ambiguous Requests**: Vague queries may yield broad responses. Encourage users to provide context. |
|
- **Non-Fashion Queries**: The model is specialized for sustainable fashion tasks and might produce less accurate or irrelevant answers for unrelated domains. |
|
|
|
--- |
|
|
|
## System |
|
|
|
This model can be used as a **standalone** text-generation system or integrated into broader chat/assistant pipelines: |
|
- **Input Requirements**: A plain text prompt describing a fashion or styling question. |
|
- **Downstream Dependencies**: Any application expecting textual recommendations (e.g., e-commerce chatbots, personal stylist apps, content generation tools). |
|
|
|
The model’s text output could feed into: |
|
- **E-commerce**: Sustainability-focused product recommendations (see the integration sketch below).

- **Editorial/Content**: Generating blog articles or social media posts on sustainable fashion topics.
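As an illustration of that kind of downstream integration (not an official API), the sketch below wraps the `ask_fashion_model_short_response` helper from the Usage section in a hypothetical product-recommendation prompt; the function name, template wording, and product fields are assumptions.

```python
# Hypothetical e-commerce integration: ask the fine-tuned model for a sustainable
# alternative to a product the customer is viewing. Reuses the model, tokenizer,
# and ask_fashion_model_short_response() defined in the Usage section above.

def recommend_sustainable_alternative(product_name: str, climate: str) -> str:
    prompt = (
        f"A customer is considering '{product_name}' and lives in a {climate} climate. "
        "Suggest a more sustainable alternative and briefly explain the fabric choice."
    )
    return ask_fashion_model_short_response(prompt)

# Example call
print(recommend_sustainable_alternative("polyester raincoat", "cold and rainy"))
```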
|
|
|
--- |
|
|
|
## Implementation Requirements |
|
|
|
- **Training Hardware**: |
|
- **Machine Type**: `a2-ultragpu-8g` with 8 NVIDIA A100 80GB GPUs |
|
- **Disk**: 500 GB SSD (`pd-ssd`) |
|
- **Replica Count**: 1 (single worker) |
|
|
|
- **Software**: |
|
- Container image: `us-docker.pkg.dev/vertex-ai-restricted/vertex-vision-model-garden-dockers/pytorch-peft-train:stable_20240909` |
|
- Key packages: PyTorch, Transformers, Deepspeed (Zero 2 config), LoRA libraries |
|
|
|
- **Compute Requirements**: |
|
Training ran on 8 GPUs for 5 epochs over the fashion Q&A dataset. Time to convergence depends on the final dataset size and batch configuration (`per_device_train_batch_size=1` with gradient accumulation steps).
|
|
|
- **Performance & Energy Consumption**: |
|
Flash attention and 4-bit precision reduce memory usage and training cost. No exact energy-consumption figures are published, but overhead is significantly lower than full FP16 or FP32 training.
|
|
|
--- |
|
|
|
# Model Characteristics |
|
|
|
## Model Initialization |
|
|
|
The model was **fine-tuned** from a pre-trained Gemma-2 2B parameter LLM. It was **not** trained from scratch. |
|
|
|
## Model Stats |
|
|
|
- **Parameter Count**: ~2B base parameters, plus LoRA adapters |
|
- **Model Size (disk)**: ~5GB |
|
- **Layers**: The base Gemma-2 architecture with multi-headed self-attention and transformer blocks. |
|
- **Latency**: Inference latency depends on the GPU/CPU hardware. On a single GPU, flash attention significantly speeds up token generation compared to naive attention implementations.
|
|
|
## Other Details |
|
|
|
- **Pruning**: Not applied. |
|
- **Quantization**: 4-bit quantization was used during training to reduce the memory footprint; an illustrative 4-bit loading snippet is shown after this list.
|
- **Differential Privacy**: No specialized techniques implemented; the dataset is synthetic with no direct PII. |
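To mirror that reduced memory footprint at inference time, the model can also be loaded in 4-bit with `bitsandbytes`. This is a minimal sketch under assumed settings (NF4, float16 compute) and a placeholder model path, not the published serving configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Placeholder path; replace with the actual repository or local directory
model_path = "YourUsername/my-awesome-finetuned-model"

# NF4 4-bit quantization keeps the ~2B-parameter model comfortably within consumer GPU memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=bnb_config,
    device_map="auto",
)

inputs = tokenizer(
    "Which eco-friendly fabrics work best in hot, humid climates?",
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```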
|
|
|
--- |
|
|
|
# Data Overview |
|
|
|
## Training Data |
|
|
|
**Source**: A synthetic collection of ~38K Q&A pairs about sustainable fashion, generated via advanced prompt engineering to cover:
|
- **Sustainable practices** (e.g., recycled materials, secondhand shopping) |
|
- **Wardrobe fundamentals** (e.g., neutral color palettes, timeless silhouettes) |
|
- **Climate-specific styling** |
|
- **Budget constraints** |
|
... |
|
|
|
**Pre-processing**: Minimal text cleaning (e.g., removing extraneous symbols), focusing on clarity and consistency. |
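The original cleaning script is not included in this card; the snippet below is only an illustrative sketch of the kind of minimal normalization described above (stripping stray symbols and collapsing whitespace), not the actual pre-processing pipeline.

```python
import re

def clean_qa_text(text: str) -> str:
    """Illustrative minimal cleaning: drop extraneous symbols and normalize whitespace."""
    text = re.sub(r"[^\w\s.,:;!?'()-]", " ", text)  # remove stray symbols / encoding artifacts
    return re.sub(r"\s+", " ", text).strip()        # collapse runs of whitespace

print(clean_qa_text("Choose   organic cotton ** or linen -- both breathe well!"))
```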
|
|
|
## Demographic Groups |
|
|
|
The data does not explicitly classify demographics. Rather, it addresses general styling contexts (body changes, climate differences, budget considerations) that can be relevant across diverse populations. |
|
|
|
## Evaluation Data |
|
|
|
- **Train / Test / Dev Splits**: Partitioned from the same synthetic source: |
|
- **Train**: Main dataset (`train` split) |
|
- **Evaluation**: Sustainable Fashion Eval |
|
- **Differences**: All synthetic, but evaluation focuses on multi-turn instructions, ensuring coverage of real-world complexities. |
|
|
|
--- |
|
|
|
# Evaluation Results |
|
|
|
## Summary |
|
|
|
The model’s performance was measured via: |
|
- **Loss** on the held-out evaluation set every few steps (`--eval_steps=5`) |
|
- **Qualitative** checks for correctness, clarity, and coherence |
|
|
|
Results indicated: |
|
- **Low perplexity** on domain-specific prompts