---
language:
- en
- de
- fr
- it
- pt
- hi
license: llama3.1
library_name: transformers
pipeline_tag: text-classification
tags:
- facebook
- meta
- pytorch
- llama
- brand-safety
- classification
model-index:
- name: vision-1-mini
  results:
  - task:
      type: text-classification
      name: Brand Safety Classification
    metrics:
    - type: accuracy
      value: 0.95
      name: Classification Accuracy
datasets:
- BrandSafe-16k
metrics:
- accuracy
base_model: meta-llama/Llama-3.1-8B-Instruct
model_size: "4.58 GiB"
parameters: "8.03B"
quantization: "GGUF V3"
architectures:
- LlamaForCausalLM
model_parameters:
  block_count: 32
  context_length: 131072
  embedding_length: 4096
  feed_forward_length: 14336
  attention_heads: 32
  kv_heads: 8
  rope_freq_base: 500000
  vocab_size: 128256
hardware:
  recommended: "Apple Silicon"
  memory:
    cpu_kv_cache: "992.00 MiB"
    metal_kv_cache: "32.00 MiB"
    metal_compute: "560.00 MiB"
    cpu_compute: "560.01 MiB"
inference:
  load_time: "3.27s"
  device: "Metal (Apple M3 Pro)"
  memory_footprint:
    cpu: "4552.80 MiB"
    metal: "132.50 MiB"
---

# vision-1-mini

Vision-1-mini is an 8B-parameter brand safety classifier based on Llama 3.1 and fine-tuned on our [BrandSafe-16k](https://huggingface.co/datasets/OverseerAI/BrandSafe-16k) dataset. It is optimized for Apple Silicon devices and assigns content to the categories of the BrandSafe-16k classification system.

## Model Details

- **Model Type:** Brand Safety Classifier
- **Base Model:** Meta Llama 3.1 8B Instruct
- **Parameters:** 8.03 billion
- **Architecture:** Llama
- **Quantization:** Q4_K
- **Size:** 4.58 GiB (4.89 BPW)
- **License:** Llama 3.1
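
The bits-per-weight (BPW) figure follows from the file size and the parameter count. The sketch below recomputes it from the rounded values listed above; the helper name `bits_per_weight` is illustrative and not part of any library. Because both inputs are rounded, the result lands at about 4.9 rather than exactly at the reported 4.89.

```python
GIB = 1024 ** 3  # bytes in one GiB


def bits_per_weight(size_gib: float, n_params: float) -> float:
    """Average number of bits stored per model parameter."""
    return size_gib * GIB * 8 / n_params


# Rounded values from the Model Details list: 4.58 GiB, 8.03 billion parameters
bpw = bits_per_weight(4.58, 8.03e9)
print(f"{bpw:.2f} bits per weight")
```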

## Performance Metrics

- **Load Time:** 3.27 seconds (on Apple M3 Pro)
- **Memory Usage:**
  - CPU Buffer: 4552.80 MiB
  - Metal Buffer: 132.50 MiB
  - KV Cache: 1024.00 MiB (512.00 MiB K, 512.00 MiB V)
  - Compute Buffer: 560.00 MiB

## Hardware Compatibility

### Apple Silicon Optimizations
- Optimized for Metal/MPS
- Unified Memory Architecture support
- SIMD group reduction and matrix multiplication optimizations
- Layer offloading to the GPU (1 of 33 layers in the reported configuration)

### System Requirements
- Recommended Memory: 12 GB or more
- GPU: Apple Silicon preferred (M1/M2/M3 series)
- Storage: 5 GB free space

## Classification Categories

The model classifies content into the following categories:
1. B1-PROFANITY - Contains profane or vulgar language
2. B2-OFFENSIVE_SLANG - Contains offensive slang or derogatory terms
3. B3-COMPETITOR - Mentions or promotes competing brands
4. B4-BRAND_CRITICISM - Contains criticism or negative feedback about brands
5. B5-MISLEADING - Contains misleading or deceptive information
6. B6-POLITICAL - Contains political content or bias
7. B7-RELIGIOUS - Contains religious content or references
8. B8-CONTROVERSIAL - Contains controversial topics or discussions
9. B9-ADULT - Contains adult or mature content
10. B10-VIOLENCE - Contains violent content or references
11. B11-SUBSTANCE - Contains references to drugs, alcohol, or substances
12. B12-HATE - Contains hate speech or discriminatory content
13. B13-STEREOTYPE - Contains stereotypical representations
14. B14-BIAS - Shows bias against groups or individuals
15. B15-UNPROFESSIONAL - Contains unprofessional content or behavior
16. B16-MANIPULATION - Contains manipulative content or tactics
17. SAFE - Contains no brand safety concerns
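
Downstream code usually needs to map the generated text back onto one of these labels. The following is a minimal sketch that assumes the model emits the label text verbatim (for example `B10-VIOLENCE`), which this card does not confirm; `parse_label` is a hypothetical helper, not part of the model or of `transformers`.

```python
import re
from typing import Optional

# Label names copied from the category list above
LABELS = frozenset({
    "B1-PROFANITY", "B2-OFFENSIVE_SLANG", "B3-COMPETITOR",
    "B4-BRAND_CRITICISM", "B5-MISLEADING", "B6-POLITICAL",
    "B7-RELIGIOUS", "B8-CONTROVERSIAL", "B9-ADULT", "B10-VIOLENCE",
    "B11-SUBSTANCE", "B12-HATE", "B13-STEREOTYPE", "B14-BIAS",
    "B15-UNPROFESSIONAL", "B16-MANIPULATION", "SAFE",
})

# Candidate tokens: "B<number>-<NAME>" or the standalone word "SAFE"
_CANDIDATE = re.compile(r"\bB\d{1,2}-[A-Z_]+|\bSAFE\b")


def parse_label(generated: str) -> Optional[str]:
    """Return the first known label found in the text, or None."""
    for match in _CANDIDATE.finditer(generated.upper()):
        if match.group(0) in LABELS:
            return match.group(0)
    return None
```

Returning `None` for unrecognized output lets the caller decide whether to retry, fall back to a default, or flag the item for review.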

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "maxsonderby/vision-1-mini",
    device_map="auto",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained("maxsonderby/vision-1-mini")

# Example usage
text = "Your text here"
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=16,  # a label like "B16-MANIPULATION" does not fit in one token
    do_sample=True,     # required for temperature and top_p to take effect
    temperature=0.1,
    top_p=0.9,
)
# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
result = tokenizer.decode(new_tokens, skip_special_tokens=True)
```

## Model Architecture

- **Attention Mechanism:**
  - Head Count: 32
  - KV Head Count: 8
  - Layer Count: 32
  - Embedding Length: 4096
  - Feed Forward Length: 14336
  - Context Length: 2048 tokens (reduced from the base model's 131072)
  - RoPE Base Frequency: 500000
  - Dimension Count: 128
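
These values determine how much memory the KV cache needs. The sketch below assumes 16-bit (2-byte) cache entries and a per-head dimension of 4096 / 32 = 128; both are assumptions rather than settings stated on this card, and `kv_cache_mib` is an illustrative helper. Under those assumptions a 2048-token window needs 256 MiB, while the 1024 MiB reported under Performance Metrics corresponds to 8192 cached positions, so that figure was presumably measured with a larger runtime context.

```python
def kv_cache_mib(
    n_ctx: int,
    n_layers: int = 32,      # Layer Count
    n_kv_heads: int = 8,     # KV Head Count
    head_dim: int = 128,     # Embedding Length 4096 / Head Count 32
    bytes_per_elem: int = 2, # assumption: 16-bit cache entries
) -> float:
    """Total size in MiB of the K and V caches for n_ctx positions."""
    per_tensor = n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem
    return 2 * per_tensor / 2**20  # one K tensor plus one V tensor


for n_ctx in (2048, 8192):
    print(n_ctx, kv_cache_mib(n_ctx), "MiB")
```

With the same assumptions, a single layer at 8192 positions accounts for 32 MiB, which matches the `metal_kv_cache` value in the metadata for one offloaded layer.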

## Training & Fine-tuning

This model is fine-tuned for brand safety classification on the BrandSafe-16k dataset. It uses a context window of 2048 tokens and is configured for consistent, low-variance outputs with the following inference settings:
- Temperature: 0.1
- Top-p: 0.9
- Batch Size: 512
- Thread Count: 8
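
To illustrate why a temperature of 0.1 combined with top-p 0.9 gives nearly fixed outputs, here is a small standalone sketch of temperature scaling followed by nucleus (top-p) filtering. It uses made-up logits and plain Python; it is not the sampling code used by this model's runtime, and both function names are illustrative.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_candidates(probs: List[float], top_p: float) -> List[int]:
    """Indices of the smallest high-probability set whose mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
sharp = top_p_candidates(softmax_with_temperature(logits, 0.1), 0.9)
flat = top_p_candidates(softmax_with_temperature(logits, 1.0), 0.9)
```

For these example logits, the low temperature concentrates almost all probability on the top token, so only one candidate survives the top-p cut; at temperature 1.0 two candidates remain.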

## Limitations

- The model is optimized for shorter content classification (up to 2048 tokens)
- Performance may vary on non-Apple Silicon hardware
- The model focuses solely on brand safety classification and may not be suitable for other tasks
- Classification accuracy may vary based on content complexity and context

## Citation

If you use this model in your research, please cite:
```
@misc{vision-1-mini,
  author = {Max Sonderby},
  title = {Vision-1-Mini: Optimized Brand Safety Classification Model},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/maxsonderby/vision-1-mini}}
}
```