---
title: README
emoji: 🐨
colorFrom: blue
colorTo: red
sdk: static
pinned: false
---
# 🔍 OverseerAI

## Mission
OverseerAI is dedicated to advancing open-source AI safety and content moderation tools. We develop state-of-the-art models and datasets for brand safety classification, making content moderation more accessible and efficient for developers and organizations.

## 🌟 Our Projects

### Datasets
#### [BrandSafe-16k](https://huggingface.co/datasets/OverseerAI/BrandSafe-16k)
A comprehensive dataset for training brand safety classification models, featuring 16 distinct risk categories:

| Category | Description |
|----------|-------------|
| B1-PROFANITY | Explicit language and cursing |
| B2-OFFENSIVE_SLANG | Informal offensive terms |
| B3-COMPETITOR | Competitive brand mentions |
| B4-BRAND_CRITICISM | Negative brand commentary |
| B5-MISLEADING | Deceptive or false information |
| B6-POLITICAL | Political content and discussions |
| B7-RELIGIOUS | Religious themes and references |
| B8-CONTROVERSIAL | Contentious topics |
| B9-ADULT | Adult or mature content |
| B10-VIOLENCE | Violent themes or descriptions |
| B11-SUBSTANCE | Drug and alcohol references |
| B12-HATE | Hate speech and discrimination |
| B13-STEREOTYPE | Stereotypical content |
| B14-BIAS | Biased viewpoints |
| B15-UNPROFESSIONAL | Unprofessional content |
| B16-MANIPULATION | Manipulative content |

### Models

#### [vision-1](https://huggingface.co/OverseerAI/vision-1)
Our flagship model for brand safety classification:
- Architecture: Meta Llama 3.1 (15GB)
- Full precision model optimized for high accuracy
- Trained on BrandSafe-16k dataset
- Ideal for production deployments with high-end GPU resources

#### [vision-1-mini](https://huggingface.co/OverseerAI/vision-1-mini)
A lightweight, optimized version of vision-1:
- Size: 4.58 GiB
- Architecture: Llama 3.1 8B
- Quantization: GGUF V3 (Q4_K)
- Optimized for Apple Silicon
- Fast load time: 3.27s
- Efficient memory usage: 4552.80 MiB CPU / 132.50 MiB Metal
- Perfect for local deployment and smaller compute resources

## 💡 Use Cases
- Content moderation for social media platforms
- Brand safety monitoring for advertising
- User-generated content filtering
- Real-time content classification
- Safe content recommendation systems

## 🤝 Contributing
We welcome contributions from the community! Whether it's:
- Improving model accuracy
- Expanding the dataset
- Optimizing for different hardware
- Adding new classification categories
- Reporting issues or suggesting improvements

## 📫 Contact
- GitHub: [OverseerAI](https://github.com/OverseerAI)
- HuggingFace: [OverseerAI](https://huggingface.co/OverseerAI)

## 📜 License
Our models are released under the Llama 3.1 license, and our datasets are available under open-source licenses to promote accessibility and innovation in AI safety.

---
*OverseerAI - Making AI Safety Accessible and Efficient*