AI Text Detection Model
A Random Forest classifier that predicts whether a text is human-written or AI-generated (trained on GPT-4 and DeepSeek Chat outputs).
Overview
- Task: Binary classification (Human vs AI text)
- Architecture: Random Forest with TF-IDF features
- Input: Text string
- Output: Classification label (Human/AI) with confidence score
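As a rough illustration of the architecture listed above, the following sketch wires TF-IDF features into a Random Forest using scikit-learn. The toy corpus, the n-gram range, and the forest size are all assumptions made for illustration; the actual training configuration of this model is not documented in this card.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy corpus (hypothetical); the real training data is not reproduced here.
train_texts = [
    "honestly i just threw this together last night lol",
    "my cat knocked the mug off the desk again this morning",
    "In conclusion, it is important to note that there are several key factors.",
    "Overall, this comprehensive overview highlights the main considerations.",
]
train_labels = ["Human-written", "Human-written", "AI-generated", "AI-generated"]

# TF-IDF maps each text to a weighted term vector;
# the forest's trees then vote on the label from those features.
pipeline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),                       # assumed setting
    RandomForestClassifier(n_estimators=50, random_state=0),   # assumed setting
)
pipeline.fit(train_texts, train_labels)

sample = "It is important to note the key considerations."
probs = pipeline.predict_proba([sample])[0]

# classes_ gives the label order used by predict_proba.
probabilities = dict(zip(pipeline.classes_, probs))
label = max(probabilities, key=probabilities.get)
print(label, probabilities)
```

The dictionary built at the end mirrors the label/probabilities shape that this card documents for `predict_text`, but the numbers from a four-sentence toy corpus carry no meaning.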
Installation
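# Clone the repository (notebook syntax; drop the leading "!" or "%" in a regular shell)
!git clone https://huggingface.co/polygraf-ai/ai-text-detector-random-forest-supplementary
# Use %cd, not !cd: "!cd" runs in a subshell and does not change the notebook's working directory
%cd ai-text-detector-random-forest-supplementary
# Install the package
!pip install -e .
# Install requirements
!pip install -r requirements.txt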
Usage
# Single text prediction
from inference import predict_text
text = "Your text here to analyze"
result = predict_text(text, model_path="model_artifacts")
print(result)
# Output format:
{
    'label': 'Human-written',   # or 'AI-generated'
    'confidence': 0.85,         # confidence score between 0 and 1
    'probabilities': {
        'Human-written': 0.85,
        'AI-generated': 0.15
    }
}
# Multiple texts
texts = [
    "First text to analyze",
    "Second text to analyze",
]
results = [predict_text(text, model_path="model_artifacts") for text in texts]
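For larger batches it can help to keep each input next to its result and to skip blank strings. The helper below is a hypothetical sketch and not part of the repository: `predict_many` takes the prediction function as an argument, and the stub `fake_predict` only stands in for the real model (returning the documented output shape) so the snippet runs on its own.

```python
from typing import Callable, Dict, Iterable, List


def predict_many(texts: Iterable[str], predict: Callable[[str], Dict]) -> List[Dict]:
    """Run `predict` on each non-blank text and keep the text with its result."""
    rows = []
    for text in texts:
        if not text or not text.strip():
            continue  # skip empty or whitespace-only input
        result = predict(text)
        rows.append(
            {
                "text": text,
                "label": result["label"],
                "confidence": result["confidence"],
            }
        )
    return rows


def fake_predict(text: str) -> Dict:
    # Stand-in for predict_text; mirrors the documented output format.
    return {
        "label": "Human-written",
        "confidence": 0.85,
        "probabilities": {"Human-written": 0.85, "AI-generated": 0.15},
    }


rows = predict_many(["First text to analyze", "   ", "Second text to analyze"], fake_predict)
for row in rows:
    print(f"{row['label']} ({row['confidence']:.2f}): {row['text']}")
```

With the real model, the second argument would be something like `lambda t: predict_text(t, model_path="model_artifacts")`.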
Limitations
- Not suitable for high-stakes detection, where a wrong label has serious consequences
- Should be used as a supplementary tool only
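One way to treat the detector as a supplementary signal is to report a label only when the model is confident and mark everything else as inconclusive. The sketch below assumes the result dictionary shown under Usage; the function name `triage` and the 0.9 threshold are illustrative assumptions, not values recommended by the model authors.

```python
def triage(result: dict, threshold: float = 0.9) -> str:
    """Turn a prediction into a cautious hint instead of a verdict."""
    if result["confidence"] < threshold:
        return "inconclusive"
    # Even a confident prediction is phrased as a hint, not a decision.
    return f"likely {result['label']}"


example = {
    "label": "Human-written",
    "confidence": 0.85,
    "probabilities": {"Human-written": 0.85, "AI-generated": 0.15},
}
print(triage(example))
```

A lower threshold produces more hints but also more wrong ones; the right trade-off depends on how the hints are used downstream.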
Training Data
Text samples from:
- Human writers
- GPT-4 outputs
- DeepSeek Chat outputs
Metrics
- Accuracy: 0.87
- Precision: 0.87
- Recall: 0.84
- F1: 0.85
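As a consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the two values above:

```python
precision = 0.87
recall = 0.84

# F1 = 2 * P * R / (P + R)
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.85
```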