Sentiment Analysis Model (Vibescribe)

Vibescribe is a binary sentiment classifier built with Hugging Face Transformers and fine-tuned on IMDB reviews.

Setup

  1. Clone the repository:
git clone https://github.com/your-username/sentiment-analysis
cd sentiment-analysis
  2. Create virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Log in to Hugging Face:
huggingface-cli login

Project Structure

sentiment-analysis/
├── requirements.txt
├── train.py
├── inference.py
├── utils.py
└── README.md

Files to Create

requirements.txt

transformers==4.37.2
datasets==2.16.1
torch==2.1.2
scikit-learn==1.4.0
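
After installing, it can be useful to confirm that the active environment matches these pins. The following is a minimal sketch using only the standard library; the helper names `parse_pins` and `check_pins` are hypothetical and not part of this repository, and the sketch assumes every line has the simple `name==version` form shown above.

```python
from importlib import metadata

# The pins from requirements.txt, inlined so the sketch is self-contained.
PINS = """\
transformers==4.37.2
datasets==2.16.1
torch==2.1.2
scikit-learn==1.4.0
"""

def parse_pins(text):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins

def check_pins(pins):
    """Return {package: (pinned, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        # Drop a local build suffix such as "+cu121" before comparing.
        base = installed.split("+")[0] if installed else None
        if base != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

Calling `check_pins(parse_pins(PINS))` returns an empty dict when the environment matches, and otherwise lists each package with its pinned and installed versions.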

utils.py

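from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(pred):
    """Compute evaluation metrics from a Trainer prediction object."""
    labels = pred.label_ids
    # Convert logits to predicted class indices
    preds = pred.predictions.argmax(-1)
    # average='binary' reports precision/recall/F1 for the positive class (label 1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average='binary')
    return {
        'accuracy': accuracy_score(labels, preds),
        'f1': f1,
        'precision': precision,
        'recall': recall
    }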
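
To make the four reported numbers concrete, here is a small pure-Python sketch of the same quantities computed by hand. It is an illustration, not the code used in training: `binary_metrics` is a hypothetical helper, the labels and predictions are made-up toy data, and label 1 is assumed to be the positive class (as in the IMDB dataset and the scikit-learn default for `average='binary'`).

```python
def binary_metrics(labels, preds, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)  # true positives
    fp = sum(1 for y, p in pairs if y != positive and p == positive)  # false positives
    fn = sum(1 for y, p in pairs if y == positive and p != positive)  # false negatives
    correct = sum(1 for y, p in pairs if y == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }

# Toy data: 4 positive and 4 negative reviews
labels = [1, 1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 1, 1, 0, 0]
metrics = binary_metrics(labels, preds)
# 3 true positives, 2 false positives, 1 false negative:
# accuracy = 5/8, precision = 3/5, recall = 3/4
```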

inference.py

from transformers import pipeline

def load_model(model_path):
    """Build a sentiment-analysis pipeline from a Hub model ID or local path."""
    return pipeline("sentiment-analysis", model=model_path)

def predict(classifier, text):
    """Classify text; returns a list of dicts with 'label' and 'score' keys."""
    return classifier(text)

if __name__ == "__main__":
    # Replace with your own Hub model ID
    model_path = "your-username/sentiment-analysis-model"
    classifier = load_model(model_path)

    # Example prediction
    text = "This movie was really great!"
    result = predict(classifier, text)
    print(f"Text: {text}\nSentiment: {result}")

Training

  1. Update model configuration in train.py:
training_args = TrainingArguments(
    output_dir="sentiment-analysis-model",
    hub_model_id="your-username/sentiment-analysis-model",  # Change this
    ...
)
  2. Start training:
python train.py

Making Predictions

from inference import load_model, predict

classifier = load_model("your-username/sentiment-analysis-model")
result = predict(classifier, "Your text here")
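
The pipeline returns a list with one dict per input, each holding a `label` and a `score`. The sketch below shows one way to turn that into a readable answer. It assumes the model config was saved without custom label names, in which case Transformers reports `LABEL_0` and `LABEL_1`, and that index 0 means negative and 1 means positive, as in the IMDB dataset. The `readable` helper and the sample score are hypothetical.

```python
# Assumed mapping; adjust if the model config defines its own id2label names.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}

def readable(result):
    """Return (sentiment, score) for the first prediction in a pipeline result."""
    top = result[0]
    # Fall back to the raw label (lowercased) when it is not a default LABEL_n name.
    name = LABEL_NAMES.get(top["label"], top["label"].lower())
    return name, top["score"]

# Hand-written stand-in for what predict(classifier, "...") might return
sample = [{"label": "LABEL_1", "score": 0.98}]
sentiment, score = readable(sample)
print(f"{sentiment} ({score:.2f})")
```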

Model Details

  • Base model: DistilBERT
  • Dataset: IMDB Reviews
  • Task: Binary sentiment classification (positive/negative)
  • Training time: ~2-3 hours on GPU
  • Model size: ~260MB
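
The model size follows roughly from the parameter count. As a back-of-the-envelope sketch, assuming about 67 million parameters for DistilBERT with a two-class classification head (an approximate figure that is not stated in this README) stored as 32-bit floats:

```python
def checkpoint_size(params, bytes_per_param=4):
    """Estimate weight storage as (megabytes, mebibytes); float32 by default."""
    size_bytes = params * bytes_per_param
    return size_bytes / 1e6, size_bytes / 2**20

# Assumed parameter count, rounded; the exact number depends on the head and vocabulary.
APPROX_PARAMS = 67_000_000
size_mb, size_mib = checkpoint_size(APPROX_PARAMS)
print(f"{size_mb:.0f} MB ({size_mib:.0f} MiB)")
```

The estimate lands at about 268 MB, or about 256 MiB, which brackets the ~260MB figure above; the saved checkpoint also includes small tokenizer and config files.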

Performance Metrics

  • Accuracy: ~91-93%
  • F1 Score: ~91-92%
  • Precision: ~90-91%
  • Recall: ~91-92%

Contributing

  1. Fork the repository
  2. Create feature branch
  3. Commit changes
  4. Push to branch
  5. Open pull request

License

MIT License
