Logistic Regression Sentiment Analysis Model

This model is a Logistic Regression classifier trained on the TripAdvisor sentiment analysis dataset. It takes the text of a hotel review as input and predicts its sentiment as a rating from 1 to 5 stars.

Model Details

  • Model Type: Logistic Regression
  • Task: Sentiment Analysis
  • Input: A hotel review (text)
  • Output: Sentiment rating (1-5 stars)
  • Training Dataset: nhull/tripadvisor-split-dataset-v2

Intended Use

This model is designed to classify hotel reviews by the sentiment they express. It returns a rating between 1 and 5 stars, where:

  • 1: Very bad
  • 2: Bad
  • 3: Neutral
  • 4: Good
  • 5: Very good
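
A minimal sketch of how a caller might turn the model's numeric output into the descriptions listed above. The names `RATING_DESCRIPTIONS` and `describe_rating` are hypothetical and not part of the model's files. The sketch also assumes predictions may arrive as floats such as 4.0, since the labels are written that way in the classification report.

```python
# Hypothetical helper (not shipped with the model): maps a predicted
# star rating to the description given in this model card.
RATING_DESCRIPTIONS = {
    1: "Very bad",
    2: "Bad",
    3: "Neutral",
    4: "Good",
    5: "Very good",
}


def describe_rating(rating):
    """Return the description for a 1-5 star rating (accepts 4 or 4.0)."""
    stars = int(rating)
    # Reject fractional values such as 2.5 and anything outside 1-5.
    if stars != rating or stars not in RATING_DESCRIPTIONS:
        raise ValueError(f"expected a whole rating from 1 to 5, got {rating!r}")
    return RATING_DESCRIPTIONS[stars]


print(describe_rating(4.0))  # Good
```

Raising on out-of-range values instead of clamping them keeps an unexpected model output visible to the caller.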

Dataset

The dataset used for training, validation, and testing is nhull/tripadvisor-split-dataset-v2. It consists of:

  • Training Set: 30,400 reviews
  • Validation Set: 1,600 reviews
  • Test Set: 8,000 reviews

All splits are balanced across five sentiment labels.
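
The split sizes above imply the following proportions and per-label counts. This is a small arithmetic sketch, and it assumes "balanced" means exactly equal counts per label. The classification report confirms this for the test set (1,600 per label), but for the training and validation sets it remains an assumption. The names `SPLIT_SIZES` and `split_summary` are illustrative.

```python
# Split sizes as stated in this model card.
SPLIT_SIZES = {"train": 30_400, "validation": 1_600, "test": 8_000}
NUM_LABELS = 5  # star ratings 1 to 5


def split_summary(sizes, num_labels):
    """Return {split: (share_of_total, reviews_per_label)}.

    Assumes every split is divided evenly across the labels.
    """
    total = sum(sizes.values())
    return {
        name: (size / total, size // num_labels)
        for name, size in sizes.items()
    }


summary = split_summary(SPLIT_SIZES, NUM_LABELS)
total_reviews = sum(SPLIT_SIZES.values())
for name, (share, per_label) in summary.items():
    print(f"{name}: {share:.0%} of {total_reviews:,} reviews, {per_label:,} per label")
```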


Test Performance

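On average, predictions differ from the true rating by 0.44 stars (mean absolute error). The confusion matrix below shows errors in both directions, so this figure does not indicate a consistent bias toward higher ratings.

  • Test Accuracy: 61.05% (4,884 of 8,000 reviews classified correctly).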

  • Classification Report:

Label         Precision  Recall  F1-score  Support
1.0                0.70    0.73      0.71     1600
2.0                0.52    0.50      0.51     1600
3.0                0.57    0.54      0.55     1600
4.0                0.55    0.54      0.55     1600
5.0                0.71    0.74      0.72     1600
Accuracy              -       -      0.61     8000
Macro avg          0.61    0.61      0.61     8000
Weighted avg       0.61    0.61      0.61     8000
  • Confusion Matrix:

True \ Predicted     1     2     3     4     5
1                 1165   384    41     3     7
2                  432   805   315    31    17
3                   61   314   857   311    57
4                    3    48   264   870   415
5                    6    10    32   365  1187
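
The reported figures can be re-derived from the confusion matrix alone. The sketch below is a plain-Python recomputation, not the author's evaluation script, and the function name `summarize` is illustrative. It also computes two star-error averages. The 0.44 figure quoted above matches the mean absolute error, while the mean signed error (predicted minus true) comes out close to zero.

```python
# Confusion matrix from this model card: rows are true labels 1-5,
# columns are predicted labels 1-5.
CONFUSION = [
    [1165, 384, 41, 3, 7],
    [432, 805, 315, 31, 17],
    [61, 314, 857, 311, 57],
    [3, 48, 264, 870, 415],
    [6, 10, 32, 365, 1187],
]


def summarize(matrix):
    """Compute accuracy, per-label precision/recall/F1, and star-error averages."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))

    per_label = {}
    for i in range(n):
        row_total = sum(matrix[i])                       # true label is i + 1
        col_total = sum(matrix[r][i] for r in range(n))  # predicted label is i + 1
        precision = matrix[i][i] / col_total
        recall = matrix[i][i] / row_total
        f1 = 2 * precision * recall / (precision + recall)
        per_label[i + 1] = (precision, recall, f1)

    # Signed error is predicted minus true: a positive mean means over-prediction.
    signed = sum((p - t) * matrix[t][p] for t in range(n) for p in range(n))
    absolute = sum(abs(p - t) * matrix[t][p] for t in range(n) for p in range(n))

    return {
        "accuracy": correct / total,
        "per_label": per_label,
        "mean_signed_error": signed / total,
        "mean_absolute_error": absolute / total,
    }


metrics = summarize(CONFUSION)
print(f"accuracy: {metrics['accuracy']:.4f}")
print(f"mean absolute error: {metrics['mean_absolute_error']:.3f} stars")
print(f"mean signed error: {metrics['mean_signed_error']:+.3f} stars")
for label, (precision, recall, f1) in metrics["per_label"].items():
    print(f"label {label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```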

Files Included

  • validation_results_log_regression.csv: Contains correctly classified reviews with their real and predicted labels.

Limitations

  • The model performs better on extreme ratings (F1-score of 0.71 for 1 star and 0.72 for 5 stars) than on intermediate ratings (F1-scores of 0.51 to 0.55 for 2, 3, and 4 stars).
  • The model was trained on the TripAdvisor dataset and may not generalize well to reviews from other sources or domains.
  • The model does not handle aspects like sarcasm or humor well, and shorter reviews may lead to less accurate predictions.
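
To put the first limitation in numbers, the sketch below measures how many test-set errors land on a neighbouring star rating. It uses only the confusion matrix reported in this card. The function name `error_breakdown` and the "within one star" measure are illustrative additions, not metrics reported by the author.

```python
# Confusion matrix from this model card: rows are true labels 1-5,
# columns are predicted labels 1-5.
CONFUSION = [
    [1165, 384, 41, 3, 7],
    [432, 805, 315, 31, 17],
    [61, 314, 857, 311, 57],
    [3, 48, 264, 870, 415],
    [6, 10, 32, 365, 1187],
]


def error_breakdown(matrix):
    """Count misclassifications and how many are off by exactly one star."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    off_by_one = sum(
        matrix[t][p] for t in range(n) for p in range(n) if abs(p - t) == 1
    )
    errors = total - correct
    return {
        "errors": errors,
        "adjacent_share_of_errors": off_by_one / errors,
        "within_one_star_accuracy": (correct + off_by_one) / total,
    }


breakdown = error_breakdown(CONFUSION)
print(f"misclassified reviews: {breakdown['errors']}")
print(f"errors off by exactly one star: {breakdown['adjacent_share_of_errors']:.1%}")
print(f"predictions within one star: {breakdown['within_one_star_accuracy']:.2%}")
```

On these figures, about nine in ten errors fall on an adjacent rating, which is consistent with the confusion between neighbouring intermediate classes.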