Logistic Regression Sentiment Analysis Model

This model is a Logistic Regression classifier trained on the TripAdvisor sentiment analysis dataset. It takes the text of a hotel review as input and predicts its sentiment as a rating from 1 to 5 stars.

Model Details

Model Type: Logistic Regression
Task: Sentiment Analysis
Input: A hotel review (text)
Output: Sentiment rating (1-5 stars)
Training Dataset: nhull/tripadvisor-split-dataset-v2

Intended Use

This model is designed to classify hotel reviews by the sentiment they express. For each review it returns a rating between 1 and 5 stars, where:

1: Very bad
2: Bad
3: Neutral
4: Good
5: Very good
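
The input and output contract described above can be illustrated with a minimal sketch. It assumes a scikit-learn pipeline with TF-IDF features, which this card does not confirm; the toy training sentences, `LABEL_NAMES`, and `predict_stars` are hypothetical names made up for illustration and are not taken from the real dataset or model files.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Star labels and their meaning, as listed in this card.
LABEL_NAMES = {1: "Very bad", 2: "Bad", 3: "Neutral", 4: "Good", 5: "Very good"}

# Tiny made-up training set, for illustration only (not the real data).
train_texts = [
    "Filthy room, rude staff, never again",
    "Terrible stay, dirty and noisy",
    "Room was small and the breakfast was poor",
    "Disappointing service and a worn-out bathroom",
    "Average hotel, nothing special",
    "It was okay, fine for one night",
    "Nice room and friendly staff",
    "Good location and comfortable bed",
    "Absolutely wonderful, perfect stay",
    "Excellent hotel, amazing staff, loved it",
]
train_labels = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

# Assumption: TF-IDF features feeding a multiclass logistic regression.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)


def predict_stars(review: str) -> int:
    """Return the predicted star rating (1-5) for one review text."""
    return int(model.predict([review])[0])


stars = predict_stars("Excellent hotel, amazing staff")
print(stars, LABEL_NAMES[stars])
```

With only ten toy sentences the prediction itself is not meaningful; the point is the shape of the interface: one review text in, one integer from 1 to 5 out.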

Dataset

The dataset used for training, validation, and testing is nhull/tripadvisor-split-dataset-v2. It consists of:

Training Set: 30,400 reviews
Validation Set: 1,600 reviews
Test Set: 8,000 reviews

All splits are balanced across five sentiment labels.
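
As a quick arithmetic check of the split sizes above, the following sketch computes each split's share of the data and the number of reviews per label. It assumes "balanced" means exactly equal counts per label; the card confirms this only for the test set, where each label has a support of 1600.

```python
# Split sizes as stated in this card.
splits = {"train": 30_400, "validation": 1_600, "test": 8_000}
num_labels = 5

total = sum(splits.values())
for name, size in splits.items():
    share = size / total
    # Assumption: a balanced split holds the same number of reviews per label.
    per_label = size // num_labels
    print(f"{name}: {size} reviews ({share:.0%}), {per_label} per label")
```

Under that assumption the splits cover 40,000 reviews in a 76% / 4% / 20% ratio.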

Test Performance

Mean Absolute Error: predictions deviate from the true rating by 0.44 stars on average.

Test Accuracy: 61.05%
Classification Report:

Label	Precision	Recall	F1-score	Support
1.0	0.70	0.73	0.71	1600
2.0	0.52	0.50	0.51	1600
3.0	0.57	0.54	0.55	1600
4.0	0.55	0.54	0.55	1600
5.0	0.71	0.74	0.72	1600
Accuracy	-	-	0.61	8000
Macro avg	0.61	0.61	0.61	8000
Weighted avg	0.61	0.61	0.61	8000

Confusion Matrix:

True \ Predicted	1	2	3	4	5
1	1165	384	41	3	7
2	432	805	315	31	17
3	61	314	857	311	57
4	3	48	264	870	415
5	6	10	32	365	1187
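
The reported figures can be re-derived from the confusion matrix alone. The sketch below uses plain Python and copies the matrix from the table above; the helper name `per_class` is introduced here for illustration. It suggests that the 0.44 figure corresponds to the mean absolute error in stars, while the mean signed error (predicted minus true) is close to zero.

```python
# Rows are true labels 1-5, columns are predicted labels 1-5.
CONFUSION = [
    [1165, 384, 41, 3, 7],
    [432, 805, 315, 31, 17],
    [61, 314, 857, 311, 57],
    [3, 48, 264, 870, 415],
    [6, 10, 32, 365, 1187],
]

n = len(CONFUSION)
cells = [(i, j) for i in range(n) for j in range(n)]
total = sum(sum(row) for row in CONFUSION)

# Accuracy is the share of reviews on the diagonal.
accuracy = sum(CONFUSION[i][i] for i in range(n)) / total

# Error in stars (predicted minus true), weighted by cell counts.
signed = sum((j - i) * CONFUSION[i][j] for i, j in cells) / total
mae = sum(abs(j - i) * CONFUSION[i][j] for i, j in cells) / total


def per_class(k):
    """Return (precision, recall, f1) for the label at index k."""
    tp = CONFUSION[k][k]
    precision = tp / sum(CONFUSION[i][k] for i in range(n))
    recall = tp / sum(CONFUSION[k])
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


print(f"accuracy={accuracy:.4f} MAE={mae:.2f} mean signed error={signed:+.3f}")
for k in range(n):
    p, r, f = per_class(k)
    print(f"{k + 1}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Rounded to two decimals, the per-label values agree with the classification report.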

Files Included

validation_results_log_regression.csv: Contains correctly classified reviews with their real and predicted labels.

Limitations

The model performs best on extreme ratings (F1-score of 0.71 for 1 star and 0.72 for 5 stars) and struggles with intermediate ratings (F1-scores of 0.51 to 0.55 for 2, 3, and 4 stars).
The model was trained on the TripAdvisor dataset and may not generalize well to reviews from other sources or domains.
The model does not handle sarcasm or humor well, and predictions for shorter reviews may be less accurate.
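
To put the difficulty with intermediate ratings in context, the sketch below counts how many test predictions land within one star of the true label. It uses only the confusion matrix from the Test Performance section; the variable names are illustrative.

```python
# Rows are true labels 1-5, columns are predicted labels 1-5.
CONFUSION = [
    [1165, 384, 41, 3, 7],
    [432, 805, 315, 31, 17],
    [61, 314, 857, 311, 57],
    [3, 48, 264, 870, 415],
    [6, 10, 32, 365, 1187],
]

n = len(CONFUSION)
total = sum(sum(row) for row in CONFUSION)

# Exact hits sit on the diagonal; off-by-one errors sit next to it.
exact = sum(CONFUSION[i][i] for i in range(n))
off_by_one = sum(
    CONFUSION[i][j] for i in range(n) for j in range(n) if abs(i - j) == 1
)
errors = total - exact

print(f"within one star: {(exact + off_by_one) / total:.2%}")
print(f"share of errors that are off by one star: {off_by_one / errors:.2%}")
```

Most misclassifications are confusions between neighboring ratings, which is what one would expect when adjacent star levels are described in similar language.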
