metadata
license: mit
datasets:
- Aditya1010/17k-hotel-reviews-dataset
metrics:
- accuracy
base_model:
- distilbert/distilbert-base-uncased
pipeline_tag: text-classification
library_name: transformers
tags:
- Sentiment Analysis
- DistilBERT
- Text Classification
- Hotel Reviews
Hotel Review Classifier
This model classifies hotel reviews as positive or negative. It was fine-tuned from distilbert-base-uncased, the DistilBERT checkpoint available on Hugging Face, using the 17k Hotel Reviews Dataset.
Model Details
- Model Type: DistilBERT-based model for sequence classification
- Model Architecture: distilbert-base-uncased
- Number of Parameters: Approximately 66M
- Training Dataset: The model was trained on the 17k-hotel-reviews-dataset, which contains 17,000 hotel reviews with labels for sentiment (positive/negative).
- Fine-Tuning Task: Sentiment analysis for hotel reviews (positive or negative sentiment)
Training Data
- Dataset: 17k Hotel Reviews Dataset
- Data Description: The dataset consists of 17,000 hotel reviews, each labeled with a sentiment (positive/negative).
- Preprocessing: The dataset was preprocessed by cleaning the reviews to remove unwanted characters and URLs.
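The exact cleaning rules are not documented here, so the following is only a minimal sketch of what removing URLs and unwanted characters could look like. The function name `clean_review` and all three patterns are assumptions made for illustration, not the preprocessing actually used for training.

```python
import re

# Assumed patterns: the model card does not list the exact cleaning rules.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")
# Keep letters, digits, whitespace, and basic punctuation; drop everything else.
UNWANTED_PATTERN = re.compile(r"[^A-Za-z0-9\s.,!?'-]")
WHITESPACE_PATTERN = re.compile(r"\s+")


def clean_review(text: str) -> str:
    """Remove URLs and unwanted characters, then collapse whitespace."""
    text = URL_PATTERN.sub(" ", text)
    text = UNWANTED_PATTERN.sub(" ", text)
    return WHITESPACE_PATTERN.sub(" ", text).strip()
```

Applying similar cleaning to reviews at inference time is likely to keep inputs closer to what the model saw during training, although the character whitelist above would need widening for reviews that contain non-ASCII text.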
Training Details
- Training Framework: Hugging Face Transformers and PyTorch
- Learning Rate: 2e-5
- Epochs: 3
- Batch Size: 16
- Optimizer: AdamW
- Training Time: Approximately 2 hours on a GPU
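As a rough illustration, the hyperparameters above can be written as keyword arguments for the Transformers `TrainingArguments` class. The original training script is not shown, so this is a sketch under assumptions: single-GPU training without gradient accumulation, all 17,000 reviews used for training (no train/validation split is documented), and a hypothetical output directory name.

```python
import math

# Hyperparameters as listed in this model card.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,
    "optim": "adamw_torch",  # AdamW as implemented in PyTorch
}

# Assumption: every review is used for training; the card documents no split.
NUM_TRAIN_EXAMPLES = 17_000


def total_optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps for single-device training without gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


def build_training_arguments(output_dir: str = "hotel-review-classifier"):
    """Build TrainingArguments; the output directory name is hypothetical."""
    # Imported lazily so the step arithmetic above works without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Under these assumptions, `total_optimizer_steps(NUM_TRAIN_EXAMPLES, 16, 3)` gives 3,189 steps (1,063 batches per epoch over 3 epochs).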
Usage
To run inference with the model, use the following code:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
# Load the fine-tuned model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("kmack/HotelReviewClassifier")
tokenizer = AutoTokenizer.from_pretrained("kmack/HotelReviewClassifier")
# Example review for prediction
review = "This is the best hotel I've ever stayed in!"
# Tokenize the input text
inputs = tokenizer(review, return_tensors="pt", padding=True, truncation=True)
# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
# Get the predicted label (0 for negative, 1 for positive)
prediction = torch.argmax(outputs.logits, dim=-1).item()
print(f"Predicted sentiment: {'Positive' if prediction == 1 else 'Negative'}")
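The usage code reports only the predicted label. If a confidence score is also useful, the two logits can be passed through a softmax. The helper names `softmax` and `describe_prediction` below are hypothetical, and the label mapping is taken from the comment in the usage code (0 for negative, 1 for positive); this is a plain-Python sketch rather than part of the model's API.

```python
import math

# Label mapping taken from the usage comment: 0 = negative, 1 = positive.
ID2LABEL = {0: "Negative", 1: "Positive"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe_prediction(logits):
    """Return (label, confidence) for the logits of a single review."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]
```

With the variables from the usage code, this could be called as `label, confidence = describe_prediction(outputs.logits[0].tolist())`.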
Citation
If you use this model in your research, please cite the following:
@misc{kmack2024hotelreviewclassifier,
author = {Kmack},
title = {Hotel Review Classifier},
year = {2024},
url = {https://huggingface.co/kmack/HotelReviewClassifier}
}