Model Card for SentimentBERT
This model is a fine-tuned version of bert-base-uncased for sentiment analysis. It was trained on the Sentiment140 dataset from Kaggle and classifies text as positive or negative.
Model Details
Model Description
This model fine-tunes the bert-base-uncased architecture for sentiment analysis. It accepts text input and predicts whether the sentiment expressed in the text is positive or negative.
- Developed by: Debopam(Pritam) Dey
- Funded by [optional]: Not specified
- Shared by [optional]: Debopam(Pritam) Dey
- Model type: Sequence classification (binary sentiment analysis)
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model [optional]: bert-base-uncased
Model Sources [optional]
- Repository: SentimentBERT
- Demo [optional]: Coming Soon
Uses
Here’s how to use the model for sentiment analysis:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer from the Hugging Face model hub
mymodel = AutoModelForSequenceClassification.from_pretrained("pritam2014/SentimentBERT")
mytokenizer = AutoTokenizer.from_pretrained("pritam2014/SentimentBERT")

# Preprocess the text input
def preprocess_text(text):
    # Pad or truncate every input to 50 tokens and return PyTorch tensors
    inputs = mytokenizer(
        text,
        max_length=50,
        padding='max_length',
        truncation=True,
        return_attention_mask=True,
        return_tensors='pt'
    )
    return inputs

# Predict sentiment
def make_prediction(text):
    inputs = preprocess_text(text)
    with torch.no_grad():
        outputs = mymodel(inputs['input_ids'], attention_mask=inputs['attention_mask'])
    logits = outputs.logits
    # Take the argmax over the class dimension (logits shape: [1, num_labels])
    predicted_class_id = torch.argmax(logits, dim=-1).item()
    sentiment_labels = {0: 'Negative', 1: 'Positive'}
    return sentiment_labels[predicted_class_id]

# Example
text = "I love this product!"
print(make_prediction(text))  # Output: Positive
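The prediction function above returns only the label. When a confidence value is also useful, the two logits can be converted to probabilities with a softmax. The following is a minimal pure-Python sketch of that step. The helper name `logits_to_prediction` and the example logit values are illustrative assumptions, not outputs from this model; the label mapping is the one used in the snippet above.

```python
import math

# Label mapping taken from the usage snippet above.
SENTIMENT_LABELS = {0: "Negative", 1: "Positive"}

def logits_to_prediction(logits):
    """Convert raw logits into (label, probability).

    `logits` is a plain list of floats, e.g. `outputs.logits[0].tolist()`.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the most probable class.
    best = max(range(len(probs)), key=probs.__getitem__)
    return SENTIMENT_LABELS[best], probs[best]

# Hypothetical logits, for illustration only.
label, prob = logits_to_prediction([-1.2, 2.3])
print(label, round(prob, 4))
```

A probability close to 0.5 suggests the text is ambiguous for a binary classifier, which can be a useful signal for neutral or mixed inputs.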
Direct Use
The model can be used for text classification tasks without additional fine-tuning.
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("pritam2014/SentimentBERT")
model = AutoModelForSequenceClassification.from_pretrained("pritam2014/SentimentBERT")

# Initialize pipeline
sentiment_pipeline = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

# Example input
tweets = [
    "I love this product!",
    "I'm not happy with the service.",
    "It's okay, could be better."
]

# Predict sentiment
results = sentiment_pipeline(tweets)
for tweet, result in zip(tweets, results):
    print(f"Tweet: {tweet}\nSentiment: {result['label']}, Score: {result['score']:.4f}\n")
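If the model configuration does not define label names, the pipeline reports generic labels such as `LABEL_0` and `LABEL_1`. Whether that is the case for this checkpoint is not stated in this card, so the sketch below is a hedged convenience: it maps generic labels to readable ones and leaves any other label untouched. The helper name `readable_label` and the sample results are hypothetical, and the mapping assumes class 0 is negative and class 1 is positive, as in the "Uses" snippet.

```python
# Assumed mapping: class 0 is negative and class 1 is positive,
# matching the manual mapping used in the "Uses" snippet.
GENERIC_TO_READABLE = {"LABEL_0": "Negative", "LABEL_1": "Positive"}

def readable_label(result):
    """Return a human-readable label for one pipeline result dict."""
    raw = result["label"]
    # Fall back to the raw label when it is not a generic LABEL_n name.
    return GENERIC_TO_READABLE.get(raw, raw)

# Hypothetical pipeline outputs, for illustration only.
sample_results = [
    {"label": "LABEL_1", "score": 0.98},
    {"label": "LABEL_0", "score": 0.91},
]
for r in sample_results:
    print(readable_label(r), f"{r['score']:.2f}")
```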
Downstream Use [optional]
Users can fine-tune the model on other sentiment datasets or adapt it for related tasks like emotion detection.
Out-of-Scope Use
The model is not suitable for multilingual sentiment analysis or highly nuanced text where sentiment depends on complex context.
Bias, Risks, and Limitations
- The model may inherit biases present in the Sentiment140 dataset.
- It is designed for English text and may perform poorly on non-English or mixed-language text.
Recommendations
Use the model in scenarios where binary sentiment classification is sufficient. Avoid deploying it in critical systems without further testing for biases and limitations.
How to Get Started with the Model
Refer to the "Uses" section above to see the sample usage code. For more details, visit the Hugging Face Hub page.
Training Details
Training Data
The model was fine-tuned on the Sentiment140 dataset, which contains 1.6 million tweets labelled as positive or negative.
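The raw Sentiment140 file encodes polarity as 0 (negative) and 4 (positive), while the usage code in this card expects class ids 0 and 1. The preprocessing actually used for this model is not documented here, so the following is only a sketch of how the labels could be remapped, under the assumption that the standard column order (target, id, date, flag, user, text) applies. The function name `load_examples` and the sample rows are made up for illustration.

```python
import csv
import io

# Sentiment140 polarity codes: 0 = negative, 4 = positive.
POLARITY_TO_CLASS = {"0": 0, "4": 1}

def load_examples(csv_file):
    """Yield (text, class_id) pairs from a Sentiment140-style CSV stream."""
    # Assumed column order: target, id, date, flag, user, text.
    for row in csv.reader(csv_file):
        target, text = row[0], row[5]
        if target in POLARITY_TO_CLASS:  # skip any other polarity code
            yield text, POLARITY_TO_CLASS[target]

# Two made-up rows in the raw layout, for illustration only.
sample = io.StringIO(
    '"0","1","Mon Apr 06 2009","NO_QUERY","user_a","my phone died again"\n'
    '"4","2","Mon Apr 06 2009","NO_QUERY","user_b","great coffee this morning"\n'
)
examples = list(load_examples(sample))
print(examples)
```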
Training Procedure
- Optimizer: AdamW
- Batch size: 760
- Learning rate: 1e-5
- Epochs: 2
- Hardware: Kaggle T4 GPU
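To put these settings in perspective, the number of optimizer updates can be estimated from the dataset size, batch size, and epoch count. The sketch below assumes one update per batch (no gradient accumulation) and that all 1.6 million tweets were used for training; since part of the data was held out for validation, the real figure would be somewhat lower. The function name `optimizer_steps` is illustrative.

```python
import math

def optimizer_steps(num_examples, batch_size, epochs):
    """Return (steps per epoch, total steps), assuming one update per batch."""
    # The last batch of an epoch may be smaller, hence the ceiling.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs

# Assumption: all 1.6 million tweets are used for training.
per_epoch, total = optimizer_steps(1_600_000, batch_size=760, epochs=2)
print(per_epoch, total)  # 2106 4212
```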
Preprocessing [optional]
[More Information Needed]
Training Hyperparameters
- Training regime: [More Information Needed]
Speeds, Sizes, Times [optional]
[More Information Needed]
Evaluation
The model was evaluated on a validation split of the Sentiment140 dataset.
Testing Data, Factors & Metrics
Testing Data
[More Information Needed]
Factors
[More Information Needed]
Metrics
[More Information Needed]
Results
[More Information Needed]
Summary
Model Examination [optional]
[More Information Needed]
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: T4 GPU
- Hours used: [More Information Needed]
- Cloud Provider: Kaggle
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
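Once the training hours and compute region are known, a rough estimate can be made by multiplying the energy consumed by the carbon intensity of the local grid, which is the basic idea behind such calculators. The sketch below is a simplified approximation, not the calculator itself. The 70 W figure is the maximum board power of a T4 GPU; the hours and carbon intensity are placeholder values, not figures for this model.

```python
def estimate_emissions_kg(power_watts, hours, kg_co2_per_kwh):
    """Estimate emissions as energy used times grid carbon intensity."""
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * kg_co2_per_kwh

# 70 W: maximum board power of a T4 GPU.
# hours=10 and kg_co2_per_kwh=0.4 are placeholders, not measured values.
estimate = estimate_emissions_kg(power_watts=70, hours=10, kg_co2_per_kwh=0.4)
print(f"{estimate:.2f} kg CO2eq")
```

This treats the GPU as running at full power for the whole job and ignores CPU, memory, and data-center overhead, so it is best read as an order-of-magnitude figure.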
Technical Specifications [optional]
Model Architecture and Objective
[More Information Needed]
Compute Infrastructure
[More Information Needed]
Hardware
[More Information Needed]
Software
[More Information Needed]
Citation [optional]
BibTeX:
@misc{pritam2014SentimentBERT,
  author = {Debopam(Pritam) Dey},
  title = {SentimentBERT},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/pritam2014/SentimentBERT}},
}
APA:
[More Information Needed]
Glossary [optional]
[More Information Needed]
More Information [optional]
The model is intended for short texts like tweets (the usage example truncates inputs to 50 tokens) and may require further fine-tuning for longer or domain-specific text.
Model Card Authors [optional]
[More Information Needed]
Model Card Contact
For questions or feedback, contact me via the Hugging Face repository or by email ([email protected]).