Model Details

This project demonstrates the fine-tuning of the DistilBERT model on the IMDB dataset for text classification, using the Hugging Face Transformers library.

Model Architecture

  • Model: DistilBERT-base-uncased
  • Optimizer: AdamW
  • Loss Function: Cross-entropy loss
  • Epochs: 4
  • Learning Rate: 2e-5
  • Batch Size: 16
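
As a rough illustration of the loss listed above, the following minimal sketch computes cross-entropy for a single example from raw two-class logits. It is plain Python for clarity and is not the training code itself; the logit values and the [NEGATIVE, POSITIVE] ordering are assumptions made for the example.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy loss for one example, given raw class logits."""
    # Subtract the largest logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # The loss is the negative log-probability assigned to the target class.
    return log_sum_exp - logits[target]

# Hypothetical logits, assumed to be ordered [NEGATIVE, POSITIVE].
example_logits = [-1.2, 2.3]
print(f"loss if the true label is POSITIVE: {cross_entropy(example_logits, 1):.4f}")
print(f"loss if the true label is NEGATIVE: {cross_entropy(example_logits, 0):.4f}")
```

The loss is small when the model puts most of its probability on the correct class and grows quickly when it is confidently wrong.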

Dataset

The IMDB dataset is a collection of movie reviews, each labeled with one of two classes:

  • POSITIVE
  • NEGATIVE

You can access the dataset via the Hugging Face datasets library.
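
A minimal sketch of loading the data might look like the following. It assumes the dataset is published on the Hub under the name "imdb" and that labels are encoded as 0 = negative and 1 = positive; the helper names `load_imdb` and `describe` are hypothetical and only serve the illustration.

```python
# Assumed label encoding of the Hub "imdb" dataset: 0 = negative, 1 = positive.
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}

def load_imdb():
    """Download IMDB from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the helpers below can be used without the library.
    from datasets import load_dataset
    return load_dataset("imdb")

def describe(example, width=60):
    """Return a short human-readable summary of one dataset row."""
    return f"{ID2LABEL[example['label']]}: {example['text'][:width]}"

# Usage (downloads the data on first call):
#   dataset = load_imdb()
#   print(describe(dataset["train"][0]))
```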

Training Configuration

The training arguments are set as follows:

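from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-base-uncased-finetuned-sentiment-analysis",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=4,
    weight_decay=0.01,
    evaluation_strategy="epoch",  # newer Transformers releases name this eval_strategy
    save_strategy="epoch",        # must match the evaluation strategy when loading the best model
    load_best_model_at_end=True,
    push_to_hub=True,             # requires being logged in to the Hugging Face Hub
)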

You can adjust these parameters to suit your requirements.
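
To get a feel for what these settings imply, the sketch below estimates the number of optimizer steps. It assumes the standard IMDB train split of 25,000 reviews, a single device and no gradient accumulation; none of these are stated in this card, so treat the numbers as an estimate.

```python
import math

TRAIN_EXAMPLES = 25_000  # assumption: size of the standard IMDB train split
BATCH_SIZE = 16          # per_device_train_batch_size
EPOCHS = 4               # num_train_epochs

def total_steps(n_examples, batch_size, epochs, n_devices=1):
    """Return (steps per epoch, total optimizer steps) without gradient accumulation."""
    effective_batch = batch_size * n_devices
    # The last, partially filled batch still counts as one step.
    steps_per_epoch = math.ceil(n_examples / effective_batch)
    return steps_per_epoch, steps_per_epoch * epochs

per_epoch, total = total_steps(TRAIN_EXAMPLES, BATCH_SIZE, EPOCHS)
print(per_epoch, total)  # 1563 6252
```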

Model Evaluation Results

Epoch   Eval Loss   Eval Accuracy
1       0.1881      92.90%
2       0.2331      93.39%
3       0.2919      93.39%
4       0.3253      93.67%
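
The table shows evaluation loss rising while accuracy improves slightly, so the "best" epoch depends on the metric used. The sketch below picks the best epoch under each criterion from the numbers above. It assumes that, with load_best_model_at_end=True and no metric_for_best_model set, the Trainer compares checkpoints by evaluation loss; whether the published checkpoint was selected that way is not stated here.

```python
# Evaluation results copied from the table above.
results = [
    {"epoch": 1, "eval_loss": 0.1881, "eval_accuracy": 0.9290},
    {"epoch": 2, "eval_loss": 0.2331, "eval_accuracy": 0.9339},
    {"epoch": 3, "eval_loss": 0.2919, "eval_accuracy": 0.9339},
    {"epoch": 4, "eval_loss": 0.3253, "eval_accuracy": 0.9367},
]

# Lower loss is better; higher accuracy is better.
best_by_loss = min(results, key=lambda r: r["eval_loss"])
best_by_accuracy = max(results, key=lambda r: r["eval_accuracy"])

print("best epoch by loss:", best_by_loss["epoch"])          # 1
print("best epoch by accuracy:", best_by_accuracy["epoch"])  # 4
```

If accuracy is the metric you care about, one option is to pass metric_for_best_model="accuracy" and greater_is_better=True to TrainingArguments, together with a compute_metrics function that returns an "accuracy" key.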

Dependencies

The required dependencies for this project are:

  • transformers
  • datasets
  • torch
  • scikit-learn (imported as sklearn)
  • numpy
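
As a convenience, the following sketch checks which of these packages are missing from the current environment and prints a matching pip command. The mapping from import names to pip names is an assumption based on the usual package names, and no versions are pinned because this card does not state any.

```python
from importlib.util import find_spec

# Import name -> pip package name (they differ for scikit-learn).
REQUIRED = {
    "transformers": "transformers",
    "datasets": "datasets",
    "torch": "torch",
    "sklearn": "scikit-learn",
    "numpy": "numpy",
}

def missing_packages(required=REQUIRED):
    """Return the pip names of required packages that cannot be imported."""
    return [pip_name for module, pip_name in required.items()
            if find_spec(module) is None]

missing = missing_packages()
if missing:
    print("pip install " + " ".join(missing))
else:
    print("All dependencies are installed.")
```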

How to Use the Model

You can run sentiment analysis with the fine-tuned model through the Hugging Face pipeline as follows:

from transformers import pipeline

# Load the model from Hugging Face Hub
sentiment_analysis = pipeline("sentiment-analysis", model="Sathyam03/distilbert-base-uncased-finetuned-sentiment-analysis")

# Example usage
reviews = [
    "I absolutely loved this movie! It was fantastic.",
    "The film was okay, but it dragged on in some parts.",
    "I didn't like this movie at all. It was boring."
]

results = sentiment_analysis(reviews)

# Print the results
for review, result in zip(reviews, results):
    print(f"Review: {review}")
    print(f"Sentiment: {result['label']}, Confidence: {result['score']:.4f}\n")
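
Because the model only knows two classes, a mixed review such as the second example is still forced into POSITIVE or NEGATIVE. One possible way to handle this is to flag low-confidence predictions, as in the sketch below. The helper `flag_uncertain`, the UNCERTAIN label and the 0.75 threshold are hypothetical choices for illustration, and the sample scores are made up rather than real model output.

```python
def flag_uncertain(results, threshold=0.75):
    """Relabel predictions whose confidence is below the threshold as UNCERTAIN."""
    flagged = []
    for result in results:
        label = result["label"] if result["score"] >= threshold else "UNCERTAIN"
        flagged.append({"label": label, "score": result["score"]})
    return flagged

# Made-up predictions shaped like the list of dicts the pipeline returns.
sample_results = [
    {"label": "POSITIVE", "score": 0.99},
    {"label": "NEGATIVE", "score": 0.58},
]

for item in flag_uncertain(sample_results):
    print(f"{item['label']}: {item['score']:.2f}")
```

The threshold trades coverage for reliability, so it is worth tuning on held-out reviews rather than taking the value used here.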