DistilBERT Fine-Tuned on IMDB for Masked Language Modeling (Accelerate)

Model Description

This model is a fine-tuned version of distilbert-base-uncased (~67M parameters) for the masked language modeling (MLM) task. It was trained on the IMDb dataset using a custom training loop built with the Hugging Face 🤗 Accelerate library.


Model Training Details

Training Dataset

  • Dataset: IMDb (imdb) from the Hugging Face Hub
  • Dataset Splits:
    • Train: 25,000 samples
    • Test: 25,000 samples
    • Unsupervised: 50,000 samples
  • Training Strategy:
    • Combined the train and unsupervised splits, yielding 75,000 training examples (see the sketch after this list).
    • Masked the evaluation set once with fixed random masking, so perplexity is computed on the same masked tokens at every evaluation and scores are comparable across epochs.
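
A minimal sketch of this preprocessing, assuming tokenized_test is the tokenized and chunked test split (tokenization/chunking is omitted here for brevity) and that the collator's default 15% masking probability was used; both are assumptions, not confirmed details:

from datasets import concatenate_datasets, load_dataset
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

imdb = load_dataset("imdb")

# 25,000 train + 50,000 unsupervised examples -> 75,000 training examples
train_dataset = concatenate_datasets([imdb["train"], imdb["unsupervised"]])

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

def insert_random_mask(batch):
    # Run the MLM collator once and store its output, so every evaluation
    # pass scores the model on exactly the same masked tokens
    features = [dict(zip(batch, values)) for values in zip(*batch.values())]
    masked_inputs = data_collator(features)
    return {f"masked_{k}": v.numpy() for k, v in masked_inputs.items()}

# tokenized_test is assumed to be the tokenized/chunked "test" split
eval_dataset = tokenized_test.map(
    insert_random_mask, batched=True, remove_columns=tokenized_test.column_names
).rename_columns({
    "masked_input_ids": "input_ids",
    "masked_attention_mask": "attention_mask",
    "masked_labels": "labels",
})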

Training Configuration

The model was trained with the following parameters; a minimal Accelerate setup sketch follows the list:

  • Number of Training Epochs: 10
  • Batch Size: 64 (per device)
  • Learning Rate: 5e-5
  • Weight Decay: 0.01
  • Evaluation Strategy: after each epoch
  • Early Stopping: enabled (patience = 3)
  • Metric for Best Model: eval_loss
    • Direction: lower eval_loss is better (greater_is_better = False)
  • Learning Rate Scheduler: linear decay with no warmup steps
  • Mixed Precision Training: enabled (FP16)
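
A sketch of how this configuration maps onto an Accelerate setup, reusing train_dataset, eval_dataset, and data_collator from the preprocessing sketch above (train_dataset is assumed to be tokenized by this point):

from accelerate import Accelerator
from torch.optim import AdamW
from torch.utils.data import DataLoader
from transformers import AutoModelForMaskedLM, default_data_collator, get_scheduler

model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")

# FP16 mixed precision is handled by Accelerate
accelerator = Accelerator(mixed_precision="fp16")

# AdamW with the learning rate and weight decay listed above
optimizer = AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)

# The eval set was masked once up front, so the plain default_data_collator
# is sufficient there; the train set is masked on the fly by data_collator
train_dataloader = DataLoader(train_dataset, shuffle=True, batch_size=64, collate_fn=data_collator)
eval_dataloader = DataLoader(eval_dataset, batch_size=64, collate_fn=default_data_collator)

model, optimizer, train_dataloader, eval_dataloader = accelerator.prepare(
    model, optimizer, train_dataloader, eval_dataloader
)

# Linear decay to zero with no warmup steps
num_epochs = 10
num_training_steps = num_epochs * len(train_dataloader)
lr_scheduler = get_scheduler(
    "linear", optimizer=optimizer, num_warmup_steps=0, num_training_steps=num_training_steps
)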

Model Results

Best Epoch Performance

  • Best Epoch: 9
  • Loss: 2.0173
  • Perplexity: 7.5178 (the exponential of the loss; see the check below)
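
Perplexity here is the exponential of the mean evaluation loss, so the two numbers above are consistent:

import math

eval_loss = 2.0173
print(f"Perplexity: {math.exp(eval_loss):.4f}")  # ~7.518, matching the reported value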

Early Stopping

  • Early stopping was not triggered: the evaluation loss improved often enough that training ran the full 10 epochs, with the best evaluation loss observed at epoch 9. A sketch of the patience logic is shown below.
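
In a custom Accelerate loop the patience logic has to be implemented by hand rather than delegated to the Trainer. A sketch of how it might look, where evaluate() is a hypothetical helper that returns the mean evaluation loss for the epoch:

best_eval_loss = float("inf")
patience, epochs_without_improvement = 3, 0

for epoch in range(num_epochs):
    # ... one epoch of training over train_dataloader ...

    eval_loss = evaluate(model, eval_dataloader)  # hypothetical helper, not a real API

    if eval_loss < best_eval_loss:
        best_eval_loss = eval_loss
        epochs_without_improvement = 0
        # Keep the best checkpoint (metric: eval_loss, lower is better)
        accelerator.wait_for_everyone()
        unwrapped_model = accelerator.unwrap_model(model)
        unwrapped_model.save_pretrained("best_model", save_function=accelerator.save)
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            break  # stop after 3 consecutive epochs without improvement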

Model Usage

This fine-tuned model can be used for masked language modeling via the fill-mask pipeline from 🤗 Transformers. Below is an example:

from transformers import pipeline

# Load the fine-tuned checkpoint into a fill-mask pipeline
mask_filler = pipeline(
    "fill-mask",
    model="Prikshit7766/distilbert-finetuned-imdb-mlm-accelerate",
)

text = "This is a great [MASK]."
predictions = mask_filler(text)  # returns the top 5 candidates by default

for pred in predictions:
    print(f">>> {pred['sequence']}")

Example Output:

>>> This is a great movie.
>>> This is a great film.
>>> This is a great show.
>>> This is a great story.
>>> This is a great documentary.
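
The checkpoint can also be loaded directly when you want the raw logits rather than the pipeline's formatted output; a sketch using the standard AutoModelForMaskedLM API:

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "Prikshit7766/distilbert-finetuned-imdb-mlm-accelerate"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

inputs = tokenizer("This is a great [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Find the [MASK] position and take its top-5 predicted tokens
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_tokens = logits[0, mask_index].topk(5).indices[0]
for token_id in top_tokens:
    print(tokenizer.decode(token_id))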