Text_classification_model_1_pytorch

This model is a fine-tuned version of distilbert-base-uncased on the imdb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3494
  • Accuracy: 0.9329

Model description

Introduction:

Pre-trained language models have proven highly effective for natural language processing tasks such as sentiment analysis. One such model is DistilBERT Uncased, a smaller, distilled version of BERT. In this project, we apply DistilBERT Uncased to text classification, specifically sentiment analysis on the IMDb dataset.

Model Overview:

Our text classification model is built on DistilBERT Uncased. This model, developed by Hugging Face, is a variant of BERT that is lighter and faster: its authors report a model about 40% smaller and 60% faster than BERT that retains about 97% of BERT's language understanding performance. DistilBERT keeps BERT's bidirectional attention mechanism and its masked language model pre-training objective. Our aim is to fine-tune this pre-trained model to predict whether the sentiment of a movie review is positive or negative.

Intended uses & limitations

We have demonstrated that fine-tuning DistilBERT Uncased is effective for text classification, specifically sentiment analysis on the IMDb dataset. The model relies on transfer learning: it adapts knowledge acquired during pre-training to a specific task. The fine-tuned model classifies movie reviews as positive or negative with about 93% accuracy on the evaluation set. It is intended for sentiment analysis of English movie reviews. Because it was fine-tuned only on IMDb reviews, its performance on other domains or languages is not covered by the results reported here.
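As an illustration of the intended use, the following is a minimal inference sketch based on the Transformers `pipeline` API. It assumes the checkpoint is published under the ID `Hansaht/Text_classification_model_1_pytorch` and that it keeps the default label names, with index 0 meaning negative and 1 meaning positive as in the IMDb dataset; the helper names `to_sentiment` and `classify` are hypothetical.

```python
MODEL_ID = "Hansaht/Text_classification_model_1_pytorch"

# Assumption: the checkpoint keeps the default label names and follows the
# IMDb convention (0 = negative, 1 = positive).
LABEL_TO_SENTIMENT = {"LABEL_0": "negative", "LABEL_1": "positive"}


def to_sentiment(prediction):
    """Convert one pipeline result, e.g. {"label": "LABEL_1", "score": 0.98}."""
    label = prediction["label"]
    # Fall back to the raw label if the checkpoint defines its own names.
    return LABEL_TO_SENTIMENT.get(label, label.lower()), prediction["score"]


def classify(texts):
    """Download the model from the Hub and classify a list of review texts."""
    from transformers import pipeline  # imported lazily; needs `transformers`

    classifier = pipeline("text-classification", model=MODEL_ID)
    # truncation=True keeps long reviews within the model's 512-token limit.
    return [to_sentiment(p) for p in classifier(texts, truncation=True)]
```

Calling `classify(["A moving, beautifully shot film."])` would return a list of `(sentiment, score)` pairs. The first call downloads the model weights, so it needs network access.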

Training and evaluation data

Dataset:

The IMDb dataset, a widely used benchmark for sentiment analysis, consists of movie reviews labeled as positive or negative. It contains 50,000 labeled reviews, split evenly into 25,000 for training and 25,000 for testing, and covers a diverse range of language patterns, tones, and opinions. By training our model on this dataset, we aim to enable it to learn the nuances of how positive and negative sentiment are expressed.

Training procedure

Fine-Tuning Process:

Fine-tuning the DistilBERT Uncased model for sentiment analysis involves adapting the pre-trained model to our specific task. This process entails:

Data Preprocessing: The IMDb dataset is preprocessed, tokenized, and encoded into input features that DistilBERT Uncased can consume. These features are the token IDs of the tokenized text and an attention mask, which differentiates between the actual text and padding tokens. Unlike BERT, DistilBERT does not use segment IDs.
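To make the role of padding concrete, here is a small pure-Python sketch of how a batch of token-ID sequences can be padded to a common length together with a mask that separates real tokens from padding. It is not the tokenizer's actual implementation; `pad_batch` is a hypothetical helper, and the token IDs other than `[CLS]` (101), `[SEP]` (102), and `[PAD]` (0) are placeholders.

```python
PAD_ID = 0  # id of [PAD] in the BERT/DistilBERT uncased vocabulary


def pad_batch(sequences, pad_id=PAD_ID):
    """Pad token-id lists to a common length and build the attention mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Toy ids: 101 and 102 are [CLS] and [SEP]; 2000 and 3000 are placeholders.
batch = pad_batch([[101, 2000, 3000, 102], [101, 102]])
print(batch["input_ids"])       # [[101, 2000, 3000, 102], [101, 102, 0, 0]]
print(batch["attention_mask"])  # [[1, 1, 1, 1], [1, 1, 0, 0]]
```

In practice the tokenizer shipped with the model produces both fields, typically with truncation enabled so that long reviews fit within the model's 512-token limit.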

Fine-Tuning Architecture: We attach a classification layer on top of DistilBERT's transformer layers. This additional layer learns to map the contextualized embeddings generated by DistilBERT to sentiment labels (positive or negative).
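The mapping from an embedding to sentiment labels can be sketched in a few lines of pure Python. This is a simplified illustration with a toy 4-dimensional embedding and made-up weights, not the model's actual parameters; DistilBERT's hidden size is 768, and the head in the Transformers implementation also includes an intermediate layer with ReLU and dropout that is omitted here.

```python
import math


def linear(weights, bias, x):
    """Compute W @ x + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Turn logits into probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy sentence embedding; in the real model this is a 768-dimensional vector.
embedding = [0.5, -1.0, 0.25, 2.0]

# Hypothetical head parameters: one row of weights per class.
W = [[0.1, 0.2, 0.0, -0.3],   # class 0: negative
     [-0.1, 0.0, 0.4, 0.3]]   # class 1: positive
b = [0.0, 0.0]

logits = linear(W, b, embedding)   # approximately [-0.75, 0.65]
probs = softmax(logits)            # the positive class gets the higher probability
print(logits, probs)
```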

Training: The model is trained on the training subset of the IMDb dataset. During training, the weights of the classification layer (and, in standard fine-tuning, of the DistilBERT layers beneath it) are updated to reduce the discrepancy between the model's predictions and the ground-truth labels. We use cross-entropy loss as the optimization objective.
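For readers unfamiliar with the objective, the cross-entropy loss for a single example is the negative log-probability that the model assigns to the correct class. The sketch below computes it from raw logits in pure Python; the logit values are made up, and labels follow the convention 0 = negative, 1 = positive.

```python
import math


def cross_entropy(logits, target):
    """Negative log-probability of the target class under softmax(logits)."""
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[target]


def batch_loss(batch_logits, targets):
    """Mean cross-entropy over a batch, the quantity minimized in training."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)


print(cross_entropy([0.0, 0.0], 1))   # ln 2, about 0.693: an undecided model
print(cross_entropy([-2.0, 2.0], 1))  # about 0.018: confident and correct
print(cross_entropy([-2.0, 2.0], 0))  # about 4.018: confident and wrong
```

The loss grows sharply when the model is confidently wrong, which is why a few badly misclassified reviews can raise the average loss even while accuracy stays high.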

Validation: The model's performance is evaluated on a separate validation subset of the IMDb dataset. This helps us monitor its learning progress and make adjustments if needed.
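The accuracy metric reported for validation can be computed as the fraction of examples whose highest-scoring class matches the label. The following is a minimal sketch with made-up logits and labels, not the evaluation code used for this model.

```python
def predict(logits):
    """Return the index of the largest logit (0 = negative, 1 = positive)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def accuracy(batch_logits, labels):
    """Fraction of examples whose predicted class equals the true label."""
    correct = sum(predict(l) == y for l, y in zip(batch_logits, labels))
    return correct / len(labels)


# Made-up validation outputs for four reviews.
val_logits = [[1.2, -0.3], [-0.8, 0.9], [0.1, 0.4], [2.0, -1.5]]
val_labels = [0, 1, 0, 0]

print(accuracy(val_logits, val_labels))  # 0.75 (the third review is misclassified)
```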

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
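As a rough guide to reproducing this configuration, the hyperparameters above can be expressed as keyword arguments for `transformers.TrainingArguments`. This is a sketch under assumptions: the batch sizes are treated as per-device values on a single device, the output directory name is hypothetical, and the linear schedule is assumed to start without warmup. The total step count of 7815 is the final step reported in the training results.

```python
# Keyword names follow transformers.TrainingArguments; values come from the
# hyperparameter list in this card.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5,
}


def build_training_args(output_dir="distilbert-imdb"):  # hypothetical directory
    """Create TrainingArguments; imported lazily because it needs `transformers`."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def linear_lr(step, total_steps, base_lr=TRAINING_KWARGS["learning_rate"]):
    """Linear decay from base_lr to 0, assuming no warmup steps."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)


TOTAL_STEPS = 7815  # final step reported in the training results
for step in (0, 3126, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate starts at 2e-05, has decayed to about 1.2e-05 by the end of the second epoch, and reaches 0 at the final step.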

Training results

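| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| 0.2336        | 1.0   | 1563 | 0.2718          | 0.903    |
| 0.162         | 2.0   | 3126 | 0.2392          | 0.9277   |
| 0.0971        | 3.0   | 4689 | 0.3191          | 0.9312   |
| 0.0535        | 4.0   | 6252 | 0.3211          | 0.9334   |
| 0.034         | 5.0   | 7815 | 0.3494          | 0.9329   |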
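The reported results can be cross-checked with a little arithmetic. The sketch below assumes the model was trained on the full 25,000-review IMDb training split with a batch size of 16 on a single device and no gradient accumulation; under that assumption the step counts match exactly. It also picks out the epochs with the lowest validation loss and the highest accuracy.

```python
import math

TRAIN_EXAMPLES = 25_000  # size of the IMDb train split (assumed used in full)
BATCH_SIZE = 16

# (training loss, epoch, step, validation loss, accuracy), copied from the results.
RESULTS = [
    (0.2336, 1, 1563, 0.2718, 0.9030),
    (0.1620, 2, 3126, 0.2392, 0.9277),
    (0.0971, 3, 4689, 0.3191, 0.9312),
    (0.0535, 4, 6252, 0.3211, 0.9334),
    (0.0340, 5, 7815, 0.3494, 0.9329),
]

# The last batch of each epoch is partial, hence the ceiling.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
for _, epoch, step, _, _ in RESULTS:
    assert step == epoch * steps_per_epoch

best_by_loss = min(RESULTS, key=lambda row: row[3])
best_by_accuracy = max(RESULTS, key=lambda row: row[4])

print(steps_per_epoch)      # 1563
print(best_by_loss[1])      # 2
print(best_by_accuracy[1])  # 4
```

Training loss falls every epoch while validation loss is lowest after epoch 2 and rises afterwards, a pattern commonly associated with overfitting. Accuracy changes by less than 0.3 percentage points between epochs 3 and 5, so an earlier checkpoint may be a reasonable choice when well-calibrated confidence scores matter.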

Framework versions

  • Transformers 4.31.0
  • PyTorch 2.0.1+cu118
  • Datasets 2.13.1
  • Tokenizers 0.13.3

Model size

  • Parameters: 67M
  • Tensor type: F32
  • Weights format: Safetensors