AG-News BERT Classification

Model Details

Model Name: AG-News BERT Classification
Model Type: Text Classification
Developer: Mansoor Hamidzadeh
Repository: mansoorhamidzadeh/ag-news-bert-classification
Language(s): English
License: MIT

Model Description

Overview

The AG-News BERT Classification model is a BERT (Bidirectional Encoder Representations from Transformers) model fine-tuned on the AG-News dataset to classify news articles into four categories: World, Sports, Business, and Sci/Tech.

Intended Use

Primary Use Case

The primary use case for this model is to automatically classify news articles into one of the four predefined categories:

World
Sports
Business
Sci/Tech

This can be useful for news aggregation services, content recommendation systems, and any application that requires automated content categorization.

Applications

News aggregators and curators
Content recommendation engines
Media monitoring tools
Topic filtering ahead of sentiment analysis and trend detection in news

Training Data

Dataset

Name: AG-News Dataset
Source: AG News Corpus
Description: The AG-News dataset is a widely used benchmark dataset for text classification. It contains 120,000 training samples and 7,600 test samples of news articles categorized into four classes: World, Sports, Business, and Sci/Tech.

Data Preprocessing

The text was tokenized with the BERT tokenizer, the tokens were converted to their corresponding IDs, and attention masks were created to distinguish real tokens from padding.
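To make the relationship between token IDs and attention masks concrete, here is a minimal sketch using a toy whitespace tokenizer. `TOY_VOCAB` and `encode` are hypothetical names invented for this illustration; the real BERT tokenizer uses WordPiece with a much larger vocabulary, so the IDs below will not match what the model actually sees.

```python
# Toy illustration only: not the real BERT vocabulary or WordPiece algorithm.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "stocks": 4, "rally": 5, "after": 6, "earnings": 7}


def encode(text, max_length):
    """Return (input_ids, attention_mask), each exactly max_length long."""
    # Reserve two positions for the [CLS] and [SEP] special tokens.
    words = text.lower().split()[: max_length - 2]
    ids = [TOY_VOCAB["[CLS]"]]
    ids += [TOY_VOCAB.get(word, TOY_VOCAB["[UNK]"]) for word in words]
    ids.append(TOY_VOCAB["[SEP]"])
    # 1 marks a real token, 0 marks padding the model should ignore.
    mask = [1] * len(ids)
    padding = max_length - len(ids)
    ids += [TOY_VOCAB["[PAD]"]] * padding
    mask += [0] * padding
    return ids, mask


input_ids, attention_mask = encode("Stocks rally after earnings", max_length=8)
print(input_ids)       # [2, 4, 5, 6, 7, 3, 0, 0]
print(attention_mask)  # [1, 1, 1, 1, 1, 1, 0, 0]
```

In practice the Transformers tokenizer produces both lists in one call, so this step rarely needs to be written by hand.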

Training Procedure

Training Configuration:

Number of Epochs: 4
Batch Size: 8
Learning Rate: 1e-5
Optimizer: AdamW
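As a rough sanity check on this configuration, the sketch below works out how many optimizer steps it implies. It assumes the full 120,000-sample training split was used and that there was no gradient accumulation; the card confirms neither.

```python
import math

# Values taken from this card.
TRAIN_SAMPLES = 120_000
EPOCHS = 4
BATCH_SIZE = 8
LEARNING_RATE = 1e-5

# Assumption: one optimizer step per batch (no gradient accumulation).
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

print(f"steps per epoch: {steps_per_epoch}")  # 15000
print(f"total steps:     {total_steps}")      # 60000
```

In PyTorch, the optimizer itself would typically be created with `torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)`.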

Training and Test Losses:

Epoch 1:
- Average training loss: 0.1330
- Average test loss: 0.1762
Epoch 2:
- Average training loss: 0.0918
- Average test loss: 0.1733
Epoch 3:
- Average training loss: 0.0622
- Average test loss: 0.1922
Epoch 4:
- Average training loss: 0.0416
- Average test loss: 0.2305
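The test loss is lowest at epoch 2 and rises afterwards while the training loss keeps falling, a pattern usually read as overfitting. The sketch below makes that visible from the numbers above. The card does not say which epoch's checkpoint was published, so the "best epoch" here is only an observation about the reported losses.

```python
# (training loss, test loss) per epoch, copied from the figures above.
losses = {
    1: (0.1330, 0.1762),
    2: (0.0918, 0.1733),
    3: (0.0622, 0.1922),
    4: (0.0416, 0.2305),
}

# Epoch with the lowest test loss.
best_epoch = min(losses, key=lambda epoch: losses[epoch][1])

# Gap between test and training loss; a widening gap suggests overfitting.
gaps = {epoch: test - train for epoch, (train, test) in losses.items()}

for epoch, gap in gaps.items():
    print(f"epoch {epoch}: test - train = {gap:.4f}")
print(f"lowest test loss at epoch {best_epoch}")
```

A separate validation split, rather than the test set, would normally be used for this kind of checkpoint selection.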

Hardware:

Training Environment: NVIDIA P100 GPU
Training Time: Approximately 3 hours

Performance

Evaluation Metrics

The model was evaluated using standard text classification metrics:

Accuracy
Precision
Recall
F1 Score
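The card does not state how precision, recall, and F1 were averaged across the four classes. The sketch below assumes macro-averaging (an unweighted mean over classes), which is a common choice but not confirmed by the source; `classification_metrics` is a hypothetical helper and the labels are made-up toy data.

```python
def classification_metrics(y_true, y_pred, num_classes=4):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / num_classes,
        "recall": sum(recalls) / num_classes,
        "f1": sum(f1s) / num_classes,
    }


# Toy labels: 0=World, 1=Sports, 2=Business, 3=Sci/Tech, two samples per class.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

When every class has the same number of samples, as in this toy example, macro recall equals accuracy.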

Results

On the AG-News test set, the model achieved the following performance:

Accuracy: 93.8%
Precision: 93.8%
Recall: 93.8%
F1 Score: 93.8%

Limitations and Biases

Limitations

The model may not generalize well to text types or news sources that differ from the AG-News dataset.
The model is designed primarily for English text and may perform poorly on other languages.

Biases

The training data may contain biases that reflect biases in news reporting.
The distribution of articles in the dataset may introduce category-specific biases.

Ethical Considerations

Ensure the model is used in compliance with user privacy and data security standards.
Be aware of potential biases and take steps to mitigate negative impacts, especially in sensitive applications.

How to Use

Inference

To use the model for inference, load it using the Hugging Face Transformers library:

from transformers import BertTokenizer, BertForSequenceClassification
from transformers import TextClassificationPipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_id = "mansoorhamidzadeh/ag-news-bert-classification"
tokenizer = BertTokenizer.from_pretrained(model_id)
model = BertForSequenceClassification.from_pretrained(model_id)

# The pipeline handles tokenization and scoring in a single call
classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer)

text = "Sample news article text here."
prediction = classifier(text)  # list of {"label": ..., "score": ...} dicts
print(prediction)
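If the checkpoint reports generic label names such as `LABEL_0`, they can be mapped to category names after prediction. The sketch below assumes the default `LABEL_<index>` naming and the category order listed in this card (World, Sports, Business, Sci/Tech); neither is confirmed for this checkpoint, so check `model.config.id2label` before relying on it. `readable_label` and `sample_output` are hypothetical.

```python
# Assumed index order; verify against model.config.id2label.
AG_NEWS_LABELS = ["World", "Sports", "Business", "Sci/Tech"]


def readable_label(raw_label):
    """Map 'LABEL_2' to 'Business'; leave other label strings unchanged."""
    if raw_label.startswith("LABEL_"):
        index = int(raw_label.split("_", 1)[1])
        return AG_NEWS_LABELS[index]
    return raw_label


# Hypothetical pipeline output, not a real prediction from this model.
sample_output = [{"label": "LABEL_1", "score": 0.99}]
readable = [{**item, "label": readable_label(item["label"])}
            for item in sample_output]
print(readable)  # [{'label': 'Sports', 'score': 0.99}]
```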

Citation

@misc{mansoorhamidzadeh,
  author = {Mansoor Hamidzadeh},
  title = {AG-News BERT Classification},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/mansoorhamidzadeh/ag-news-bert-classification}},
}
