license: apache-2.0
datasets:
- AyoubChLin/CNN_News_Articles_2011-2022
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- news classification
widget:
- text: money in the pocket
- text: no one can win this cup in Qatar.
Fine-Tuned BART Model for Text Classification on CNN News Articles
This is a BART (Bidirectional and Auto-Regressive Transformers) model fine-tuned for text classification on CNN news articles. It was fine-tuned on a dataset of CNN news articles labeled by topic, using a batch size of 32 and a learning rate of 6e-5, for one epoch.
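To make the stated hyperparameters concrete, here is a minimal sketch that collects them in one place and works out how many optimizer steps a single epoch implies. The `FineTuneConfig` class and the dataset size of 10,000 are hypothetical, introduced only for illustration. The model card does not state the training set size or whether gradient accumulation was used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the hyperparameters stated in the model card."""
    batch_size: int = 32          # from the model card
    learning_rate: float = 6e-5   # from the model card
    num_epochs: int = 1           # from the model card

    def total_steps(self, num_train_examples: int) -> int:
        # Assumes one optimizer step per batch (no gradient accumulation).
        return self.num_epochs * math.ceil(num_train_examples / self.batch_size)


config = FineTuneConfig()
# 10_000 is a made-up dataset size, used only to show the arithmetic.
print(config.total_steps(10_000))  # 313
```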
How to Use
Install
pip install transformers torch
Example Usage
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("IT-community/BART_cnn_news_text_classification")
model = AutoModelForSequenceClassification.from_pretrained("IT-community/BART_cnn_news_text_classification")
# Tokenize input text
text = "This is an example CNN news article about politics."
inputs = tokenizer(text, padding=True, truncation=True, max_length=512, return_tensors="pt")
# Make prediction (gradients are not needed for inference)
with torch.no_grad():
    outputs = model(inputs["input_ids"], attention_mask=inputs["attention_mask"])
# Index of the highest-scoring class for the single input
predicted_label = torch.argmax(outputs.logits, dim=-1).item()
print(predicted_label)
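The usage example prints a class index rather than a topic name. As a rough sketch, the logits can be turned into a label and a probability with a softmax. The logits and the `example_id2label` mapping below are hypothetical values chosen for illustration. With the real model, the mapping would normally come from `model.config.id2label`, and whether this checkpoint stores meaningful topic names there is not stated in the model card.

```python
import math


def decode_prediction(logits, id2label):
    """Return (label, probability) for one row of raw logits."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical logits and label names, for illustration only.
example_logits = [0.2, 2.5, -1.0]
example_id2label = {0: "business", 1: "politics", 2: "sport"}

label, prob = decode_prediction(example_logits, example_id2label)
print(label, round(prob, 3))
```

With the real model, `outputs.logits[0].tolist()` would give a list of this shape for a single input.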
Evaluation
The model achieved the following performance metrics on the test set:
Accuracy: 0.9591836734693877
F1-score: 0.958301875401112
Recall: 0.9591836734693877
Precision: 0.9579673040369542
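The model card does not say how the multi-class precision, recall, and F1 were averaged. The accuracy and recall above are identical, which is consistent with support-weighted averaging, since weighted recall always equals accuracy. The sketch below shows that computation in plain Python under this assumption. The labels are toy data invented for illustration, not the model's test set.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n  # each class contributes in proportion to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, invented for illustration only.
y_true = ["sport", "sport", "politics", "business", "politics", "sport"]
y_pred = ["sport", "politics", "politics", "business", "politics", "sport"]

metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

On any input, the `accuracy` and `recall` entries agree up to floating-point rounding, which mirrors the pattern in the reported numbers.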
About Us
We are IT Community, a scientific club at Saad Dahleb Blida University, created in 2016 by students. We are interested in all IT fields. This work was done by the IT Community club.
Contributions
Added preprocessing code for CNN news articles
Improved model performance with additional fine-tuning on a larger dataset