bart-news Model

Welcome to the bart-news model repository! This model is a fine-tuned version of the facebook/bart-large-cnn checkpoint, adapted for English news summarization and trained on the English News Summary dataset from Kaggle.

Model Description

The bart-news model builds on BART, an encoder-decoder (sequence-to-sequence) architecture that works well for abstractive summarization. Fine-tuning the roughly 406M-parameter bart-large-cnn checkpoint on a news dataset adapts it to producing concise, relevant summaries of English news articles.
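
As a quick orientation, here is a minimal usage sketch with the transformers summarization pipeline. The repository id below is a placeholder, so substitute this model's actual Hugging Face Hub path; the generation lengths are illustrative, not the card's official settings.

```python
# Minimal usage sketch with the transformers summarization pipeline.
# "your-username/bart-news" is a placeholder for this model's actual Hub id.
from transformers import pipeline

summarizer = pipeline("summarization", model="your-username/bart-news")

article = (
    "The city council approved a new budget on Tuesday, allocating additional "
    "funds to public transport and road maintenance after months of debate."
)

# max_length / min_length are illustrative values.
summary = summarizer(article, max_length=60, min_length=15, do_sample=False)
print(summary[0]["summary_text"])
```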

Training Environment

Training was conducted on Google Cloud Platform's Vertex AI, using an NVIDIA L4 GPU with 24 GB of VRAM. This setup provided enough compute to train on the full dataset efficiently.
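
For readers who want to reproduce a comparable run, the sketch below shows one common way to fine-tune bart-large-cnn for summarization with transformers' Seq2SeqTrainer. The dataset file, column names, and hyperparameters are assumptions chosen for illustration, not the exact recipe behind bart-news.

```python
# Hedged fine-tuning sketch: the CSV path, column names ("text", "headlines"),
# and hyperparameters are illustrative assumptions, not the exact bart-news recipe.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

checkpoint = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Assumed layout: an article column ("text") and a reference-summary column ("headlines").
dataset = load_dataset("csv", data_files="news_summary.csv")["train"]
dataset = dataset.train_test_split(test_size=0.1)

def preprocess(batch):
    model_inputs = tokenizer(batch["text"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["headlines"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="bart-news",
    num_train_epochs=3,             # matches the three epochs reported below
    per_device_train_batch_size=4,  # assumed; sized for a 24 GB L4
    learning_rate=2e-5,             # assumed
    evaluation_strategy="epoch",
    save_strategy="epoch",
    fp16=True,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```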

Performance

The model was evaluated with ROUGE metrics at the end of each epoch. The table below breaks down training and validation losses along with ROUGE-1, ROUGE-2, and ROUGE-L scores for a closer look at the model's summarization quality:

Epoch Performance Metrics

| Epoch | Training Loss | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L |
|-------|---------------|-----------------|---------|---------|---------|
| 1 | No log | 1.372020 | r: 0.493, p: 0.455, f: 0.469 | r: 0.264, p: 0.241, f: 0.249 | r: 0.447, p: 0.414, f: 0.426 |
| 2 | 1.507800 | 1.338476 | r: 0.505, p: 0.460, f: 0.478 | r: 0.276, p: 0.246, f: 0.257 | r: 0.458, p: 0.418, f: 0.434 |
| 3 | 1.371900 | 1.329648 | r: 0.508, p: 0.462, f: 0.480 | r: 0.278, p: 0.247, f: 0.259 | r: 0.462, p: 0.420, f: 0.437 |

ROUGE scores are presented in the format of recall (r), precision (p), and f-measure (f).
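
For reference, the snippet below shows one way to obtain scores in this r/p/f format with the rouge_score package; the example texts are made up, and the package choice is an assumption rather than necessarily the evaluation code used for this model.

```python
# Sketch of computing ROUGE-1/2/L as recall, precision, and F-measure
# with the rouge_score package (pip install rouge-score). Texts are illustrative.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "The council approved extra funding for public transport."
prediction = "City council approves additional public transport funding."

scores = scorer.score(reference, prediction)
for name, s in scores.items():
    # Each Score exposes recall, precision, and fmeasure, matching the table above.
    print(f"{name}: r={s.recall:.3f}, p={s.precision:.3f}, f={s.fmeasure:.3f}")
```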

Try it!

Try this model with the Hugging Face Inference API, or try the Streamlit app built on it: News Summarization APP

GitHub Repository

For more details, scripts, and usage examples, visit our GitHub repository: News Summarization GitHub Repository

Conclusion

The bart-news model offers a practical option for automated English news summarization. Because it was fine-tuned on a targeted news dataset, it produces focused, high-quality summaries, making it a useful tool for news agencies, researchers, and anyone who needs quick, reliable news digests.

We hope you find this model useful for your summarization needs. Feel free to explore the model, test it with new data, and contribute to its ongoing improvement.
