---
license: mit
language:
- en
pipeline_tag: text2text-generation
---
# News2Topic-V2-Flan-T5-base

## Model Details
- Model type: Text-to-Text Generation
- Language(s) (NLP): English
- License: MIT License
- Finetuned from model: FLAN-T5 Base Model (Google AI)
## Uses

The News2Topic-V2-Flan-T5-base model is designed to automatically generate topic names from news articles and news-like text. It can be integrated into news aggregation platforms and content management systems, or used to improve news browsing and search by providing concise topic labels.
## How to Get Started with the Model
```python
from transformers import pipeline

pipe = pipeline("text2text-generation", model="textgain/News2Topic-V2-Flan-T5-base")

news_text = "Your news text here."
print(pipe(news_text))
```
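The pipeline returns a list of dictionaries with a `generated_text` field. Below is a minimal sketch for generating topics for several articles at once; the placeholder article strings and the `max_new_tokens` value are illustrative assumptions, not part of the released card.

```python
from transformers import pipeline

# Load the topic-generation pipeline once and reuse it across articles.
pipe = pipeline("text2text-generation", model="textgain/News2Topic-V2-Flan-T5-base")

articles = [
    "First news article text here.",
    "Second news article text here.",
]

# truncation=True caps over-long inputs at the model's maximum input length;
# max_new_tokens keeps the generated topic names short.
outputs = pipe(articles, truncation=True, max_new_tokens=16)
for out in outputs:
    print(out["generated_text"])
```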
## Training Details

The News2Topic-V2-Flan-T5-base model was trained on a 20K-article sample of the Newsroom dataset (https://lil.nlp.cornell.edu/newsroom/index.html), annotated with topic names generated by a GPT-3.5-turbo model fine-tuned on curated synthetic data.
The model was trained for 10 epochs with a learning rate of 1e-5, a maximum sequence length of 512, and a training batch size of 12.
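For reference, the sketch below shows a comparable fine-tuning setup with these hyperparameters, assuming the annotated Newsroom sample has been converted into (text, topic) pairs. The column names, the tiny stand-in dataset, the label length cap of 32 tokens, and the output directory are illustrative assumptions, not taken from the released training code.

```python
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Base checkpoint the card reports fine-tuning from.
checkpoint = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Stand-in for the 20K annotated Newsroom sample: article text paired with a topic name.
train_data = Dataset.from_dict({
    "text": ["Example news article about an interest-rate decision ..."],
    "topic": ["Central bank interest rates"],
})

def preprocess(batch):
    # Articles are the encoder inputs (capped at 512 tokens); topic names are the labels.
    model_inputs = tokenizer(batch["text"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["topic"], max_length=32, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_dataset = train_data.map(preprocess, batched=True, remove_columns=["text", "topic"])

# Hyperparameters as reported above: 10 epochs, learning rate 1e-5, batch size 12.
args = Seq2SeqTrainingArguments(
    output_dir="news2topic-flan-t5-base",
    num_train_epochs=10,
    learning_rate=1e-5,
    per_device_train_batch_size=12,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```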
## Citation

BibTeX:

```bibtex
@article{Kosar_DePauw_Daelemans_2024,
  title   = {Comparative Evaluation of Topic Detection: Humans vs. LLMs},
  author  = {Kosar, Andriy and De Pauw, Guy and Daelemans, Walter},
  journal = {Computational Linguistics in the Netherlands Journal},
  volume  = {13},
  year    = {2024},
  month   = {Mar.},
  pages   = {91--120},
  url     = {https://www.clinjournal.org/clinj/article/view/173}
}
```