AyoubChLin
/

bertopic_cnn_news

	---
	license: apache-2.0
	datasets:
	- AyoubChLin/CNN_News_Articles_2011-2022
	language:
	- en
	tags:
	- topic modeling
	- BERT
	- CNN news articles
	---
	# BERTopic Model for CNN News Articles

	This is a BERTopic model trained on CNN news articles. It uses the sentence transformer model `all-MiniLM-L6-v2` to embed the documents and UMAP to reduce the dimensionality of the embeddings.

	## Usage

	First, install the required packages:

	```console
	pip install sentence_transformers umap-learn bertopic
	```

	Then, load the model and encode your documents:

	```python
	from sentence_transformers import SentenceTransformer
	from umap import UMAP
	from bertopic import BERTopic

	# Load the sentence transformer model
	sentence_model = SentenceTransformer("all-MiniLM-L6-v2")

	# Set the random state in the UMAP model to prevent stochastic behavior
	umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric="cosine", random_state=42)

	# Load the BERTopic model (replace the placeholder path with your local model file)
	my_model = BERTopic.load("from/path/model.bin")

	# Placeholder input: replace with your own list of article texts
	documents = ["first article text", "second article text"]

	# Encode your documents
	document_embeddings = sentence_model.encode(documents)
	```


	## Predict


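	```python
	# Reuses sentence_model and my_model from the Usage section
	sentence = "my sentence"

	# transform expects a list of documents together with their embeddings
	embeddings = sentence_model.encode([sentence])

	# The first return value holds one topic id per document
	topics, _ = my_model.transform([sentence], embeddings)
	```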
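
The prediction step can be wrapped in a small helper when you want to score many sentences at once. The following is a minimal sketch, not part of this model or of the BERTopic API: `predict_topics` and `tally_topics` are hypothetical helper names, and the only library behavior assumed is that the encoder exposes `encode(texts)`, that the topic model exposes `transform(texts, embeddings)` returning topic ids first, and that BERTopic uses topic id `-1` for outlier documents.

```python
from collections import Counter

# BERTopic assigns -1 to documents it treats as outliers
OUTLIER_TOPIC = -1


def predict_topics(topic_model, encoder, texts):
    """Return one topic id per text (hypothetical helper).

    Accepts a single string or a list of strings, since both the encoder
    and the topic model are called with a list of documents.
    """
    if isinstance(texts, str):
        texts = [texts]
    embeddings = encoder.encode(texts)
    topics, _ = topic_model.transform(texts, embeddings)
    return [int(topic) for topic in topics]


def tally_topics(topic_ids):
    """Count documents per topic and return the outlier count separately."""
    counts = Counter(topic_ids)
    outliers = counts.pop(OUTLIER_TOPIC, 0)
    return dict(counts), outliers
```

With the objects loaded in the Usage section, a call would look like `predict_topics(my_model, sentence_model, ["first sentence", "second sentence"])`, and the resulting ids can be passed to `tally_topics` to see how many documents fell into each topic and how many were left as outliers.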


	For more information on how to use the BERTopic model, see the [BERTopic documentation](https://maartengr.github.io/BERTopic/index.html).