MARTINI_enrich_BERTopic_GeneralMCNews

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

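from bertopic import BERTopic

# Download and load the trained topic model from the Hugging Face Hub
topic_model = BERTopic.load("AIDA-UPM/MARTINI_enrich_BERTopic_GeneralMCNews")

# Return a table with one row per topic (ID, document count, label, keywords)
topic_model.get_topic_info()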
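
Once loaded, the model can also assign topics to unseen text. The sketch below wraps BERTopic's `transform` method in a small helper. The names `load_model` and `assign_topics` are illustrative assumptions, not part of the original card, and the loading step presumes network access to the Hugging Face Hub and an embedding model that loads together with the checkpoint.

```python
from typing import Any, List, Tuple

MODEL_ID = "AIDA-UPM/MARTINI_enrich_BERTopic_GeneralMCNews"


def load_model(model_id: str = MODEL_ID) -> Any:
    """Load the topic model from the Hugging Face Hub (requires bertopic)."""
    # Imported lazily so that assign_topics has no hard dependency on bertopic.
    from bertopic import BERTopic

    return BERTopic.load(model_id)


def assign_topics(topic_model: Any, docs: List[str]) -> List[Tuple[str, int]]:
    """Pair each document with the topic ID returned by topic_model.transform.

    A topic ID of -1 marks a document that the model treats as an outlier.
    """
    topics, _probabilities = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]
```

A typical call would be `assign_topics(load_model(), ["Officials ordered a recount of the ballots."])`. The returned IDs can then be looked up in the topic overview.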

Topic overview

  • Number of topics: 38
  • Number of training documents: 3780
Topic ID | Topic Keywords | Topic Frequency | Label
-1 | biden - fbi - states - vaccine - ukraine | 20 | -1_biden_fbi_states_vaccine
0 | ballots - maricopa - rigged - recount - republican | 1715 | 0_ballots_maricopa_rigged_recount
1 | gaza - netanyahu - jerusalem - airstrikes - terrorists | 198 | 1_gaza_netanyahu_jerusalem_airstrikes
2 | gunman - victims - active - nypd - nashville | 149 | 2_gunman_victims_active_nypd
3 | twitter - dorsey - banned - paypal - starlink | 139 | 3_twitter_dorsey_banned_paypal
4 | pelosi - congressman - mccarthy - republicans - impeachment | 111 | 4_pelosi_congressman_mccarthy_republicans
5 | vaccines - paxlovid - myocarditis - mrna - transfusion | 101 | 5_vaccines_paxlovid_myocarditis_mrna
6 | biden - fbi - whistleblowers - bribery - subpoena | 87 | 6_biden_fbi_whistleblowers_bribery
7 | zelensky - ukrainians - volodymyr - belarus - medvedev | 75 | 7_zelensky_ukrainians_volodymyr_belarus
8 | blackouts - electricity - shortages - california - surging | 71 | 8_blackouts_electricity_shortages_california
9 | arrested - trafficking - rapist - indicted - investigators | 69 | 9_arrested_trafficking_rapist_indicted
10 | bolsonaro - petrobras - paulo - santos - janeiro | 64 | 10_bolsonaro_petrobras_paulo_santos
11 | doj - subpoenaed - declassified - bannon - dismissed | 63 | 11_doj_subpoenaed_declassified_bannon
12 | jpmorgan - billionaire - jeffrey - ghislaine - zuckerman | 57 | 12_jpmorgan_billionaire_jeffrey_ghislaine
13 | misgendered - lgbtq - minors - school - genitals | 55 | 13_misgendered_lgbtq_minors_school
14 | pandemics - nipah - influenza - h5n1 - poliovirus | 49 | 14_pandemics_nipah_influenza_h5n1
15 | kardashian - balenciaga - megyn - skky - milo | 46 | 15_kardashian_balenciaga_megyn_skky
16 | subliminal - satanists - pentagram - reptilian - symbolism | 46 | 16_subliminal_satanists_pentagram_reptilian
17 | imran - islamabad - peshawar - faizabad - overthrown | 46 | 17_imran_islamabad_peshawar_faizabad
18 | fires - explosion - haverstraw - warehouse - massive | 45 | 18_fires_explosion_haverstraw_warehouse
19 | migrants - border - texas - yuma - smuggling | 45 | 19_migrants_border_texas_yuma
20 | ufo - norad - airships - spyballoon - surveillance | 44 | 20_ufo_norad_airships_spyballoon
21 | additives - tyson - chicken - mcdonald - antibiotics | 43 | 21_additives_tyson_chicken_mcdonald
22 | taiwan - pelosi - spratly - squadrons - pingtan | 40 | 22_taiwan_pelosi_spratly_squadrons
23 | china - lockdowns - shenzhen - pcr - robotaxi | 37 | 23_china_lockdowns_shenzhen_pcr
24 | climate - propagandized - experts - robodogs - honeybee | 36 | 24_climate_propagandized_experts_robodogs
25 | bancorporation - fdic - depositors - plummet - crisis | 35 | 25_bancorporation_fdic_depositors_plummet
26 | unvaxxed - novavax - mandates - reinstated - plaintiffs | 34 | 26_unvaxxed_novavax_mandates_reinstated
27 | derailment - hazmat - ohio - dioxin - spills | 32 | 27_derailment_hazmat_ohio_dioxin
28 | biden - psaki - dictator - deepfaked - granddaughter | 30 | 28_biden_psaki_dictator_deepfaked
29 | fauci - remdesivir - usaid - rfk - collusion | 29 | 29_fauci_remdesivir_usaid_rfk
30 | indicted - trump - prosecutorial - manhattan - jurors | 29 | 30_indicted_trump_prosecutorial_manhattan
31 | brics - rubles - rupee - currencies - yuan | 28 | 31_brics_rubles_rupee_currencies
32 | zaporizhzhya - donetsk - mykolaiv - nuclear - kryvyi | 28 | 32_zaporizhzhya_donetsk_mykolaiv_nuclear
33 | aircraft - crashed - landed - turbulence - runway | 22 | 33_aircraft_crashed_landed_turbulence
34 | jfk - assassinating - snowden - julian - mossad | 21 | 34_jfk_assassinating_snowden_julian
35 | zuckerberg - meta - instagram - lawsuit - shareholder | 21 | 35_zuckerberg_meta_instagram_lawsuit
36 | desantis - governor - bush - gitmo - prosecute | 20 | 36_desantis_governor_bush_gitmo
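
Two regularities in the table can be checked with a few lines of code. Each label appears to be the topic ID followed by the first four keywords joined with underscores, and each frequency can be read as a share of the 3780 training documents. The sketch below reproduces both for three rows copied from the table; the helper names `make_label` and `share` are illustrative assumptions.

```python
from typing import List

# A few rows copied from the topic table: (topic ID, keywords, frequency).
ROWS = [
    (-1, ["biden", "fbi", "states", "vaccine", "ukraine"], 20),
    (0, ["ballots", "maricopa", "rigged", "recount", "republican"], 1715),
    (1, ["gaza", "netanyahu", "jerusalem", "airstrikes", "terrorists"], 198),
]
TOTAL_DOCUMENTS = 3780  # "Number of training documents" from the overview


def make_label(topic_id: int, keywords: List[str], n_words: int = 4) -> str:
    """Join the topic ID and the first n_words keywords with underscores."""
    return "_".join([str(topic_id)] + keywords[:n_words])


def share(frequency: int, total: int = TOTAL_DOCUMENTS) -> float:
    """Fraction of the training documents assigned to a topic."""
    return frequency / total


for topic_id, keywords, frequency in ROWS:
    print(f"{make_label(topic_id, keywords):45s} {share(frequency):6.1%}")
```

Topic 0 alone covers 1715 of the 3780 documents, roughly 45% of the corpus, so the remaining topics are small by comparison.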

Training hyperparameters

  • calculate_probabilities: True
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
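
These hyperparameters correspond to keyword arguments of the `BERTopic` constructor. The sketch below collects them in a dictionary and passes them through a hypothetical `build_model` helper. It assumes a BERTopic release recent enough to accept the zero-shot arguments. The card does not list the embedding, UMAP, HDBSCAN, or vectorizer sub-models, so library defaults would apply and the result may differ from the original training setup.

```python
from typing import Any, Dict

# Values copied from the hyperparameter list above.
HYPERPARAMETERS: Dict[str, Any] = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_model(**overrides: Any) -> Any:
    """Create an untrained BERTopic instance with the listed settings.

    Keyword overrides replace individual values, e.g. build_model(verbose=True).
    """
    # Imported lazily so the dictionary can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**{**HYPERPARAMETERS, **overrides})
```

Calling `build_model().fit_transform(docs)` on your own corpus would train a new model with these settings; it would not reproduce the published topics unless the same documents and sub-models were used.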

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.40
  • UMAP: 0.5.7
  • Pandas: 2.2.3
  • Scikit-Learn: 1.5.2
  • Sentence-transformers: 3.3.1
  • Transformers: 4.46.3
  • Numba: 0.60.0
  • Plotly: 5.24.1
  • Python: 3.10.12
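
To compare a local environment against the versions listed above, a short check with the standard-library `importlib.metadata` module is one option. The sketch below is an assumption-laden convenience, not part of the original card: the distribution names (for example `umap-learn` for UMAP) are the usual PyPI names, and the helper names are hypothetical. The Python interpreter version is not a package and is left out.

```python
from importlib import metadata
from typing import Callable, Dict

# Versions from the list above, keyed by the assumed PyPI distribution name.
EXPECTED: Dict[str, str] = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.40",
    "umap-learn": "0.5.7",
    "pandas": "2.2.3",
    "scikit-learn": "1.5.2",
    "sentence-transformers": "3.3.1",
    "transformers": "4.46.3",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def installed_version(name: str) -> str:
    """Return the installed version of a distribution, or 'not installed'."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def find_mismatches(
    expected: Dict[str, str] = EXPECTED,
    lookup: Callable[[str], str] = installed_version,
) -> Dict[str, str]:
    """Map each package whose installed version differs to the version found."""
    mismatches: Dict[str, str] = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = found
    return mismatches
```

An empty result from `find_mismatches()` means the environment matches the listed versions; a mismatch does not necessarily prevent the model from loading.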