Reduce outliers is now more efficient and relabels with the correct vectoriser. Default topic labels are now tidier. Hierarchical topics outputs are now more useful for joining to a df afterwards. Switched low resource reduction algorithm to UMAP as the default is not good. e1c1f68 Sonnyjim committed on Feb 7, 2024
Should now parse custom regex correctly. Will now wipe previously created embeddings if the 'low resource mode' option is switched. 0a543a0 Sean-Case committed on Feb 7, 2024
Allowed for uploading custom regex for cleaning. Fixed calculate all probabilities and reduce outliers. Added text tree for hierarchical modelling. 381f959 Sonnyjim committed on Feb 6, 2024
Upgraded to Gradio 4.16.0. Guide for converting to exe added. 0a177ca Sonnyjim committed on Feb 5, 2024
Added note to say that LLM representation is not currently working on the HF website 3b4333f Sean-Case committed on Feb 2, 2024
LLM model save is failing in Hugging Face - attempting instead to save to base folder c2bf185 Sean-Case committed on Feb 2, 2024
Some text changes. Fixed a couple of TF-IDF embedding issues 87306c7 Sean-Case committed on Feb 2, 2024
Switched embeddings to low resource TF-IDF by default. Some text changes. a7fdf3b Sean-Case committed on Feb 2, 2024
Added clean data options, improved re-representation options and visualisation. General format changes 4effac0 Sonnyjim committed on Feb 2, 2024
Allowed for loading in external topic labels. A few visualisation modifications. b27bab2 Sonnyjim committed on Jan 29, 2024
Model save now checks and makes a folder before writing the model 356791c Sonnyjim committed on Jan 29, 2024
Lots of general fixes. New visualisations, fixed hierarchical vis for zero shot. Added calc all probabilities. b4510a6 Sonnyjim committed on Jan 29, 2024
Changed Phi model to smaller StableLM 2 1.6. Fixed a None type detection error. 1f1a1c7 Sonnyjim committed on Jan 27, 2024
Disabled console logging as it was getting in the way of file load into the app 731ed23 Sonnyjim committed on Jan 26, 2024
Switched embeddings model to BGE Small 1.5 as Jina seemed unable to do zero shot topic modelling properly be094ee Sonnyjim committed on Jan 26, 2024
Added minimum similarity slider for zero shot topic modelling 0fe5421 Sean-Case committed on Jan 26, 2024
Split off LLM representation, visualisation, and reduce outliers from main function. Added hierarchical visualisation and logs 5d87c3c Sonnyjim committed on Jan 26, 2024
More efficient embeddings save and representations load/process. Custom visualisation hover option added, formatting improvements. Version 0.1? ffe5eb2 Sonnyjim committed on Jan 25, 2024
App should now check if embeddings are loaded before topic modelling, and will save only once. 9eeba1e Sonnyjim committed on Jan 25, 2024
Hopefully fixed install and load of LLM model on systems without a HF_HOME environment variable 32cf9fb Sean-Case committed on Jan 25, 2024
Returned TruncatedSVD components to 100 - higher values don't seem to help 43ac0d8 Sean-Case committed on Jan 24, 2024
Greatly increased low resource process dimensions for higher quality. Visualisations disabled by default to increase speed. fac3624 Sean-Case committed on Jan 24, 2024
Greatly improved low resource mode speed (at cost of potential quality) aa3df37 Sean-Case committed on Jan 24, 2024
Added controls for saving topic models and visualisation. Removed custom UMAP layer 81f1b56 Sonnyjim committed on Jan 24, 2024
Should now save embeddings by default. Added random seed to representation e0f53cc Sean-Case committed on Jan 23, 2024
Fixed llm_config, environment variable, and zero shot topic model errors with quick embeddings ff32b4a Sean-Case committed on Jan 23, 2024
Model export changed to safetensors. Improved representational model function. Got zero shot topic modelling working 4cfed8e Sean-Case committed on Jan 23, 2024