Commit History
Changed embedding model to MiniLM-L6 because it is faster. Compressed embeddings are now int8. General improvements to API mode.
ea0dd40
Minor changes to function outputs. Attempted Python downgrade to 3.10 to address xlsx output issues
2806807
General code improvements and refinements.
a95ef9f
Set bm25 in functions explicitly. Some API updates. Can now get connection params on startup.
2393537
Changed all intermediate file outputs to save to output folder
fea085c
Allowed for a custom output folder. Restored Dockerfile to run under a user account on port 7860.
d3ff2e2
Corrected bm25 filename usage
4bb8d6f
Now checks for output folder before saving. Minor code cleaning
2089141
Fixed cleaning for semantic search. Handles text containing backslashes (if cleaned). Updated packages. Added a requirements file for keyword search only.
8466e45
Improved code for cleaning and outputting files. Added Dockerfile
4ee3470
Sean-Case
committed on
Improved xlsx output formatting. Better handles cleaning data and then analysing it in the same session.
352c02a
Added highlight search term functionality to keyword search output
36a404e
Updated to Gradio 4.16.0. Now works correctly with BGE embeddings
2bcd818
Upgraded to Gradio 4.16.0. Added spaCy fuzzy search functionality.
4ce2224
Temporarily removed semantic search while issues with the gated Jina model are resolved. Improved error/progress tracking and messaging. Added placeholder for spaCy fuzzy search.
739b386
Better error checking. No longer loads the embeddings file twice.
63049fe
Fixed data input for semantic search. Allowed docs to be loaded directly for semantic search. 0.2.1
3df8e40