Spaces: seanpedrickcase/document_redaction
document_redaction/tools (revision 9de60e6)
3 contributors. History: 69 commits
Latest commit: Removed some placeholder values (Sean Pedrick-Case, b1b0e04, unverified, about 2 months ago)
| File | Scan | Size | Last commit message | Last updated |
| --- | --- | --- | --- | --- |
| __init__.py | Safe | 0 Bytes | Initial commit | 10 months ago |
| auth.py | Safe | 2.55 kB | Removed some placeholder values | about 2 months ago |
| aws_functions.py | Safe | 7.37 kB | Fixed issue where redactions were sometimes not removing text underneath boxes. You can now redact in different colours from review page | 2 months ago |
| aws_textract.py | Safe | 10.8 kB | Updated packages. Reinstituted multithreading with page load, now with order protected. Smaller spacy model used for speed. Textract calls should now be faster | 2 months ago |
| cli_redact.py | Safe | 4.73 kB | Allowed for overwriting of default output folder in choose_and_run_redactor function. | 3 months ago |
| custom_csvlogger.py | Safe | 6.65 kB | Created custom csvlogger to try to overcome AWS Lambda's incompatibility with multithread locks | 3 months ago |
| custom_image_analyser_engine.py | Safe | 39 kB | Started adding in support for custom deny list. Fixed textract call issue. Removed multithreading for now as it mixes up pages | 2 months ago |
| data_anonymise.py | Safe | 20.9 kB | Added support for AWS Comprehend for PII identification. OCR and detection results now written to main output | 4 months ago |
| file_conversion.py | Safe | 33.2 kB | Adapted text join options to review file to be more resilient to changes in image size. Added possibility of using client secret with AWS login | about 2 months ago |
| file_redaction.py | Safe | 103 kB | Side review bar is mostly there. A couple of bugs fixed. Can now return identified text in initial review files. Still working on retaining found text throughout review process | about 2 months ago |
| helper_functions.py | Safe | 11.8 kB | Refactor redaction functionality and enhance UI components: Added support for custom recognizers and whole page redaction options. Updated file handling to include new dropdowns for entity selection and improved dataframes for entity management. Enhanced the annotator with better state management and UI responsiveness. Cleaned up redundant code and improved overall performance in the redaction process. | about 2 months ago |
| load_spacy_model_custom_recognisers.py | Safe | 6.57 kB | Refactor redaction functionality and enhance UI components: Added support for custom recognizers and whole page redaction options. Updated file handling to include new dropdowns for entity selection and improved dataframes for entity management. Enhanced the annotator with better state management and UI responsiveness. Cleaned up redundant code and improved overall performance in the redaction process. | about 2 months ago |
| presidio_analyzer_custom.py | Safe | 4.94 kB | Added support for AWS Comprehend for PII identification. OCR and detection results now written to main output | 4 months ago |
| redaction_review.py | Safe | 13.9 kB | Adapted text join options to review file to be more resilient to changes in image size. Added possibility of using client secret with AWS login | about 2 months ago |