Consolidated AWS Comprehend redaction calls to reduce the total number of calls 542c252 seanpedrickcase committed on Nov 6, 2024
When on AWS, now loads a default allow_list to exclude common words from redaction. Improved checks on AWS Comprehend calls. 390bef2 seanpedrickcase committed on Nov 6, 2024
Added support for AWS Comprehend for PII identification. OCR and detection results are now written to the main output f0f9378 seanpedrickcase committed on Nov 5, 2024
Added time limits on redaction to avoid timeouts. Improved review interface. Now accepts only one file at a time. Upgraded Gradio version eea5c07 seanpedrickcase committed on Nov 5, 2024
Upgraded packages. Fixed some issues with the review process. Better progress reporting for the user. 5b4b5fb seanpedrickcase committed on Oct 15, 2024
Added 'Review redactions' tab to the app. You can now visually inspect suggested redactions and modify or add them with a point-and-click interface. ebf9010 seanpedrickcase committed on Oct 15, 2024
Redaction tool can now export PDFs with selectable text retained - redacted text is deleted and covered with a black box. Licence change for PyMuPDF use. 339a165 seanpedrickcase committed on Sep 27, 2024
General improvement in quick image matching and merging 84c83c0 seanpedrickcase committed on Sep 26, 2024
Improved allow list, handwriting/signature identification, logging 6ea0852 seanpedrickcase committed on Sep 19, 2024
Added AWS Textract support. Allowed for OCR logs export. e9c4101 seanpedrickcase committed on Sep 18, 2024
Enhanced logging of usage. Small buffer added to redaction rectangles, as they often seem to miss the tops of text. 34addbf seanpedrickcase committed on Sep 16, 2024
Can now select only specific pages in a document to redact. Image-based redaction should work correctly now. bc4bdbd seanpedrickcase committed on Sep 3, 2024
Handles multiple runs with multiple files correctly now. Logging and feedback improvements. bbf818d seanpedrickcase committed on Aug 21, 2024
Decision process now saved as log files. Other log files and feedback added 8c33828 seanpedrickcase committed on Aug 20, 2024
Added logging, anonymising all Excel sheets, simple redaction tags, some Dockerfile optimisation 01c88c0 seanpedrickcase committed on Aug 15, 2024
Added option to authenticate with AWS Cognito on load. Other minor changes. bc22fc4 seanpedrickcase committed on Jul 15, 2024
Can now redact text or csv/xlsx files. Can redact multiple files. Embeds redactions as image-based file by default 7810536 seanpedrickcase committed on Jun 21, 2024
Better redaction output formatting. Custom output folders allowed. Upgraded Gradio version 12224f5 seanpedrickcase committed on Jun 6, 2024
Version 0.1. Adapted code for PyInstaller local executable conversion (Windows) 2a4b347 seanpedrickcase committed on May 22, 2024
Added TLDExtract cache files so that an internet connection is not required dce6100 seanpedrickcase committed on May 20, 2024
Re-arranged image and text analysis to encourage text analysis (faster) 72a4f68 seanpedrickcase committed on May 16, 2024
Separated file preparation and file redaction functions. Hopefully STS endpoint access now works on AWS 0f18146 seanpedrickcase committed on May 15, 2024
Added -y to poppler-utils installation in Dockerfile. Added support for image files in image-based redaction. 37d982e seanpedrickcase committed on Apr 25, 2024