Spaces: seanpedrickcase/document_redaction

3 contributors

Added config options for compressing output pdfs, returning output redacted pdfs at all, and for changing the length of time for showing previous Textract jobs

3bbf593 about 1 month ago

| File | Size | Last commit message | Last updated |
| --- | --- | --- | --- |
| `__init__.py` | 0 Bytes | Initial commit | about 1 year ago |
| `auth.py` | 2.46 kB | Added compatibility with gradio_image_annotation for passing through id and text properties to annotator. Corrected csv location for Textract api calls. Other minor changes | about 2 months ago |
| `aws_functions.py` | 9.47 kB | Improved logging format a little. Now possible to save logs to DynamoDB | about 2 months ago |
| `aws_textract.py` | 27.3 kB | Now local OCR outputs can be saved to file and reloaded to save preparation time. Bug fixing in logs and tabular data redaction. Update to documentation | about 2 months ago |
| `cli_redact.py` | 4.74 kB | More config options. Fixed some bugs with removing elements from review page and Adobe export. Some UI rearrangements | 3 months ago |
| `config.py` | 14.3 kB | Added config options for compressing output pdfs, returning output redacted pdfs at all, and for changing the length of time for showing previous Textract jobs | about 1 month ago |
| `custom_csvlogger.py` | 12.8 kB | Updated logging format for timestamps to be compatible with AWS. Added load_dynamo_logs.py example file. | about 2 months ago |
| `custom_image_analyser_engine.py` | 53.9 kB | Now local OCR outputs can be saved to file and reloaded to save preparation time. Bug fixing in logs and tabular data redaction. Update to documentation | about 2 months ago |
| `data_anonymise.py` | 35.9 kB | Now local OCR outputs can be saved to file and reloaded to save preparation time. Bug fixing in logs and tabular data redaction. Update to documentation | about 2 months ago |
| `file_conversion.py` | 100 kB | Added config options for compressing output pdfs, returning output redacted pdfs at all, and for changing the length of time for showing previous Textract jobs | about 1 month ago |
| `file_redaction.py` | 119 kB | Added config options for compressing output pdfs, returning output redacted pdfs at all, and for changing the length of time for showing previous Textract jobs | about 1 month ago |
| `find_duplicate_pages.py` | 9.87 kB | Corrected a couple of bugs. Now Textract whole document API call outputs will load also the input PDF into the app | about 1 month ago |
| `helper_functions.py` | 26.3 kB | Now local OCR outputs can be saved to file and reloaded to save preparation time. Bug fixing in logs and tabular data redaction. Update to documentation | about 2 months ago |
| `load_spacy_model_custom_recognisers.py` | 13.7 kB | Major update. General code revision. Improved config variables. Dataframe based review frame now includes text, items can be searched and excluded. Costs now estimated. Option for adding cost codes added. Option to extract text only. | 2 months ago |
| `presidio_analyzer_custom.py` | 4.92 kB | More config options. Fixed some bugs with removing elements from review page and Adobe export. Some UI rearrangements | 3 months ago |
| `redaction_review.py` | 81 kB | Added config options for compressing output pdfs, returning output redacted pdfs at all, and for changing the length of time for showing previous Textract jobs | about 1 month ago |
| `textract_batch_call.py` | 27.9 kB | Added config options for compressing output pdfs, returning output redacted pdfs at all, and for changing the length of time for showing previous Textract jobs | about 1 month ago |