---
license: openrail
---

# PDF Paragraphs Extraction

A model for extracting paragraphs from PDFs

This model uses features from the PDF to extract its text and paragraphs. It can be run as a service. Each extracted paragraph includes its page number, its position on the page, its size, and its text.

## Quick Start

Download the service that uses the model:

    git clone https://github.com/huridocs/pdf_paragraphs_extraction.git
    cd pdf_paragraphs_extraction

Start the service:

    ./run start

Get the paragraphs from a PDF:

    curl -X GET -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5051

Stop the service:

    ./run stop

## Performance

Accuracy: 93.9%

Speed: 0.15 seconds per page
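The curl request from Quick Start can also be sent from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names `build_multipart` and `extract_paragraphs` are hypothetical, the URL is taken from the curl command, and the assumption that the service replies with JSON is not confirmed here.

```python
import json
import uuid
import urllib.request
from pathlib import Path

# Assumption: the service listens where the Quick Start curl command points.
SERVICE_URL = "http://localhost:5051"


def build_multipart(field_name, file_name, file_bytes):
    """Encode one file as a multipart/form-data body, like curl's -F option."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{file_name}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    content_type = f"multipart/form-data; boundary={boundary}"
    return head + file_bytes + tail, content_type


def extract_paragraphs(pdf_path, url=SERVICE_URL):
    """Send a PDF to the running service and return the decoded response.

    Assumption: the response body is JSON; its exact schema is not shown here.
    """
    path = Path(pdf_path)
    body, content_type = build_multipart("file", path.name, path.read_bytes())
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type},
        method="GET",  # mirrors `curl -X GET` from Quick Start
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service started via `./run start`, calling `extract_paragraphs("/PATH/TO/PDF/pdf_name.pdf")` should return whatever the service sends back for that file.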