Spaces: cjber/planning-ai

cjber committed on Feb 18

Commit 466cbea (1 parent: a10efb2)

docs: add tree

Former-commit-id: 4bd5556ea204ff3f298386acbab2d43e5538a92e [formerly 1bd861ab14f0b19796ec55b9192678937c0f07f1]
Former-commit-id: a7e1bac1444032132725db899766fa54d2737ca5

Files changed (1)

README.md +28 -5

@@ -34,10 +34,29 @@ graph TD;
 ## Features
-- **Document Processing**: Extracts and processes text from various document formats including PDFs and Excel files.
-- **Summarisation**: Generates concise summaries of each response, highlighting key points and overall sentiment.
-- **Thematic Analysis**: Breaks down responses into thematic categories, providing a percentage breakdown of themes.
-- **Reporting**: Aggregates response summaries to produce an extensive final overview.
 ## Installation
@@ -72,7 +91,9 @@ Alternatively run everything manually:
 - **Environment Variables**: Use a `.env` file to store sensitive information like API keys.
     - `OPENAI_API_KEY` required for summarisation.
 - **Constants**: Adjust `Consts` in `planning_ai/common/utils.py` to modify token limits and other settings.
 ## Workflow
@@ -80,4 +101,6 @@ Alternatively run everything manually:
 2. **Text Splitting**: Documents are split into manageable chunks using `CharacterTextSplitter`.
 3. **Graph Processing**: The `StateGraph` orchestrates the flow of data through various nodes, including mapping and reducing summaries.
 4. **Summarisation**: The `map_chain` and `reduce_chain` are used to generate and refine summaries using LLMs.
-5. **Output**: Final summaries and thematic breakdowns are used to produce a final Quarto report.

 ## Features
+- **Document Processing**: Extracts and processes text from `.json` and `.pdf` files.
+- **Summarisation**: Generates concise summaries of each response, highlighting key points and how they relate to policies.
+- **Thematic Analysis**: Breaks down responses into themes.
+- **Reporting**: Aggregates response summaries to produce an extensive final overview and a summary document.
+## Project Tree
+```bash
+planning_ai/
+├── chains  # llm calls with prompts using langchain
+├── common  # shared utility functions
+├── documents  # processing for final documents
+├── eval  # evaluation functions to compare summaries to manual summaries
+├── graph.py  # main langgraph functions
+├── llms  # openai llm definitions
+├── logging.py  # shared logging functions
+├── main.py  # calls langgraph functions and document processing
+├── nodes  # langgraph nodes that use chains to modify graph state
+├── preprocessing  # functions for processing .json and .pdf files
+├── states.py  # defines the parameters used by graph states
+└── themes.py  # defines main themes and policies
+```
 ## Installation
 - **Environment Variables**: Use a `.env` file to store sensitive information like API keys.
     - `OPENAI_API_KEY` required for summarisation.
+    - `AZURE_API_KEY` and `AZURE_API_ENDPOINT` needed to process `.pdf` files.
 - **Constants**: Adjust `Consts` in `planning_ai/common/utils.py` to modify token limits and other settings.
+- The document output format may be altered using files in `planning_ai/documents`.
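
Reviewer note: the environment variables listed above could be checked before a run with something like the sketch below. The variable names come from the README. The helper `missing_env_vars` and the split into "always required" and "PDF only" groups are assumptions for illustration, not code from `planning_ai`. Loading the `.env` file itself is not shown; a loader such as python-dotenv would typically handle that step.

```python
import os

# Names taken from the README; the grouping is an assumption for illustration.
REQUIRED = ("OPENAI_API_KEY",)
PDF_ONLY = ("AZURE_API_KEY", "AZURE_API_ENDPOINT")


def missing_env_vars(process_pdfs: bool, env=None) -> list[str]:
    """Return the names of expected variables that are unset or empty."""
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    names = REQUIRED + (PDF_ONLY if process_pdfs else ())
    return [name for name in names if not env.get(name)]
```

Calling `missing_env_vars(process_pdfs=True)` at startup and failing early when the list is non-empty would surface a missing key before any API request is made.
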
 ## Workflow
 2. **Text Splitting**: Documents are split into manageable chunks using `CharacterTextSplitter`.
 3. **Graph Processing**: The `StateGraph` orchestrates the flow of data through various nodes, including mapping and reducing summaries.
 4. **Summarisation**: The `map_chain` and `reduce_chain` are used to generate and refine summaries using LLMs.
+5. **Output**: Final summaries and thematic breakdowns are used to produce a final report.
+Citations within the final report correspond with the document IDs attributed to responses in the summaries document.
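
Reviewer note: the split, map, and reduce steps in the Workflow list can be pictured with the minimal sketch below. It is plain Python with no LangChain or LLM calls. The functions `split_text`, `map_summary`, `reduce_summaries`, and `summarise` are hypothetical stand-ins, not the project's `map_chain` or `reduce_chain`, and the fixed-size slicing is a simplification rather than the behaviour of `CharacterTextSplitter`.

```python
def split_text(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def map_summary(chunk: str) -> str:
    """Stand-in for an LLM call: keep the text before the first full stop."""
    return chunk.split(".")[0].strip()


def reduce_summaries(summaries: list[str]) -> str:
    """Stand-in for the reduce step: join the non-empty partial summaries."""
    return "; ".join(s for s in summaries if s)


def summarise(text: str, chunk_size: int = 40) -> str:
    """Map each chunk to a partial summary, then reduce them to one string."""
    chunks = split_text(text, chunk_size)
    return reduce_summaries([map_summary(chunk) for chunk in chunks])
```

In the real pipeline each stand-in would be replaced by a prompt-driven chain, but the shape stays the same: many independent per-chunk calls followed by one aggregating call.
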