docs: add tree
Former-commit-id: 4bd5556ea204ff3f298386acbab2d43e5538a92e [formerly 1bd861ab14f0b19796ec55b9192678937c0f07f1]
Former-commit-id: a7e1bac1444032132725db899766fa54d2737ca5
README.md
CHANGED
@@ -34,10 +34,29 @@ graph TD;
|
|
34 |
|
35 |
## Features
|
36 |
|
37 |
-
- **Document Processing**: Extracts and processes text from
|
38 |
-
- **Summarisation**: Generates concise summaries of each response, highlighting key points and
|
39 |
-
- **Thematic Analysis**: Breaks down responses into
|
40 |
-
- **Reporting**: Aggregates response summaries to produce an extensive final overview.
|
41 |
|
42 |
## Installation
|
43 |
|
@@ -72,7 +91,9 @@ Alternatively run everything manually:
|
|
72 |
|
73 |
- **Environment Variables**: Use a `.env` file to store sensitive information like API keys.
|
74 |
- `OPENAI_API_KEY` required for summarisation.
|
|
|
75 |
- **Constants**: Adjust `Consts` in `planning_ai/common/utils.py` to modify token limits and other settings.
|
|
|
76 |
|
77 |
## Workflow
|
78 |
|
@@ -80,4 +101,6 @@ Alternatively run everything manually:
|
|
80 |
2. **Text Splitting**: Documents are split into manageable chunks using `CharacterTextSplitter`.
|
81 |
3. **Graph Processing**: The `StateGraph` orchestrates the flow of data through various nodes, including mapping and reducing summaries.
|
82 |
4. **Summarisation**: The `map_chain` and `reduce_chain` are used to generate and refine summaries using LLMs.
|
83 |
-
5. **Output**: Final summaries and thematic breakdowns are used to produce a final
|
|
|
|
|
|
34 |
|
35 |
## Features
|
36 |
|
37 |
+
- **Document Processing**: Extracts and processes text from `.json` and `.pdf` files.
|
38 |
+
- **Summarisation**: Generates concise summaries of each response, highlighting key points and how they relate to policies.
|
39 |
+
- **Thematic Analysis**: Breaks down responses into themes.
|
40 |
+
- **Reporting**: Aggregates response summaries to produce an extensive final overview and a summary document.
|
41 |
+
|
42 |
+
## Project Tree
|
43 |
+
|
44 |
+
|
45 |
+
```bash
|
46 |
+
planning_ai/
|
47 |
+
├── chains # llm calls with prompts using langchain
|
48 |
+
├── common # shared utility functions
|
49 |
+
├── documents # processing for final documents
|
50 |
+
├── eval # evaluation functions to compare summaries to manual summaries
|
51 |
+
├── graph.py # main langgraph functions
|
52 |
+
├── llms # openai llm definitions
|
53 |
+
├── logging.py # shared logging functions
|
54 |
+
├── main.py # calls langgraph functions and document processing
|
55 |
+
├── nodes # langgraph nodes that use chains to modify graph state
|
56 |
+
├── preprocessing # functions for processing .json and .pdf files
|
57 |
+
├── states.py # defines the parameters used by graph states
|
58 |
+
└── themes.py # defines main themes and policies
|
59 |
+
```
|
60 |
|
61 |
## Installation
|
62 |
|
|
|
91 |
|
92 |
- **Environment Variables**: Use a `.env` file to store sensitive information like API keys.
|
93 |
- `OPENAI_API_KEY` required for summarisation.
|
94 |
+
- `AZURE_API_KEY` and `AZURE_API_ENDPOINT` required to process `.pdf` files.
|
95 |
- **Constants**: Adjust `Consts` in `planning_ai/common/utils.py` to modify token limits and other settings.
|
96 |
+
- The document output format can be changed by editing the files in `planning_ai/documents`.
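
As a rough illustration of the configuration above, the following sketch checks that the variables named in this README are present before a run. The variable names come from the README; the helper `missing_settings`, the two tuples, and the `process_pdfs` flag are hypothetical and not part of `planning_ai`. It also assumes the values from `.env` have already been loaded into the process environment by whatever loader the project uses.

```python
import os

# Variable names are taken from the README; the grouping is an assumption.
REQUIRED_FOR_SUMMARISATION = ("OPENAI_API_KEY",)
REQUIRED_FOR_PDF = ("AZURE_API_KEY", "AZURE_API_ENDPOINT")


def missing_settings(env, process_pdfs=False):
    """Return the names of required variables that are unset or empty in env.

    Hypothetical helper: `env` is any mapping, for example os.environ.
    """
    required = list(REQUIRED_FOR_SUMMARISATION)
    if process_pdfs:
        # The Azure variables are only needed when .pdf files are processed.
        required.extend(REQUIRED_FOR_PDF)
    return [name for name in required if not env.get(name)]


# Example: report anything missing from the current environment.
print(missing_settings(os.environ, process_pdfs=True))
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching real credentials.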
|
97 |
|
98 |
## Workflow
|
99 |
|
|
|
101 |
2. **Text Splitting**: Documents are split into manageable chunks using `CharacterTextSplitter`.
|
102 |
3. **Graph Processing**: The `StateGraph` orchestrates the flow of data through various nodes, including mapping and reducing summaries.
|
103 |
4. **Summarisation**: The `map_chain` and `reduce_chain` are used to generate and refine summaries using LLMs.
|
104 |
+
5. **Output**: Final summaries and thematic breakdowns are used to produce a final report.
|
105 |
+
|
106 |
+
Citations within the final report correspond to the document IDs assigned to responses in the summaries document.
|
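
The workflow steps above follow a split, map, reduce pattern, which can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: `split_text` is a hypothetical stand-in for `CharacterTextSplitter`, `map_summary` and `reduce_summaries` stand in for `map_chain` and `reduce_chain` (no LLM is called; the first sentence of each chunk serves as its "summary"), and `run_pipeline` plays the role the `StateGraph` has in orchestrating the steps. The bracketed document IDs show one way citations in a final report could point back to responses.

```python
def split_text(text, chunk_size=40, separator=" "):
    """Greedily pack separator-delimited pieces into chunks of at most
    chunk_size characters (a single oversized piece becomes its own chunk)."""
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


def map_summary(doc_id, chunk):
    """Stand-in for the map step: keep the first sentence and the document ID."""
    first_sentence = chunk.split(".")[0].strip()
    return {"doc_id": doc_id, "summary": first_sentence}


def reduce_summaries(summaries):
    """Stand-in for the reduce step: join summaries, citing each document ID."""
    return " ".join(f"{s['summary']} [{s['doc_id']}]." for s in summaries)


def run_pipeline(documents, chunk_size=40):
    """Split every document, summarise each chunk, then combine the results."""
    summaries = []
    for doc_id, text in documents.items():
        for chunk in split_text(text, chunk_size):
            summaries.append(map_summary(doc_id, chunk))
    return reduce_summaries(summaries)


report = run_pipeline({1: "More housing is needed. Traffic is bad."}, chunk_size=100)
print(report)
```

In the real pipeline the map and reduce steps would be LLM calls, so the outputs would be generated summaries rather than extracted sentences; only the data flow is comparable.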