AD2000X committed
Commit e1cced0 · verified · 1 parent: 296c836

Upload 14 files
.streamlit/config.toml ADDED
@@ -0,0 +1,13 @@
[server]
headless = true
enableCORS = false

[browser]
gatherUsageStats = false

[theme]
primaryColor = "#4B6BFF"
backgroundColor = "#FAFAFA"
secondaryBackgroundColor = "#F0F2F6"
textColor = "#262730"
font = "sans serif"
DEPLOYMENT_GUIDE.md ADDED
@@ -0,0 +1,139 @@
# Deployment Guide for Ontology-Enhanced RAG System

This guide walks you through deploying the Ontology-Enhanced RAG demonstration to Hugging Face Spaces.

## Prerequisites

1. **Hugging Face account**
2. **OpenAI API key** (valid, with available quota)

## Deployment Steps

### 1. Prepare Your Repository

Ensure your repository contains the following files and directories:

- `app.py`: Main Streamlit application
- `src/`: Directory containing all source code
- `data/`: Directory containing the ontology JSON and other data
- `.streamlit/`: Directory containing the Streamlit configuration
- `static/`: Directory containing CSS and other static assets
- `requirements.txt`: List of all dependencies
- `huggingface.yml`: Hugging Face Space configuration

### 2. Set Up a Hugging Face Space

1. Visit [Hugging Face](https://huggingface.co/) and log in
2. Click "New" → "Space" in the top right corner
3. Fill in the Space settings:
   - **Owner**: Select your username or organization
   - **Space name**: Choose a name for your demo, e.g., "ontology-rag-demo"
   - **License**: Choose MIT or your preferred license
   - **SDK**: Select Streamlit
   - **Space hardware**: Choose according to your needs (minimum: CPU + 4GB RAM)
4. Click "Create Space"

### 3. Configure Space Secrets

Add your OpenAI API key as a secret:

1. On your Space page, go to the "Settings" tab
2. Scroll down to the "Repository secrets" section
3. Click "New secret"
4. Add the following secret:
   - **Name**: `OPENAI_API_KEY`
   - **Value**: Your OpenAI API key
5. Click "Add secret"
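Once the secret is set, application code can read it at runtime. A minimal sketch (the helper name is illustrative, not part of this repository) that prefers Streamlit secrets and falls back to an environment variable, so the same code works locally and on Spaces:

```python
import os


def get_openai_api_key() -> str:
    """Return the OpenAI API key from Streamlit secrets or the environment."""
    try:
        import streamlit as st
        # On Spaces, repository secrets are exposed through st.secrets.
        if "OPENAI_API_KEY" in st.secrets:
            return st.secrets["OPENAI_API_KEY"]
    except Exception:
        pass  # streamlit not installed, or no secrets configured
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not configured")
    return key
```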

### 4. Upload Your Code

There are two ways to upload your code:

#### Option A: Upload via the Web Interface

1. On your Space page, go to the "Files" tab
2. Use the upload button to upload all necessary files and directories
3. Make sure you preserve the correct directory structure

#### Option B: Upload via Git (Recommended)

1. Clone your Space repository:
   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   ```
2. Copy all your files into the cloned repository
3. Add, commit, and push the changes:
   ```bash
   git add .
   git commit -m "Initial commit"
   git push
   ```

### 5. Verify Deployment

1. Visit your Space URL (in the format `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`)
2. Confirm that the application loads and runs correctly
3. Test all features

## Hardware Recommendations

For good performance, consider the following hardware configurations:

- **Minimal**: CPU + 4GB RAM (suitable for demos with few concurrent users)
- **Recommended**: CPU + 16GB RAM (better performance for knowledge graph visualizations)

## Troubleshooting

If you encounter issues:

1. **Application fails to start**:
   - Check that the Streamlit version is compatible
   - Verify all dependencies are correctly installed
   - Check the Space logs for error messages

2. **OpenAI API errors**:
   - Confirm the API key is correctly set as a secret
   - Verify the API key is valid and has sufficient quota

3. **Display issues**:
   - Try simplifying visualizations, as they can be memory-intensive
   - Check the logs for warnings or errors

4. **NetworkX or visualization issues**:
   - Ensure pygraphviz is properly installed
   - For simpler deployment, modify the code to use layout algorithms that don't depend on Graphviz

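Switching to a Graphviz-free layout can be sketched as follows (the function name is illustrative; NetworkX's `spring_layout` is pure Python, while `graphviz_layout` requires pygraphviz and a system Graphviz install):

```python
import networkx as nx


def safe_layout(graph: nx.Graph) -> dict:
    """Compute node positions, preferring Graphviz but degrading gracefully."""
    try:
        # Requires pygraphviz; raises ImportError when it is missing.
        from networkx.drawing.nx_agraph import graphviz_layout
        return graphviz_layout(graph, prog="dot")
    except ImportError:
        # Force-directed fallback with no system dependencies.
        return nx.spring_layout(graph, seed=42)
```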
## Deployment Optimizations

For production deployments, consider these optimizations:

1. **Resource management**:
   - Choose hardware (CPU + RAM) that matches your application's needs
   - Optimize large visualizations to reduce memory usage

2. **Performance**:
   - Cache results for common queries
   - Pre-compute common graph layouts

3. **Security**:
   - Ensure no sensitive data is stored in the codebase
   - Store all credentials in environment variables or secrets

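The result-caching point can be sketched with `functools.lru_cache`; inside the Streamlit app itself, decorating the retrieval helper with `st.cache_data` would play the same role. The function below is a hypothetical stand-in for the retrieve-then-generate pipeline, not code from this repository:

```python
import functools


@functools.lru_cache(maxsize=128)
def answer_query_cached(query: str, k: int = 3) -> str:
    """Memoize answers for repeated (query, k) pairs.

    A real implementation would run retrieval and call the LLM here;
    identical queries then skip both steps entirely.
    """
    return f"answer for {query!r} using top-{k} context"
```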
## Memory Optimization Tips

If you run into memory issues with large ontologies:

1. Limit the maximum number of nodes in visualizations
2. Paginate large result sets
3. Stream responses for large text outputs
4. Optimize NetworkX operations for large graphs
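Tip 1 can be sketched as a small helper (name illustrative) that keeps only the highest-degree nodes before rendering, so memory and rendering time stay bounded for large ontologies:

```python
import networkx as nx


def truncate_graph(graph: nx.Graph, max_nodes: int = 100) -> nx.Graph:
    """Return a copy containing only the max_nodes highest-degree nodes."""
    if graph.number_of_nodes() <= max_nodes:
        return graph
    top = sorted(graph.degree, key=lambda kv: kv[1], reverse=True)[:max_nodes]
    return graph.subgraph(n for n, _ in top).copy()
```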

## Additional Resources

- [Streamlit Deployment Documentation](https://docs.streamlit.io/streamlit-community-cloud/get-started)
- [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
- [OpenAI API Documentation](https://platform.openai.com/docs/api-reference)
- [NetworkX Documentation](https://networkx.org/documentation/stable/)
- [FAISS Documentation](https://github.com/facebookresearch/faiss/wiki)
README.md CHANGED
@@ -1,13 +1,188 @@
- ---
- title: Ontology RAG Demo
- emoji: 🚀
- colorFrom: purple
- colorTo: gray
- sdk: streamlit
- sdk_version: 1.44.1
- app_file: app.py
- pinned: false
- license: mit
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Enhanced Ontology-RAG System

## Project Overview

This repository contains an advanced Retrieval-Augmented Generation (RAG) system that integrates structured ontologies with language models. The system demonstrates how formal ontological knowledge representation can enhance traditional vector-based retrieval to produce more accurate, contextually rich, and logically consistent answers to user queries.

The project implements an architecture that combines:

- A JSON-based ontology representation with classes, relationships, rules, and instances
- Knowledge graph visualization for exploring entity relationships
- Semantic path finding for multi-hop reasoning between concepts
- Comparative analysis of traditional vector-based RAG and ontology-enhanced RAG

The application is built with **Streamlit** for the frontend, uses **FAISS** for vector search over embeddings, **NetworkX** for graph representation, and integrates with **OpenAI's language models** for generating responses.

## Key Features

1. **RAG Comparison Demo**
   - Side-by-side comparison of traditional and ontology-enhanced RAG
   - Analysis of differences in answers and retrieved context

2. **Knowledge Graph Visualization**
   - Interactive network graph for exploring the ontology structure
   - Multiple layout algorithms (force-directed, hierarchical, radial, circular)
   - Entity relationship exploration with customizable focus

3. **Ontology Structure Analysis**
   - Visualization of class hierarchies and statistics
   - Relationship usage and domain-range distribution analysis
   - Graph statistics including node counts, edge counts, and centrality metrics

4. **Entity Exploration**
   - Detailed entity information cards showing properties and relationships
   - Relationship graphs centered on specific entities
   - Neighborhood exploration for entities

5. **Semantic Path Visualization**
   - Path visualization between entities with step-by-step explanation
   - Visual representation of paths through the knowledge graph
   - Connections to relevant business rules

6. **Reasoning Trace Visualization**
   - Query analysis with entity and relationship detection
   - Sankey diagrams showing information flow through the RAG process
   - Explanation of reasoning steps

## Ontology Structure Example

The `data/enterprise_ontology.json` file contains a rich enterprise ontology that models organizational knowledge. Its key components are:

### Classes (Entity Types)

The ontology defines a hierarchical class structure with inheritance relationships. For example:

- **Entity** (base class)
  - **FinancialEntity** → Budget, Revenue, Expense
  - **Asset** → PhysicalAsset, DigitalAsset, IntellectualProperty
  - **Person** → InternalPerson → Employee → Manager
  - **Process** → BusinessProcess, DevelopmentProcess, SupportProcess
  - **Market** → GeographicMarket, DemographicMarket, BusinessMarket

Each class has a description and a set of defined properties. For instance, the `Employee` class includes properties such as role, hire date, and performance rating.

### Relationships

The ontology defines explicit relationships between entity types, including:

- `ownedBy`: Connects Product to Department
- `managedBy`: Connects Department to Manager
- `worksOn`: Connects Employee to Product
- `purchases`: Connects Customer to Product
- `provides`: Connects Customer to Feedback
- `optimizedBy`: Relates Product to Feedback

Each relationship has metadata such as domain, range, cardinality, and an inverse relationship name.

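The exact JSON schema is not reproduced here, but a single relationship entry could plausibly look like the following. The field names, and the inverse name `owns`, are assumptions inferred from the description above, not the actual file format:

```python
# Hypothetical shape of one relationship definition; the real
# enterprise_ontology.json schema may differ.
owned_by = {
    "name": "ownedBy",
    "domain": "Product",          # class the relationship starts from
    "range": "Department",        # class the relationship points to
    "cardinality": "many-to-one",
    "inverse": "owns",            # assumed name of the inverse relationship
}
```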
### Business Rules

The ontology contains formal business rules that constrain the knowledge model:

- "Every Product must be owned by exactly one Department"
- "Every Department must be managed by exactly one Manager"
- "Critical support tickets must be assigned to Senior employees or managers"
- "Product lifecycle stages must follow a predefined sequence"

### Instances

The ontology includes concrete instances of the defined classes, such as:

- `product1`: An "Enterprise Analytics Suite" owned by the Engineering department
- `manager1`: A director named "Jane Smith" who manages the Engineering department
- `customer1`: "Acme Corp", which has purchased product1 and provided feedback

Each instance has properties and relationships to other instances, forming a connected knowledge graph.

This structured knowledge representation lets the system perform semantic reasoning beyond simple text-based approaches, enabling it to answer complex queries that require understanding of hierarchical relationships, business rules, and multi-step connections between entities.

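The multi-step connections mentioned above can be illustrated with a toy NetworkX graph built from the instances listed earlier (this mirrors, but does not use, the project's own `KnowledgeGraph` class):

```python
import networkx as nx

# Toy graph mirroring the example instances; edge attributes hold
# the relationship types.
g = nx.DiGraph()
g.add_edge("customer1", "product1", type="purchases")
g.add_edge("product1", "Engineering", type="ownedBy")
g.add_edge("Engineering", "manager1", type="managedBy")

# Multi-hop reasoning: how is customer1 connected to manager1?
path = nx.shortest_path(g, "customer1", "manager1")
hops = [g.edges[a, b]["type"] for a, b in zip(path, path[1:])]
print(" -> ".join(path))  # customer1 -> product1 -> Engineering -> manager1
print(hops)               # ['purchases', 'ownedBy', 'managedBy']
```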
## Getting Started

### Prerequisites

- Python 3.8+
- OpenAI API key

### Installation

1. Clone this repository
2. Install the required dependencies:
   ```
   pip install -r requirements.txt
   ```
3. Set up your OpenAI API key as an environment variable or in the Streamlit secrets

### Running the Application

To run the application locally:

```
streamlit run app.py
```

For deployment instructions, see [DEPLOYMENT_GUIDE.md](DEPLOYMENT_GUIDE.md).

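For local runs, `st.secrets["OPENAI_API_KEY"]` (as used in `app.py`) reads from a local secrets file; a minimal sketch:

```toml
# .streamlit/secrets.toml — keep this file out of version control
OPENAI_API_KEY = "sk-..."
```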
## Project Structure

```
ontology-rag/
├── .streamlit/
│   └── config.toml              # Streamlit configuration
├── data/
│   ├── enterprise_ontology.json # Enterprise ontology data
│   └── enterprise_ontology.txt  # Simplified text representation of the ontology
├── src/
│   ├── __init__.py
│   ├── knowledge_graph.py       # Knowledge graph processing
│   ├── ontology_manager.py      # Ontology management
│   ├── semantic_retriever.py    # Semantic retrieval
│   └── visualization.py         # Visualization functions
├── static/
│   └── css/
│       └── styles.css           # Custom styles
├── app.py                       # Main application
├── requirements.txt             # Dependency list
├── DEPLOYMENT_GUIDE.md          # Deployment instructions
└── README.md                    # This file
```

## Use Cases

### Enterprise Knowledge Management
The ontology-enhanced RAG system can help organizations organize and access their knowledge assets, connecting information across departments and systems to provide more comprehensive business insights.

### Product Development Decision Support
By understanding the relationships between customer feedback, product features, and market data, the system can better support product development decisions.

### Complex Compliance Queries
In compliance scenarios where multiple rules and relationships must be considered, ontology-enhanced RAG can apply rule-based reasoning to ensure recommendations comply with all applicable policies and regulations.

### Diagnostics and Troubleshooting
In technical support scenarios, the system can connect symptoms, causes, and solutions through multi-hop reasoning to provide more accurate diagnoses.

## Acknowledgments

This project demonstrates the integration of ontological knowledge with RAG systems for enhanced query answering. It builds on research in knowledge graphs, semantic web technologies, and large language models.

## License

This project is licensed under the MIT License. See the license file for details.
app.py ADDED
@@ -0,0 +1,683 @@
import streamlit as st

st.set_page_config(page_title="Ontology RAG Demo", layout="wide")

import os
import json

import networkx as nx
from openai import OpenAI

from src.semantic_retriever import SemanticRetriever
from src.ontology_manager import OntologyManager
from src.knowledge_graph import KnowledgeGraph
from src.visualization import (
    display_ontology_stats,
    display_entity_details,
    display_graph_visualization,
    visualize_path,
    display_reasoning_trace,
    render_html_in_streamlit,
)

# Setup
llm = OpenAI(api_key=st.secrets["OPENAI_API_KEY"])
ontology_manager = OntologyManager("data/enterprise_ontology.json")
semantic_retriever = SemanticRetriever(ontology_manager=ontology_manager)
knowledge_graph = KnowledgeGraph(ontology_manager=ontology_manager)
k_val = st.sidebar.slider("Top K Results", 1, 10, 3)

def main():
    # Page navigation. The option labels must match the branches below
    # exactly, otherwise the selected page is silently ignored.
    st.sidebar.title("Page Navigation")
    page = st.sidebar.selectbox(
        "Select function",
        [
            "RAG Comparison Demo",
            "Knowledge Graph Visualization",
            "Ontology Structure Analysis",
            "Entity Exploration",
            "Semantic Path Visualization",
            "Inference Tracking",
            "Detailed Comparative Analysis",
        ],
    )

    if page == "RAG Comparison Demo":
        run_rag_demo()
    elif page == "Knowledge Graph Visualization":
        run_knowledge_graph_visualization()
    elif page == "Ontology Structure Analysis":
        run_ontology_structure_analysis()
    elif page == "Entity Exploration":
        run_entity_exploration()
    elif page == "Semantic Path Visualization":
        run_semantic_path_visualization()
    elif page == "Inference Tracking":
        run_reasoning_trace()
    elif page == "Detailed Comparative Analysis":
        run_detailed_comparison()

def run_rag_demo():
    st.title("Ontology-Enhanced RAG Demonstration")

    query = st.text_input(
        "Enter a question to compare RAG methods:",
        "How does customer feedback influence product development?"
    )

    if query:
        col1, col2 = st.columns(2)

        with st.spinner("Running both RAG methods..."):
            # Traditional RAG
            with col1:
                st.subheader("Traditional RAG")
                vector_docs = semantic_retriever.vector_store.similarity_search(query, k=k_val)
                vector_context = "\n\n".join([doc.page_content for doc in vector_docs])
                vector_messages = [
                    {"role": "system", "content": f"You are an enterprise knowledge assistant...\nContext:\n{vector_context}"},
                    {"role": "user", "content": query}
                ]
                vector_response = llm.chat.completions.create(
                    model="gpt-3.5-turbo",
                    messages=vector_messages
                )
                vector_answer = vector_response.choices[0].message.content

                st.markdown("#### Answer")
                st.write(vector_answer)

                st.markdown("#### Retrieved Context")
                for i, doc in enumerate(vector_docs):
                    with st.expander(f"Source {i+1}"):
                        st.code(doc.page_content)

            # Ontology-enhanced RAG
            with col2:
                st.subheader("Ontology RAG")
                result = semantic_retriever.retrieve_with_paths(query, k=k_val)
                retrieved_docs = result["documents"]
                enhanced_context = "\n\n".join([doc.page_content for doc in retrieved_docs])
                enhanced_messages = [
                    {"role": "system", "content": f"You are an enterprise knowledge assistant with ontology access...\nContext:\n{enhanced_context}"},
                    {"role": "user", "content": query}
                ]
                enhanced_response = llm.chat.completions.create(
                    model="gpt-3.5-turbo",
                    messages=enhanced_messages
                )
                enhanced_answer = enhanced_response.choices[0].message.content

                st.markdown("#### Answer")
                st.write(enhanced_answer)

                st.markdown("#### Retrieved Context")
                for i, doc in enumerate(retrieved_docs):
                    source = doc.metadata.get("source", "unknown")
                    label = {
                        "ontology": "Ontology context",
                        "text": "Text context",
                        "ontology_context": "Semantic context",
                        "semantic_path": "Relationship path"
                    }.get(source, "Source")
                    with st.expander(f"{label} {i+1}"):
                        st.markdown(doc.page_content)

                # Store for reasoning trace visualization
                st.session_state.query = query
                st.session_state.retrieved_docs = retrieved_docs
                st.session_state.answer = enhanced_answer

        # Difference analysis
        st.markdown("---")
        st.subheader("Difference Analysis")

        st.markdown("""
        The comparison above demonstrates several key advantages of ontology-enhanced RAG:

        1. **Structure awareness**: Ontology-enhanced retrieval understands the relationships between entities, not just their textual similarity.

        2. **Multi-hop reasoning**: By using the knowledge graph structure, the enhanced method can connect information across multiple relational hops.

        3. **Context enrichment**: The ontology provides additional context about entity types, attributes, and relationships that is not explicit in the text.

        4. **Reasoning ability**: Structured knowledge allows logical reasoning that vector similarity alone cannot achieve.

        Try more complex queries that require understanding of relationships to see the differences more clearly!
        """)

def run_knowledge_graph_visualization():
    st.title("Knowledge Graph Visualization")

    # Check whether a central entity has been selected
    central_entity = st.session_state.get('central_entity', None)

    # Display the graph visualization
    display_graph_visualization(knowledge_graph, central_entity=central_entity, max_distance=2)

    # Get and display graph statistics
    graph_stats = knowledge_graph.get_graph_statistics()
    if graph_stats:
        st.subheader("Graph Statistics")

        col1, col2, col3, col4 = st.columns(4)
        col1.metric("Total nodes", graph_stats.get("node_count", 0))
        col2.metric("Total edges", graph_stats.get("edge_count", 0))
        col3.metric("Total classes", graph_stats.get("class_count", 0))
        col4.metric("Total instances", graph_stats.get("instance_count", 0))

        # Display the central nodes
        if "central_nodes" in graph_stats and graph_stats["central_nodes"]:
            st.subheader("Central Nodes (by Betweenness Centrality)")
            central_nodes = graph_stats["central_nodes"]["betweenness"]
            nodes_df = []
            for node_info in central_nodes:
                node_id = node_info["node"]
                node_data = knowledge_graph.graph.nodes.get(node_id, {})
                node_type = node_data.get("type", "unknown")
                if node_type == "instance":
                    node_class = node_data.get("class_type", "unknown")
                    properties = node_data.get("properties", {})
                    name = properties.get("name", node_id)
                    nodes_df.append({
                        "ID": node_id,
                        "Name": name,
                        "Type": node_class,
                        "Centrality": node_info["centrality"]
                    })

            st.table(nodes_df)

def run_ontology_structure_analysis():
    st.title("Ontology Structure Analysis")

    # Use the existing ontology statistics display function
    display_ontology_stats(ontology_manager)

    # Additional class hierarchy visualization
    st.subheader("Class Hierarchy")

    # Get class hierarchy data
    class_hierarchy = ontology_manager.get_class_hierarchy()

    # Build a NetworkX graph representing the class hierarchy
    G = nx.DiGraph()

    # Add nodes and edges
    for parent, children in class_hierarchy.items():
        if not G.has_node(parent):
            G.add_node(parent)
        for child in children:
            G.add_node(child)
            G.add_edge(parent, child)

    # Only create the visualization if there is more than one node
    if len(G.nodes) > 1:
        # Generate the HTML visualization using the knowledge graph class
        kg = KnowledgeGraph(ontology_manager)
        html = kg.generate_html_visualization(
            include_classes=True,
            include_instances=False,
            max_distance=5,
            layout_algorithm="hierarchical"
        )

        # Render the HTML
        render_html_in_streamlit(html)

def run_entity_exploration():
    st.title("Entity Exploration")

    # Gather all entities
    entities = []
    for class_name in ontology_manager.get_classes():
        entities.extend(ontology_manager.get_instances_of_class(class_name))

    # Remove duplicates and sort
    entities = sorted(set(entities))

    # Create a drop-down selection box
    selected_entity = st.selectbox("Select entity", entities)

    if selected_entity:
        # Get entity information
        entity_info = ontology_manager.get_entity_info(selected_entity)

        # Display detailed information
        display_entity_details(entity_info, ontology_manager)

        # Set this entity as the central entity (for knowledge graph visualization)
        if st.button("View this entity in the knowledge graph"):
            st.session_state.central_entity = selected_entity
            st.rerun()

        # Get and display the entity's neighborhood
        st.subheader("Entity Neighborhood")
        max_distance = st.slider("Maximum neighborhood distance", 1, 3, 1)

        neighborhood = knowledge_graph.get_entity_neighborhood(
            selected_entity,
            max_distance=max_distance,
            include_classes=True
        )

        if neighborhood and "neighbors" in neighborhood:
            # Display neighbors grouped by distance
            for distance in range(1, max_distance + 1):
                neighbors_at_distance = [n for n in neighborhood["neighbors"] if n["distance"] == distance]

                if neighbors_at_distance:
                    with st.expander(f"Neighbors at distance {distance} ({len(neighbors_at_distance)})"):
                        for neighbor in neighbors_at_distance:
                            st.markdown(f"**{neighbor['id']}** ({neighbor.get('class_type', 'unknown')})")

                            # Display relations
                            for relation in neighbor.get("relations", []):
                                direction = "→" if relation["direction"] == "outgoing" else "←"
                                st.markdown(f"- {direction} {relation['type']}")

                            st.markdown("---")

def run_semantic_path_visualization():
    st.title("Semantic Path Visualization")

    # Gather all entities
    entities = []
    for class_name in ontology_manager.get_classes():
        entities.extend(ontology_manager.get_instances_of_class(class_name))

    # Remove duplicates and sort
    entities = sorted(set(entities))

    # Two columns for selecting the source and target entities
    col1, col2 = st.columns(2)

    with col1:
        source_entity = st.selectbox("Select source entity", entities, key="source")

    with col2:
        target_entity = st.selectbox("Select target entity", entities, key="target")

    if source_entity and target_entity and source_entity != target_entity:
        # Maximum path length option
        max_length = st.slider("Maximum path length", 1, 5, 3)

        # Find the paths
        paths = knowledge_graph.find_paths_between_entities(
            source_entity,
            target_entity,
            max_length=max_length
        )

        if paths:
            st.success(f"Found {len(paths)} paths!")

            # Create an expander for each path
            for i, path in enumerate(paths):
                # Path length and relationship types
                path_length = len(path)
                rel_types = [edge["type"] for edge in path]

                with st.expander(f"Path {i+1} (length: {path_length}, relations: {', '.join(rel_types)})", expanded=(i == 0)):
                    # Build a text description of the path
                    path_text = []
                    entities_in_path = []

                    for edge in path:
                        source = edge["source"]
                        target = edge["target"]
                        relation = edge["type"]

                        entities_in_path.append(source)
                        entities_in_path.append(target)

                        # Look up entity information for human-readable names
                        source_info = ontology_manager.get_entity_info(source)
                        target_info = ontology_manager.get_entity_info(target)

                        source_name = source
                        if "properties" in source_info and "name" in source_info["properties"]:
                            source_name = source_info["properties"]["name"]

                        target_name = target
                        if "properties" in target_info and "name" in target_info["properties"]:
                            target_name = target_info["properties"]["name"]

                        path_text.append(f"{source_name} ({source}) **{relation}** {target_name} ({target})")

                    # Display the path description
                    st.markdown(" → ".join(path_text))

                    # Prepare the path visualization
                    path_info = {
                        "source": source_entity,
                        "target": target_entity,
                        "path": path,
                        "text": " → ".join(path_text)
                    }

                    # Display the path visualization
                    visualize_path(path_info, ontology_manager)
        else:
            st.warning(f"No path of length {max_length} or shorter was found between these entities.")

def run_reasoning_trace():
    st.title("Inference Tracking Visualization")

    if not st.session_state.get("query") or not st.session_state.get("retrieved_docs") or not st.session_state.get("answer"):
        st.warning("Please run a query on the RAG comparison page first to generate inference trace data.")
        return

    # Get data from the session state
    query = st.session_state.query
    retrieved_docs = st.session_state.retrieved_docs
    answer = st.session_state.answer

    # Show the inference trace
    display_reasoning_trace(query, retrieved_docs, answer, ontology_manager)

364
+ def run_detailed_comparison():
365
+ st.title("Detailed comparison of RAG methods")
366
+
367
+ # Add comparison query options
368
+ comparison_queries = [
369
+ "How does customer feedback influence product development?",
370
+ "Which employees work in the Engineering department?",
371
+ "What are the product life cycle stages?",
372
+ "How do managers monitor employee performance?",
373
+ "What are the responsibilities of the marketing department?"
374
+ ]
375
+
376
+ selected_query = st.selectbox(
377
+ "Select Compare Query",
378
+ comparison_queries,
379
+ index=0
380
+ )
381
+
382
+ custom_query = st.text_input("Or enter a custom query:", "")
383
+
384
+ if custom_query:
385
+ query = custom_query
386
+ else:
387
+ query = selected_query
388
+
389
+ if st.button("Compare RAG methods"):
+ with st.spinner("Running detailed comparison..."):
+ # Start timing
+ import time
+ start_time = time.time()
+
+ # Run traditional RAG
+ vector_docs = semantic_retriever.vector_store.similarity_search(query, k=k_val)
+ vector_context = "\n\n".join([doc.page_content for doc in vector_docs])
+ vector_messages = [
+ {"role": "system", "content": f"You are an enterprise knowledge assistant...\nContext:\n{vector_context}"},
+ {"role": "user", "content": query}
+ ]
+ vector_response = llm.chat.completions.create(
+ model="gpt-3.5-turbo",
+ messages=vector_messages
+ )
+ vector_answer = vector_response.choices[0].message.content
+ vector_time = time.time() - start_time
+
+ # Reset the timer
+ start_time = time.time()
+
+ # Run the enhanced RAG
+ result = semantic_retriever.retrieve_with_paths(query, k=k_val)
+ retrieved_docs = result["documents"]
+ enhanced_context = "\n\n".join([doc.page_content for doc in retrieved_docs])
+ enhanced_messages = [
+ {"role": "system", "content": f"You are an enterprise knowledge assistant with access to an ontology...\nContext:\n{enhanced_context}"},
+ {"role": "user", "content": query}
+ ]
+ enhanced_response = llm.chat.completions.create(
+ model="gpt-3.5-turbo",
+ messages=enhanced_messages
+ )
+ enhanced_answer = enhanced_response.choices[0].message.content
+ enhanced_time = time.time() - start_time
+
+ # Save the results for visualization
+ st.session_state.query = query
+ st.session_state.retrieved_docs = retrieved_docs
+ st.session_state.answer = enhanced_answer
+
+ # Display the comparison results
+ st.subheader("Comparison results")
+
+ # Use tabs to show comparisons along different dimensions
+ tab1, tab2, tab3, tab4 = st.tabs(["Answer Comparison", "Performance Indicators", "Retrieval Source Comparison", "Context Quality"])
+
+ with tab1:
+ col1, col2 = st.columns(2)
+
+ with col1:
+ st.markdown("#### Traditional RAG Answer")
+ st.write(vector_answer)
+
+ with col2:
+ st.markdown("#### Ontology Enhanced RAG Answer")
+ st.write(enhanced_answer)
+
+ with tab2:
+ # Performance indicators
+ col1, col2 = st.columns(2)
+
+ with col1:
+ st.metric("Traditional RAG response time", f"{vector_time:.2f} seconds")
+
+ # Calculate text-related metrics
+ vector_tokens = len(vector_context.split())
+ st.metric("Number of retrieved context tokens", vector_tokens)
+
+ st.metric("Number of retrieved documents", len(vector_docs))
+
+ with col2:
+ st.metric("Ontology enhanced RAG response time", f"{enhanced_time:.2f} seconds")
+
+ # Calculate text-related metrics
+ enhanced_tokens = len(enhanced_context.split())
+ st.metric("Number of retrieved context tokens", enhanced_tokens)
+
+ st.metric("Number of retrieved documents", len(retrieved_docs))
+
+ # Add a chart
+ import pandas as pd
+ import plotly.express as px
+
+ # Performance comparison chart
+ performance_data = {
+ "Metrics": ["Response time (seconds)", "Number of context tokens", "Number of retrieved documents"],
+ "Traditional RAG": [vector_time, vector_tokens, len(vector_docs)],
+ "Ontology Enhanced RAG": [enhanced_time, enhanced_tokens, len(retrieved_docs)]
+ }
+
+ df = pd.DataFrame(performance_data)
+
+ # Plotly bar chart
+ fig = px.bar(
+ df,
+ x="Metrics",
+ y=["Traditional RAG", "Ontology Enhanced RAG"],
+ barmode="group",
+ title="Performance Metrics Comparison",
+ labels={"value": "Value", "variable": "RAG method"}
+ )
+
+ st.plotly_chart(fig)
+
+ with tab3:
+ # Retrieval source comparison
+ traditional_sources = ["Traditional vector retrieval"] * len(vector_docs)
+
+ enhanced_sources = []
+ for doc in retrieved_docs:
+ source = doc.metadata.get("source", "unknown")
+ label = {
+ "ontology": "Ontology context",
+ "text": "Text context",
+ "ontology_context": "Semantic context",
+ "semantic_path": "Relationship path"
+ }.get(source, "unknown source")
+ enhanced_sources.append(label)
+
+ # Create a source distribution chart
+ source_counts = {}
+ for source in enhanced_sources:
+ if source in source_counts:
+ source_counts[source] += 1
+ else:
+ source_counts[source] = 1
+
+ source_df = pd.DataFrame({
+ "Source type": list(source_counts.keys()),
+ "Number of documents": list(source_counts.values())
+ })
+
+ fig = px.pie(
+ source_df,
+ values="Number of documents",
+ names="Source type",
+ title="Ontology-enhanced RAG retrieval source distribution"
+ )
+
+ st.plotly_chart(fig)
+
+ # Show the relationship between the source and the answer
+ st.subheader("Relationship between source and answer")
+ st.markdown("""
+ Ontology-enhanced methods leverage multiple sources of knowledge to construct more comprehensive answers. The figure above shows the distribution of different sources.
+
+ In particular, semantic context and relationship paths provide knowledge that cannot be captured by traditional vector retrieval, enabling the system to connect concepts and perform multi-hop reasoning.
+ """)
+
+ with tab4:
+ # Contextual quality assessment
+ st.subheader("Contextual Quality Assessment")
+
+ # Create an evaluation function (simplified version)
+ def evaluate_context(docs):
+ metrics = {
+ "Direct Relevance": 0,
+ "Semantic Richness": 0,
+ "Structure Information": 0,
+ "Relationship Information": 0
+ }
+
+ for doc in docs:
+ content = doc.page_content if hasattr(doc, "page_content") else ""
+
+ # Direct relevance - based on keyword overlap with the query
+ if any(kw in content.lower() for kw in query.lower().split()):
+ metrics["Direct Relevance"] += 1
+
+ # Semantic richness - based on text length
+ metrics["Semantic Richness"] += min(1, len(content.split()) / 50)
+
+ # Structural information - from ontology-derived documents
+ if hasattr(doc, "metadata") and doc.metadata.get("source") in ["ontology", "ontology_context"]:
+ metrics["Structure Information"] += 1
+
+ # Relationship information - from semantic paths
+ if hasattr(doc, "metadata") and doc.metadata.get("source") == "semantic_path":
+ metrics["Relationship Information"] += 1
+
+ # Cap each metric at 10
+ for key in metrics:
+ metrics[key] = min(10, metrics[key])
+
+ return metrics
+
+ # Evaluate the two methods
+ vector_metrics = evaluate_context(vector_docs)
+ enhanced_metrics = evaluate_context(retrieved_docs)
+
+ # Create a comparative radar chart
+ metrics_df = pd.DataFrame({
+ "Metric": list(vector_metrics.keys()),
+ "Traditional RAG": list(vector_metrics.values()),
+ "Ontology Enhanced RAG": list(enhanced_metrics.values())
+ })
+
+ # Convert data to long form for the Plotly radar chart
+ melted_df = metrics_df.melt(id_vars="Metric", var_name="RAG method", value_name="Score")
+ fig = px.line_polar(
+ melted_df,
+ r="Score",
+ theta="Metric",
+ color="RAG method",
+ line_close=True,
+ range_r=[0, 10],
+ title="Contextual Quality Comparison"
+ )
+
+ st.plotly_chart(fig)
+
+ st.markdown("""
+ The figure above compares the two RAG methods on contextual quality. Ontology-enhanced RAG performs better across several dimensions:
+
+ 1. **Direct relevance**: how closely the retrieved content matches the query
+ 2. **Semantic richness**: the information density and richness of the retrieved context
+ 3. **Structural information**: structured knowledge of entity types, attributes, and relationships
+ 4. **Relationship information**: explicit relationships and connection paths between entities
+
+ The advantage of ontology-enhanced RAG is that it can retrieve structured knowledge and relational information, which are missing in traditional RAG methods.
+ """)
+
+ # Display detailed analysis section
+ st.subheader("Method Effectiveness Analysis")
+
+ with st.expander("Comparison of advantages and disadvantages", expanded=True):
+ col1, col2 = st.columns(2)
+
+ with col1:
+ st.markdown("#### Traditional RAG")
+ st.markdown("""
+ **Advantages**:
+ - Simple to implement with low computational overhead
+ - Works well with unstructured text
+ - Response times are usually faster
+
+ **Disadvantages**:
+ - Cannot capture relationships between entities
+ - Lacks structured knowledge context
+ - Difficult to perform multi-hop reasoning
+ - Retrieval relies mainly on text similarity
+ """)
+
+ with col2:
+ st.markdown("#### Ontology Enhanced RAG")
+ st.markdown("""
+ **Advantages**:
+ - Understands relationships and connections between entities
+ - Provides rich structured knowledge context
+ - Supports multi-hop reasoning and path discovery
+ - Combines vector similarity with semantic relationships
+
+ **Disadvantages**:
+ - Higher implementation complexity
+ - Requires maintaining the ontology model
+ - Higher computational overhead
+ - Retrieval and inference times may be longer
+ """)
+
+ # Add usage scenario suggestions
+ with st.expander("Applicable scenarios"):
+ st.markdown("""
+ ### Scenarios suited to traditional RAG
+
+ - Simple fact lookup
+ - Unstructured document retrieval
+ - Applications with strict response-time requirements
+ - When the document content is clear and direct
+
+ ### Scenarios suited to ontology-enhanced RAG
+
+ - Complex knowledge-association queries
+ - Problems that require understanding relationships between entities
+ - Applications that require cross-domain reasoning
+ - Enterprise knowledge management systems
+ - Reasoning scenarios that require high accuracy and consistency
+ - Applications that require implicit knowledge discovery
+ """)
+
+ # Add practical application examples
+ with st.expander("Real-world application cases"):
+ st.markdown("""
+ ### Enterprise knowledge management
+ Ontology-enhanced RAG systems can help enterprises effectively organize and access their knowledge assets, connect information across departments and systems, and provide more comprehensive business insights.
+
+ ### Product development decision support
+ By understanding the relationships between customer feedback, product features, and market data, the system can provide more valuable support for product development decisions.
+
+ ### Complex compliance queries
+ For compliance problems that require considering multiple rules and relationships, ontology-enhanced RAG can provide rule-based reasoning, ensuring that recommendations comply with all applicable policies and regulations.
+
+ ### Diagnostics and troubleshooting
+ In technical support and troubleshooting scenarios, the system can connect symptoms, causes, and solutions through multi-hop reasoning to provide more accurate diagnoses.
+ """)
data/enterprise_ontology.json ADDED
@@ -0,0 +1,771 @@
+ {
+ "rules": [
+ {
+ "id": "rule9",
+ "description": "Critical support tickets must be assigned to Senior employees or managers",
+ "constraint": "FORALL ?t WHERE type(?t, SupportTicket) AND property(?t, priority, 'Critical') AND relationship(?t, assignedTo, ?e) MUST type(?e, Manager) OR (type(?e, Employee) AND property(?e, experienceLevel, 'Senior'))"
+ },
+ {
+ "id": "rule10",
+ "description": "Project end date must be after its start date",
+ "constraint": "FORALL ?p WHERE type(?p, Project) AND property(?p, startDate, ?start) AND property(?p, endDate, ?end) MUST date(?end) > date(?start)"
+ }
+ ],
+ "classes": {
+ "FinancialEntity": {
+ "description": "An entity related to financial matters",
+ "subClassOf": "Entity",
+ "properties": ["amount", "currency", "fiscalYear", "quarter", "transactionDate"]
+ },
+
+ "Budget": {
+ "description": "A financial plan for a specified period",
+ "subClassOf": "FinancialEntity",
+ "properties": ["budgetId", "period", "departmentId", "plannedAmount", "actualAmount", "variance"]
+ },
+
+ "Revenue": {
+ "description": "Income generated from business activities",
+ "subClassOf": "FinancialEntity",
+ "properties": ["revenueId", "source", "productId", "recurring", "oneTime", "revenueType"]
+ },
+
+ "Expense": {
+ "description": "Cost incurred in business operations",
+ "subClassOf": "FinancialEntity",
+ "properties": ["expenseId", "category", "department", "approvedBy", "paymentStatus", "receiptUrl"]
+ },
+
+ "Asset": {
+ "description": "A resource with economic value",
+ "subClassOf": "Entity",
+ "properties": ["assetId", "acquisitionDate", "value", "depreciationSchedule", "currentValue", "location"]
+ },
+
+ "PhysicalAsset": {
+ "description": "A tangible asset with physical presence",
+ "subClassOf": "Asset",
+ "properties": ["serialNumber", "manufacturer", "model", "maintenanceSchedule", "condition"]
+ },
+
+ "DigitalAsset": {
+ "description": "An intangible digital asset",
+ "subClassOf": "Asset",
+ "properties": ["fileType", "storageLocation", "accessControl", "backupStatus", "version"]
+ },
+
+ "IntellectualProperty": {
+ "description": "Legal rights resulting from intellectual activity",
+ "subClassOf": "Asset",
+ "properties": ["ipType", "filingDate", "grantDate", "jurisdiction", "inventors", "expirationDate"]
+ },
+
+ "Location": {
+ "description": "A physical or virtual place",
+ "subClassOf": "Entity",
+ "properties": ["locationId", "address", "city", "state", "country", "postalCode", "geoCoordinates"]
+ },
+
+ "Facility": {
+ "description": "A physical building or site owned or operated by the organization",
+ "subClassOf": "Location",
+ "properties": ["facilityType", "squareFootage", "capacity", "operatingHours", "amenities", "securityLevel"]
+ },
+
+ "VirtualLocation": {
+ "description": "A digital space or environment",
+ "subClassOf": "Location",
+ "properties": ["url", "accessMethod", "hostingProvider", "virtualEnvironmentType", "availabilityStatus"]
+ },
+
+ "Market": {
+ "description": "A geographic or demographic target for products and services",
+ "subClassOf": "Entity",
+ "properties": ["marketId", "name", "geography", "demographics", "size", "growth", "competitiveIntensity"]
+ },
+
+ "GeographicMarket": {
+ "description": "A market defined by geographic boundaries",
+ "subClassOf": "Market",
+ "properties": ["region", "countries", "languages", "regulations", "culturalFactors"]
+ },
+
+ "DemographicMarket": {
+ "description": "A market defined by demographic characteristics",
+ "subClassOf": "Market",
+ "properties": ["ageRange", "income", "education", "occupation", "familyStatus", "interests"]
+ },
+
+ "BusinessMarket": {
+ "description": "A market consisting of business customers",
+ "subClassOf": "Market",
+ "properties": ["industryFocus", "companySize", "businessModel", "decisionMakers", "purchasingCriteria"]
+ },
+
+ "Campaign": {
+ "description": "A coordinated series of marketing activities",
+ "subClassOf": "Entity",
+ "properties": ["campaignId", "name", "objective", "startDate", "endDate", "budget", "targetAudience", "channels"]
+ },
+
+ "DigitalCampaign": {
+ "description": "A marketing campaign conducted through digital channels",
+ "subClassOf": "Campaign",
+ "properties": ["platforms", "contentTypes", "keywords", "tracking", "analytics", "automationWorkflows"]
+ },
+
+ "TraditionalCampaign": {
+ "description": "A marketing campaign conducted through traditional media",
+ "subClassOf": "Campaign",
+ "properties": ["mediaTypes", "adSizes", "placementSchedule", "production", "distributionMethod"]
+ },
+
+ "IntegratedCampaign": {
+ "description": "A campaign that spans multiple marketing channels",
+ "subClassOf": "Campaign",
+ "properties": ["channelMix", "messageConsistency", "crossChannelMetrics", "customerJourneyMap"]
+ },
+
+ "Process": {
+ "description": "A defined set of activities to accomplish a specific objective",
+ "subClassOf": "Entity",
+ "properties": ["processId", "name", "purpose", "owner", "inputs", "outputs", "steps", "metrics"]
+ },
+
+ "BusinessProcess": {
+ "description": "A process for conducting business operations",
+ "subClassOf": "Process",
+ "properties": ["businessFunction", "criticality", "maturityLevel", "automationLevel", "regulatoryRequirements"]
+ },
+
+ "DevelopmentProcess": {
+ "description": "A process for developing products or services",
+ "subClassOf": "Process",
+ "properties": ["methodology", "phases", "deliverables", "qualityGates", "tools", "repositories"]
+ },
+
+ "SupportProcess": {
+ "description": "A process for supporting customers or internal users",
+ "subClassOf": "Process",
+ "properties": ["serviceLevel", "escalationPath", "knowledgeBase", "ticketingSystem", "supportHours"]
+ },
+
+ "Skill": {
+ "description": "A learned capacity to perform a task",
+ "subClassOf": "Entity",
+ "properties": ["skillId", "name", "category", "proficiencyLevels", "certifications", "learningResources"]
+ },
+
+ "TechnicalSkill": {
+ "description": "A skill related to technology or technical processes",
+ "subClassOf": "Skill",
+ "properties": ["techCategory", "tools", "languages", "frameworks", "platforms", "compatibility"]
+ },
+
+ "SoftSkill": {
+ "description": "An interpersonal or non-technical skill",
+ "subClassOf": "Skill",
+ "properties": ["interpersonalArea", "communicationAspects", "leadershipComponents", "adaptabilityMetrics"]
+ },
+
+ "DomainSkill": {
+ "description": "Knowledge and expertise in a specific business domain",
+ "subClassOf": "Skill",
+ "properties": ["domain", "industrySpecific", "regulations", "bestPractices", "domainTerminology"]
+ },
+
+ "Objective": {
+ "description": "A goal or target to be achieved",
+ "subClassOf": "Entity",
+ "properties": ["objectiveId", "name", "description", "targetDate", "status", "priority", "owner", "metrics"]
+ },
+
+ "StrategicObjective": {
+ "description": "A high-level, long-term goal",
+ "subClassOf": "Objective",
+ "properties": ["strategyAlignment", "timeframe", "impactAreas", "successIndicators", "boardApproval"]
+ },
+
+ "TacticalObjective": {
+ "description": "A medium-term goal supporting strategic objectives",
+ "subClassOf": "Objective",
+ "properties": ["parentObjective", "implementationPlan", "resourceRequirements", "dependencies", "milestones"]
+ },
+
+ "OperationalObjective": {
+ "description": "A short-term, specific goal supporting tactical objectives",
+ "subClassOf": "Objective",
+ "properties": ["parentTacticalObjective", "assignedTeam", "dailyActivities", "progressTracking", "completionCriteria"]
+ },
+
+ "KPI": {
+ "description": "Key Performance Indicator for measuring success",
+ "subClassOf": "Entity",
+ "properties": ["kpiId", "name", "description", "category", "unit", "formula", "target", "actual", "frequency"]
+ },
+
+ "FinancialKPI": {
+ "description": "KPI measuring financial performance",
+ "subClassOf": "KPI",
+ "properties": ["financialCategory", "accountingStandard", "auditRequirement", "forecastAccuracy"]
+ },
+
+ "CustomerKPI": {
+ "description": "KPI measuring customer-related performance",
+ "subClassOf": "KPI",
+ "properties": ["customerSegment", "touchpoint", "journeyStage", "sentimentConnection", "loyaltyImpact"]
+ },
+
+ "OperationalKPI": {
+ "description": "KPI measuring operational efficiency",
+ "subClassOf": "KPI",
+ "properties": ["processArea", "qualityDimension", "productivityFactor", "resourceUtilization"]
+ },
+
+ "Risk": {
+ "description": "A potential event that could negatively impact objectives",
+ "subClassOf": "Entity",
+ "properties": ["riskId", "name", "description", "category", "probability", "impact", "status", "mitigationPlan"]
+ },
+
+ "FinancialRisk": {
+ "description": "Risk related to financial matters",
+ "subClassOf": "Risk",
+ "properties": ["financialExposure", "currencyFactors", "marketConditions", "hedgingStrategy", "insuranceCoverage"]
+ },
+
+ "OperationalRisk": {
+ "description": "Risk related to business operations",
+ "subClassOf": "Risk",
+ "properties": ["operationalArea", "processVulnerabilities", "systemDependencies", "staffingFactors", "recoveryPlan"]
+ },
+
+ "ComplianceRisk": {
+ "description": "Risk related to regulatory compliance",
+ "subClassOf": "Risk",
+ "properties": ["regulations", "jurisdictions", "reportingRequirements", "penaltyExposure", "complianceStatus"]
+ },
+
+ "Decision": {
+ "description": "A choice made between alternatives",
+ "subClassOf": "Entity",
+ "properties": ["decisionId", "name", "description", "date", "decisionMaker", "alternatives", "selectedOption", "rationale"]
+ },
+
+ "StrategicDecision": {
+ "description": "A decision affecting long-term direction",
+ "subClassOf": "Decision",
+ "properties": ["strategicImplications", "marketPosition", "competitiveAdvantage", "boardApproval", "communicationPlan"]
+ },
+
+ "TacticalDecision": {
+ "description": "A decision affecting medium-term operations",
+ "subClassOf": "Decision",
+ "properties": ["operationalImpact", "resourceAllocation", "implementationTimeline", "departmentalScope"]
+ },
+
+ "OperationalDecision": {
+ "description": "A day-to-day decision in business operations",
+ "subClassOf": "Decision",
+ "properties": ["decisionFrequency", "standardProcedure", "delegationLevel", "auditTrail"]
+ },
+
+ "Technology": {
+ "description": "A technical capability or system",
+ "subClassOf": "Entity",
+ "properties": ["technologyId", "name", "category", "version", "vendor", "maturityLevel", "supportStatus"]
+ },
+
+ "Hardware": {
+ "description": "Physical technological equipment",
+ "subClassOf": "Technology",
+ "properties": ["specifications", "formFactor", "powerRequirements", "connectivity", "lifecycle", "replacementSchedule"]
+ },
+
+ "Software": {
+ "description": "Computer programs and applications",
+ "subClassOf": "Technology",
+ "properties": ["programmingLanguage", "operatingSystem", "architecture", "apiDocumentation", "licensingModel", "updateFrequency"]
+ },
+
+ "Infrastructure": {
+ "description": "Foundational technology systems",
+ "subClassOf": "Technology",
+ "properties": ["deploymentModel", "scalability", "redundancy", "securityFeatures", "complianceCertifications", "capacityMetrics"]
+ },
+
+ "SecurityEntity": {
+ "description": "An entity related to security measures",
+ "subClassOf": "Entity",
+ "properties": ["securityId", "name", "type", "implementationDate", "lastReview", "responsibleParty", "status"]
+ },
+
+ "SecurityControl": {
+ "description": "A measure to mitigate security risks",
+ "subClassOf": "SecurityEntity",
+ "properties": ["controlCategory", "protectedAssets", "implementationLevel", "automationDegree", "verificationMethod", "exceptions"]
+ },
+
+ "SecurityIncident": {
+ "description": "An event that compromises security",
+ "subClassOf": "SecurityEntity",
+ "properties": ["incidentDate", "severity", "affectedSystems", "vector", "remediationSteps", "rootCause", "resolution"]
+ },
+
+ "SecurityPolicy": {
+ "description": "A documented security directive",
+ "subClassOf": "SecurityEntity",
+ "properties": ["policyScope", "requiredControls", "complianceRequirements", "exemptionProcess", "reviewSchedule", "enforcementMechanism"]
+ },
+
+ "Competency": {
+ "description": "A cluster of related abilities, knowledge, and skills",
+ "subClassOf": "Entity",
+ "properties": ["competencyId", "name", "category", "description", "importance", "requiredProficiency", "assessmentMethod"]
+ },
+
+ "ManagerialCompetency": {
+ "description": "Competency related to managing people and resources",
+ "subClassOf": "Competency",
+ "properties": ["leadershipAspects", "teamDevelopment", "decisionMaking", "conflictResolution", "changeManagement", "resourceOptimization"]
+ },
+
+ "TechnicalCompetency": {
+ "description": "Competency related to technical knowledge and skills",
+ "subClassOf": "Competency",
+ "properties": ["technicalDomain", "specializations", "toolProficiency", "problemSolvingApproach", "technicalLeadership", "knowledgeSharing"]
+ },
+
+ "BusinessCompetency": {
+ "description": "Competency related to business acumen and operations",
+ "subClassOf": "Competency",
+ "properties": ["businessAcumen", "industryKnowledge", "stakeholderManagement", "commercialAwareness", "strategicThinking", "resultsOrientation"]
+ },
+
+ "Stakeholder": {
+ "description": "An individual or group with interest in or influence over the organization",
+ "subClassOf": "Entity",
+ "properties": ["stakeholderId", "name", "type", "influence", "interest", "expectations", "engagementLevel", "communicationPreference"]
+ },
+
+ "InternalStakeholder": {
+ "description": "A stakeholder within the organization",
+ "subClassOf": "Stakeholder",
+ "properties": ["department", "role", "decisionAuthority", "projectInvolvement", "changeReadiness", "organizationalTenure"]
+ },
+
+ "ExternalStakeholder": {
+ "description": "A stakeholder outside the organization",
+ "subClassOf": "Stakeholder",
+ "properties": ["organization", "relationship", "contractualAgreements", "marketInfluence", "externalNetworks", "publicProfile"]
+ },
+
+ "RegulatoryStakeholder": {
+ "description": "A regulatory body or authority",
+ "subClassOf": "Stakeholder",
+ "properties": ["jurisdiction", "regulations", "enforcementPowers", "reportingRequirements", "auditFrequency", "complianceDeadlines"]
+ }
+ },
+ "relationships": [
+ {
+ "name": "ownedBy",
+ "domain": "Product",
+ "range": "Department",
+ "inverse": "owns",
+ "cardinality": "many-to-one",
+ "description": "Indicates which department owns a product"
+ },
+ {
+ "name": "managedBy",
+ "domain": "Department",
+ "range": "Manager",
+ "inverse": "manages",
+ "cardinality": "one-to-one",
+ "description": "Indicates which manager heads a department"
+ },
+ {
+ "name": "worksOn",
+ "domain": "Employee",
+ "range": "Product",
+ "inverse": "developedBy",
+ "cardinality": "many-to-many",
+ "description": "Indicates which products an employee works on"
+ },
+ {
+ "name": "purchases",
+ "domain": "Customer",
+ "range": "Product",
+ "inverse": "purchasedBy",
+ "cardinality": "many-to-many",
+ "description": "Indicates which products a customer has purchased"
+ },
+ {
+ "name": "provides",
+ "domain": "Customer",
+ "range": "Feedback",
+ "inverse": "providedBy",
+ "cardinality": "one-to-many",
+ "description": "Connects customers to their feedback submissions"
+ },
+ {
+ "name": "pertainsTo",
+ "domain": "Feedback",
+ "range": "Product",
+ "inverse": "hasFeedback",
+ "cardinality": "many-to-one",
+ "description": "Indicates which product a feedback item is about"
+ },
+ {
+ "name": "supports",
+ "domain": "Platform",
+ "range": "Product",
+ "inverse": "supportedBy",
+ "cardinality": "one-to-many",
+ "description": "Indicates which products are supported by the platform"
+ },
+ {
+ "name": "hasLifecycle",
+ "domain": "Product",
+ "range": "Lifecycle",
+ "inverse": "lifecycleOf",
+ "cardinality": "one-to-one",
+ "description": "Connects a product to its lifecycle information"
+ },
+ {
+ "name": "oversees",
+ "domain": "Manager",
+ "range": "Employee",
+ "inverse": "reportsToDirect",
+ "cardinality": "one-to-many",
+ "description": "Indicates which employees report to a manager"
+ },
+ {
+ "name": "optimizedBy",
+ "domain": "Product",
+ "range": "Feedback",
+ "inverse": "optimizes",
+ "cardinality": "many-to-many",
+ "description": "Indicates how feedback is used to optimize product development"
+ },
+ {
+ "name": "allocatesTo",
+ "domain": "Budget",
+ "range": "Department",
+ "inverse": "fundedBy",
+ "cardinality": "one-to-many",
+ "description": "Indicates which departments receive budget allocations"
+ },
+ {
+ "name": "generatesRevenue",
+ "domain": "Product",
+ "range": "Revenue",
+ "inverse": "generatedFrom",
+ "cardinality": "one-to-many",
+ "description": "Connects products to the revenue they generate"
+ },
+ {
+ "name": "incursExpense",
+ "domain": "Department",
+ "range": "Expense",
+ "inverse": "incurredBy",
+ "cardinality": "one-to-many",
+ "description": "Connects departments to their expenses"
+ },
+ {
+ "name": "locatedAt",
+ "domain": "PhysicalEntity",
+ "range": "Location",
+ "inverse": "houses",
+ "cardinality": "many-to-one",
+ "description": "Indicates where a physical entity is located"
+ },
+ {
+ "name": "targetedAt",
+ "domain": "Campaign",
+ "range": "Market",
+ "inverse": "targetedBy",
+ "cardinality": "many-to-many",
+ "description": "Indicates which markets a campaign targets"
+ },
+ {
+ "name": "follows",
+ "domain": "Project",
+ "range": "Process",
+ "inverse": "implementedBy",
+ "cardinality": "many-to-one",
+ "description": "Indicates which process a project follows"
+ },
+ {
+ "name": "requires",
+ "domain": "Role",
+ "range": "Skill",
+ "inverse": "requiredFor",
+ "cardinality": "many-to-many",
+ "description": "Indicates which skills are required for a role"
+ },
+ {
+ "name": "possesses",
+ "domain": "Employee",
+ "range": "Skill",
+ "inverse": "possessedBy",
+ "cardinality": "many-to-many",
+ "description": "Indicates which skills an employee possesses"
+ },
+ {
+ "name": "measures",
+ "domain": "KPI",
+ "range": "Objective",
+ "inverse": "measuredBy",
+ "cardinality": "many-to-many",
+ "description": "Indicates which objectives a KPI measures"
+ },
+ {
+ "name": "affects",
+ "domain": "Risk",
+ "range": "Entity",
+ "inverse": "affectedBy",
+ "cardinality": "many-to-many",
+ "description": "Indicates which entities are affected by a risk"
+ },
+ {
+ "name": "mitigates",
+ "domain": "SecurityControl",
+ "range": "Risk",
+ "inverse": "mitigatedBy",
+ "cardinality": "many-to-many",
+ "description": "Indicates which risks are mitigated by a security control"
+ },
+ {
+ "name": "demonstrates",
+ "domain": "Employee",
+ "range": "Competency",
+ "inverse": "demonstratedBy",
+ "cardinality": "many-to-many",
+ "description": "Indicates which competencies an employee demonstrates"
+ },
+ {
+ "name": "influencedBy",
+ "domain": "Decision",
+ "range": "Stakeholder",
+ "inverse": "influences",
+ "cardinality": "many-to-many",
+ "description": "Indicates which stakeholders influence a decision"
+ },
+ {
+ "name": "implementedWith",
+ "domain": "Process",
+ "range": "Technology",
+ "inverse": "supports",
+ "cardinality": "many-to-many",
+ "description": "Indicates which technologies support a process"
+ }
+ ],
+ "instances": [
564
+ {
565
+ "id": "product1",
566
+ "type": "Product",
567
+ "properties": {
568
+ "name": "Enterprise Analytics Suite",
569
+ "version": "2.1",
570
+ "status": "Active"
571
+ },
572
+ "relationships": [
573
+ {"type": "ownedBy", "target": "dept1"},
574
+ {"type": "hasLifecycle", "target": "lifecycle1"},
575
+ {"type": "optimizedBy", "target": "feedback1"}
576
+ ]
577
+ },
578
+ {
579
+ "id": "product2",
580
+ "type": "Product",
581
+ "properties": {
582
+ "name": "Customer Portal",
583
+ "version": "1.5",
584
+ "status": "Active"
585
+ },
586
+ "relationships": [
587
+ {"type": "ownedBy", "target": "dept2"},
588
+ {"type": "hasLifecycle", "target": "lifecycle2"},
589
+ {"type": "optimizedBy", "target": "feedback2"}
590
+ ]
591
+ },
592
+ {
593
+ "id": "dept1",
594
+ "type": "Department",
595
+ "properties": {
596
+ "name": "Engineering",
597
+ "function": "Product Development"
598
+ },
599
+ "relationships": [
600
+ {"type": "managedBy", "target": "manager1"},
601
+ {"type": "owns", "target": "product1"}
602
+ ]
603
+ },
604
+ {
605
+ "id": "dept2",
606
+ "type": "Department",
607
+ "properties": {
608
+ "name": "Marketing",
609
+ "function": "Customer Engagement"
610
+ },
611
+ "relationships": [
612
+ {"type": "managedBy", "target": "manager2"},
613
+ {"type": "owns", "target": "product2"}
614
+ ]
615
+ },
616
+ {
617
+ "id": "manager1",
618
+ "type": "Manager",
619
+ "properties": {
620
+ "name": "Jane Smith",
621
+ "role": "Engineering Director",
622
+ "managementLevel": "Director"
623
+ },
624
+ "relationships": [
625
+ {"type": "oversees", "target": "employee1"},
626
+ {"type": "oversees", "target": "employee2"},
627
+ {"type": "manages", "target": "dept1"}
628
+ ]
629
+ },
630
+ {
631
+ "id": "manager2",
632
+ "type": "Manager",
633
+ "properties": {
634
+ "name": "Michael Chen",
635
+ "role": "Marketing Manager",
636
+ "managementLevel": "Manager"
637
+ },
638
+ "relationships": [
639
+ {"type": "oversees", "target": "employee3"},
640
+ {"type": "manages", "target": "dept2"}
641
+ ]
642
+ },
643
+ {
644
+ "id": "employee1",
645
+ "type": "Employee",
646
+ "properties": {
647
+ "name": "John Doe",
648
+ "role": "Senior Developer"
649
+ },
650
+ "relationships": [
651
+ {"type": "worksOn", "target": "product1"},
652
+ {"type": "reportsToDirect", "target": "manager1"}
653
+ ]
654
+ },
655
+ {
656
+ "id": "employee2",
657
+ "type": "Employee",
658
+ "properties": {
659
+ "name": "Sarah Johnson",
660
+ "role": "QA Engineer"
661
+ },
662
+ "relationships": [
663
+ {"type": "worksOn", "target": "product1"},
664
+ {"type": "reportsToDirect", "target": "manager1"}
665
+ ]
666
+ },
667
+ {
668
+ "id": "employee3",
669
+ "type": "Employee",
670
+ "properties": {
671
+ "name": "David Wilson",
672
+ "role": "Marketing Specialist"
673
+ },
674
+ "relationships": [
675
+ {"type": "worksOn", "target": "product2"},
676
+ {"type": "reportsToDirect", "target": "manager2"}
677
+ ]
678
+ },
679
+ {
680
+ "id": "customer1",
681
+ "type": "Customer",
682
+ "properties": {
683
+ "name": "Acme Corp",
684
+ "customerSince": "2020-05-15"
685
+ },
686
+ "relationships": [
687
+ {"type": "purchases", "target": "product1"},
688
+ {"type": "provides", "target": "feedback1"}
689
+ ]
690
+ },
691
+ {
692
+ "id": "customer2",
693
+ "type": "Customer",
694
+ "properties": {
695
+ "name": "GlobalTech",
696
+ "customerSince": "2021-03-22"
697
+ },
698
+ "relationships": [
699
+ {"type": "purchases", "target": "product2"},
700
+ {"type": "provides", "target": "feedback2"}
701
+ ]
702
+ },
703
+ {
704
+ "id": "feedback1",
705
+ "type": "Feedback",
706
+ "properties": {
707
+ "date": "2023-09-10",
708
+ "sentiment": "Positive",
709
+ "rating": 4.5,
710
+ "content": "The analytics dashboard is very intuitive and provides excellent insights.",
711
+ "suggestions": "Would like to see more export options."
712
+ },
713
+ "relationships": [
714
+ {"type": "providedBy", "target": "customer1"},
715
+ {"type": "pertainsTo", "target": "product1"},
716
+ {"type": "optimizes", "target": "product1"}
717
+ ]
718
+ },
719
+ {
720
+ "id": "feedback2",
721
+ "type": "Feedback",
722
+ "properties": {
723
+ "date": "2023-10-05",
724
+ "sentiment": "Mixed",
725
+ "rating": 3.0,
726
+ "content": "The portal is functional but navigation could be improved.",
727
+ "suggestions": "Add better navigation and mobile support."
728
+ },
729
+ "relationships": [
730
+ {"type": "providedBy", "target": "customer2"},
731
+ {"type": "pertainsTo", "target": "product2"},
732
+ {"type": "optimizes", "target": "product2"}
733
+ ]
734
+ },
735
+ {
736
+ "id": "lifecycle1",
737
+ "type": "Lifecycle",
738
+ "properties": {
739
+ "currentStage": "Maintenance",
740
+ "previousStages": ["Development", "Launch"]
741
+ },
742
+ "relationships": [
743
+ {"type": "lifecycleOf", "target": "product1"}
744
+ ]
745
+ },
746
+ {
747
+ "id": "lifecycle2",
748
+ "type": "Lifecycle",
749
+ "properties": {
750
+ "currentStage": "Growth",
751
+ "previousStages": ["Development", "Launch"]
752
+ },
753
+ "relationships": [
754
+ {"type": "lifecycleOf", "target": "product2"}
755
+ ]
756
+ },
757
+ {
758
+ "id": "platform1",
759
+ "type": "Platform",
760
+ "properties": {
761
+ "name": "Product Management System",
762
+ "version": "3.0",
763
+ "capabilities": ["Tracking", "Versioning", "Ownership Management"]
764
+ },
765
+ "relationships": [
766
+ {"type": "supports", "target": "product1"},
767
+ {"type": "supports", "target": "product2"}
768
+ ]
769
+ }
770
+ ]
771
+ }
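This hunk contains only the data file, not a loader; as a minimal sketch (the names `ontology` and `build_instance_graph` are illustrative, not from the repository), instances shaped like the JSON above can be turned into the kind of NetworkX graph that the `src/knowledge_graph.py` module added later in this commit reads — `type`, `class_type`, and `properties` attributes on nodes, and a `type` attribute on edges:

```python
import networkx as nx

# Illustrative instance data mirroring the schema of enterprise_ontology.json
ontology = {
    "instances": [
        {
            "id": "product1",
            "type": "Product",
            "properties": {"name": "Enterprise Analytics Suite"},
            "relationships": [{"type": "ownedBy", "target": "dept1"}],
        },
        {
            "id": "dept1",
            "type": "Department",
            "properties": {"name": "Engineering"},
            "relationships": [{"type": "owns", "target": "product1"}],
        },
    ]
}

def build_instance_graph(data):
    """Build a MultiDiGraph with the node/edge attributes that
    KnowledgeGraph expects: 'type', 'class_type', 'properties' on
    nodes and 'type' on edges (hypothetical helper)."""
    graph = nx.MultiDiGraph()
    # Add all instance nodes first so relationship targets resolve
    for inst in data["instances"]:
        graph.add_node(
            inst["id"],
            type="instance",
            class_type=inst["type"],
            properties=inst.get("properties", {}),
        )
    # Then add one directed edge per declared relationship
    for inst in data["instances"]:
        for rel in inst.get("relationships", []):
            graph.add_edge(inst["id"], rel["target"], type=rel["type"])
    return graph

graph = build_instance_graph(ontology)
```

Note that inverse relationships (`ownedBy`/`owns`) appear as two separate directed edges, which is consistent with each instance in the JSON declaring its own side of the relationship.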
data/enterprise_ontology.txt ADDED
@@ -0,0 +1,10 @@
+ Product is owned by Department.
+ Department is managed by Manager.
+ Employee works on Product.
+ Customer purchases Product and provides Feedback.
+ Platform supports Product tracking, versioning, and ownership.
+ Each Product has an associated Lifecycle.
+ Product Lifecycle includes stages like development, launch, maintenance, and retirement.
+ Manager oversees Employee performance and departmental goals.
+ Feedback includes sentiment, rating, and suggestions.
+ Platform uses AI agents to optimize Product development based on Feedback trends.
huggingface.yml ADDED
@@ -0,0 +1,8 @@
+ title: Ontology RAG Demo
+ colorFrom: indigo
+ colorTo: blue
+ sdk: streamlit
+ sdk_version: 1.44.0
+ app_file: app.py
+ pinned: true
+ license: mit
requirements.txt ADDED
@@ -0,0 +1,14 @@
+ streamlit>=1.44.0
+ openai>=1.2.0
+ langchain>=0.1.13
+ langchain-community>=0.0.21
+ langchain-openai>=0.0.5
+ faiss-cpu>=1.7.4
+ networkx>=3.1
+ pyvis>=0.3.2
+ plotly>=5.15.0
+ pandas>=2.0.0
+ matplotlib>=3.7.1
+ numpy>=1.24.3
+ pygraphviz>=1.10  # May require system dependencies, optional
+ pydantic>=1.10.8
src/__init__.py ADDED
@@ -0,0 +1 @@
+ # Package initialization
src/knowledge_graph.py ADDED
@@ -0,0 +1,920 @@
+ # src/knowledge_graph.py
+
+ import networkx as nx
+ from pyvis.network import Network
+ import json
+ from typing import Dict, List, Any, Optional, Set, Tuple
+ import matplotlib.pyplot as plt
+ import matplotlib.colors as mcolors
+ from collections import defaultdict
+
+ class KnowledgeGraph:
+     """
+     Handles the construction and visualization of knowledge graphs
+     based on the ontology data.
+     """
+
+     def __init__(self, ontology_manager=None):
+         """
+         Initialize the knowledge graph handler.
+
+         Args:
+             ontology_manager: Optional ontology manager instance
+         """
+         self.ontology_manager = ontology_manager
+         self.graph = None
+
+         if ontology_manager:
+             self.graph = ontology_manager.graph
+
+     def build_visualization_graph(
+         self,
+         include_classes: bool = True,
+         include_instances: bool = True,
+         central_entity: Optional[str] = None,
+         max_distance: int = 2,
+         include_properties: bool = False
+     ) -> nx.Graph:
+         """
+         Build a simplified graph for visualization purposes.
+
+         Args:
+             include_classes: Whether to include class nodes
+             include_instances: Whether to include instance nodes
+             central_entity: Optional central entity to focus the graph on
+             max_distance: Maximum distance from central entity to include
+             include_properties: Whether to include property nodes
+
+         Returns:
+             A NetworkX graph suitable for visualization
+         """
+         if not self.graph:
+             return nx.Graph()
+
+         # Create an undirected graph for visualization
+         viz_graph = nx.Graph()
+
+         # If we have a central entity, extract a subgraph around it
+         if central_entity and central_entity in self.graph:
+             # Get nodes within max_distance of central_entity
+             nodes_to_include = set([central_entity])
+             current_distance = 0
+             current_layer = set([central_entity])
+
+             while current_distance < max_distance:
+                 next_layer = set()
+                 for node in current_layer:
+                     # Get neighbors
+                     neighbors = set(self.graph.successors(node)).union(set(self.graph.predecessors(node)))
+                     next_layer.update(neighbors)
+
+                 nodes_to_include.update(next_layer)
+                 current_layer = next_layer
+                 current_distance += 1
+
+             # Create subgraph
+             subgraph = self.graph.subgraph(nodes_to_include)
+         else:
+             subgraph = self.graph
+
+         # Add nodes to the visualization graph
+         for node, data in subgraph.nodes(data=True):
+             node_type = data.get("type")
+
+             # Skip nodes based on configuration
+             if node_type == "class" and not include_classes:
+                 continue
+             if node_type == "instance" and not include_instances:
+                 continue
+
+             # Get readable name for the node
+             if node_type == "instance" and "properties" in data:
+                 label = data["properties"].get("name", node)
+             else:
+                 label = node
+
+             # Set node attributes for visualization
+             viz_attrs = {
+                 "id": node,
+                 "label": label,
+                 "title": self._get_node_tooltip(node, data),
+                 "group": data.get("class_type", node_type),
+                 "shape": "dot" if node_type == "instance" else "diamond"
+             }
+
+             # Highlight central entity if specified
+             if central_entity and node == central_entity:
+                 viz_attrs["color"] = "#ff7f0e"  # Orange for central entity
+                 viz_attrs["size"] = 25  # Larger size for central entity
+
+             # Add the node
+             viz_graph.add_node(node, **viz_attrs)
+
+             # Add property nodes if configured
+             if include_properties and node_type == "instance" and "properties" in data:
+                 for prop_name, prop_value in data["properties"].items():
+                     # Create a property node
+                     prop_node_id = f"{node}_{prop_name}"
+                     prop_value_str = str(prop_value)
+                     if len(prop_value_str) > 20:
+                         prop_value_str = prop_value_str[:17] + "..."
+
+                     viz_graph.add_node(
+                         prop_node_id,
+                         id=prop_node_id,
+                         label=f"{prop_name}: {prop_value_str}",
+                         title=f"{prop_name}: {prop_value}",
+                         group="property",
+                         shape="ellipse",
+                         size=5
+                     )
+
+                     # Connect instance to property
+                     viz_graph.add_edge(node, prop_node_id, label="has_property", dashes=True)
+
+         # Add edges to the visualization graph
+         for source, target, data in subgraph.edges(data=True):
+             # Only include edges between nodes that are in the viz_graph
+             if source in viz_graph and target in viz_graph:
+                 # Skip property-related edges if we're manually creating them
+                 if include_properties and (
+                     source.startswith(target + "_") or target.startswith(source + "_")
+                 ):
+                     continue
+
+                 # Set edge attributes
+                 edge_type = data.get("type", "unknown")
+
+                 # Don't show subClassOf and instanceOf relationships if not explicitly requested
+                 if edge_type in ["subClassOf", "instanceOf"] and not include_classes:
+                     continue
+
+                 viz_graph.add_edge(source, target, label=edge_type, title=edge_type)
+
+         return viz_graph
+
+     def _get_node_tooltip(self, node_id: str, data: Dict) -> str:
+         """Generate a tooltip for a node."""
+         tooltip = f"<strong>ID:</strong> {node_id}<br>"
+
+         node_type = data.get("type")
+         if node_type:
+             tooltip += f"<strong>Type:</strong> {node_type}<br>"
+
+         if node_type == "instance":
+             tooltip += f"<strong>Class:</strong> {data.get('class_type', 'unknown')}<br>"
+
+             # Add properties
+             if "properties" in data:
+                 tooltip += "<strong>Properties:</strong><br>"
+                 for key, value in data["properties"].items():
+                     tooltip += f"- {key}: {value}<br>"
+
+         elif node_type == "class":
+             tooltip += f"<strong>Description:</strong> {data.get('description', '')}<br>"
+
+             # Add properties if available
+             if "properties" in data:
+                 tooltip += "<strong>Properties:</strong> " + ", ".join(data["properties"]) + "<br>"
+
+         return tooltip
+
+     def generate_html_visualization(
+         self,
+         include_classes: bool = True,
+         include_instances: bool = True,
+         central_entity: Optional[str] = None,
+         max_distance: int = 2,
+         include_properties: bool = False,
+         height: str = "600px",
+         width: str = "100%",
+         bgcolor: str = "#ffffff",
+         font_color: str = "#000000",
+         layout_algorithm: str = "force-directed"
+     ) -> str:
+         """
+         Generate an HTML visualization of the knowledge graph.
+
+         Args:
+             include_classes: Whether to include class nodes
+             include_instances: Whether to include instance nodes
+             central_entity: Optional central entity to focus the graph on
+             max_distance: Maximum distance from central entity to include
+             include_properties: Whether to include property nodes
+             height: Height of the visualization
+             width: Width of the visualization
+             bgcolor: Background color
+             font_color: Font color
+             layout_algorithm: Algorithm for layout ('force-directed', 'hierarchical', 'radial', 'circular')
+
+         Returns:
+             HTML string containing the visualization
+         """
+         # Build the visualization graph
+         viz_graph = self.build_visualization_graph(
+             include_classes=include_classes,
+             include_instances=include_instances,
+             central_entity=central_entity,
+             max_distance=max_distance,
+             include_properties=include_properties
+         )
+
+         # Create a PyVis network
+         net = Network(height=height, width=width, bgcolor=bgcolor, font_color=font_color, directed=True)
+
+         # Configure physics based on the selected layout algorithm
+         if layout_algorithm == "force-directed":
+             physics_options = {
+                 "enabled": True,
+                 "solver": "forceAtlas2Based",
+                 "forceAtlas2Based": {
+                     "gravitationalConstant": -50,
+                     "centralGravity": 0.01,
+                     "springLength": 100,
+                     "springConstant": 0.08
+                 },
+                 "stabilization": {
+                     "enabled": True,
+                     "iterations": 100
+                 }
+             }
+         elif layout_algorithm == "hierarchical":
+             physics_options = {
+                 "enabled": True,
+                 "hierarchicalRepulsion": {
+                     "centralGravity": 0.0,
+                     "springLength": 100,
+                     "springConstant": 0.01,
+                     "nodeDistance": 120
+                 },
+                 "solver": "hierarchicalRepulsion",
+                 "stabilization": {
+                     "enabled": True,
+                     "iterations": 100
+                 }
+             }
+
+             # Set hierarchical layout
+             net.set_options("""
+             var options = {
+               "layout": {
+                 "hierarchical": {
+                   "enabled": true,
+                   "direction": "UD",
+                   "sortMethod": "directed",
+                   "nodeSpacing": 150,
+                   "treeSpacing": 200
+                 }
+               }
+             }
+             """)
+         elif layout_algorithm == "radial":
+             physics_options = {
+                 "enabled": True,
+                 "solver": "repulsion",
+                 "repulsion": {
+                     "nodeDistance": 120,
+                     "centralGravity": 0.2,
+                     "springLength": 200,
+                     "springConstant": 0.05
+                 },
+                 "stabilization": {
+                     "enabled": True,
+                     "iterations": 100
+                 }
+             }
+         elif layout_algorithm == "circular":
+             physics_options = {
+                 "enabled": False
+             }
+
+             # Compute circular layout and set fixed positions
+             pos = nx.circular_layout(viz_graph)
+             for node_id, coords in pos.items():
+                 if node_id in viz_graph.nodes:
+                     x, y = coords
+                     viz_graph.nodes[node_id]['x'] = float(x) * 500
+                     viz_graph.nodes[node_id]['y'] = float(y) * 500
+                     viz_graph.nodes[node_id]['physics'] = False
+
+         # Configure other options
+         options = {
+             "nodes": {
+                 "font": {"size": 12},
+                 "scaling": {"min": 10, "max": 30}
+             },
+             "edges": {
+                 "color": {"inherit": True},
+                 "smooth": {"enabled": True, "type": "dynamic"},
+                 "arrows": {"to": {"enabled": True, "scaleFactor": 0.5}},
+                 "font": {"size": 10, "align": "middle"}
+             },
+             "physics": physics_options,
+             "interaction": {
+                 "hover": True,
+                 "navigationButtons": True,
+                 "keyboard": True,
+                 "tooltipDelay": 100
+             }
+         }
+
+         # Set options and create the network
+         net.options = options
+         net.from_nx(viz_graph)
+
+         # Add custom CSS for better visualization
+         custom_css = """
+         <style>
+         .vis-network {
+             border: 1px solid #ddd;
+             border-radius: 5px;
+         }
+         .vis-tooltip {
+             position: absolute;
+             background-color: #f5f5f5;
+             border: 1px solid #ccc;
+             border-radius: 4px;
+             padding: 10px;
+             font-family: Arial, sans-serif;
+             font-size: 12px;
+             color: #333;
+             max-width: 300px;
+             z-index: 9999;
+             box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+         }
+         </style>
+         """
+
+         # Generate the HTML and add custom CSS
+         html = net.generate_html()
+         html = html.replace("<style>", custom_css + "<style>")
+
+         # Add legend
+         legend_html = self._generate_legend_html(viz_graph)
+         html = html.replace("</body>", legend_html + "</body>")
+
+         return html
+
+     def _generate_legend_html(self, graph: nx.Graph) -> str:
+         """Generate a legend for the visualization."""
+         # Collect unique groups
+         groups = set()
+         for _, attrs in graph.nodes(data=True):
+             if "group" in attrs:
+                 groups.add(attrs["group"])
+
+         # Generate HTML for legend
+         legend_html = """
+         <div id="graph-legend" style="position: absolute; top: 10px; right: 10px; background-color: rgba(255,255,255,0.8);
+              padding: 10px; border-radius: 5px; border: 1px solid #ddd; max-width: 200px;">
+             <strong>Legend:</strong>
+             <ul style="list-style-type: none; padding-left: 0; margin-top: 5px;">
+         """
+
+         # Add items for each group
+         for group in sorted(groups):
+             color = "#97c2fc"  # Default color
+             if group == "property":
+                 color = "#ffcc99"
+             elif group == "class":
+                 color = "#a1d3a2"
+
+             legend_html += f"""
+             <li style="margin-bottom: 5px;">
+                 <span style="display: inline-block; width: 12px; height: 12px; border-radius: 50%;
+                       background-color: {color}; margin-right: 5px;"></span>
+                 {group}
+             </li>
+             """
+
+         # Close the legend container
+         legend_html += """
+             </ul>
+             <div style="font-size: 10px; margin-top: 5px; color: #666;">
+                 Double-click to zoom, drag to pan, scroll to zoom in/out
+             </div>
+         </div>
+         """
+
+         return legend_html
+
+     def get_graph_statistics(self) -> Dict[str, Any]:
+         """
+         Calculate statistics about the knowledge graph.
+
+         Returns:
+             A dictionary containing graph statistics
+         """
+         if not self.graph:
+             return {}
+
+         # Count nodes by type
+         class_count = 0
+         instance_count = 0
+         property_count = 0
+
+         for _, data in self.graph.nodes(data=True):
+             node_type = data.get("type")
+             if node_type == "class":
+                 class_count += 1
+             elif node_type == "instance":
+                 instance_count += 1
+                 if "properties" in data:
+                     property_count += len(data["properties"])
+
+         # Count edges by type
+         relationship_counts = {}
+         for _, _, data in self.graph.edges(data=True):
+             rel_type = data.get("type", "unknown")
+             relationship_counts[rel_type] = relationship_counts.get(rel_type, 0) + 1
+
+         # Calculate graph metrics
+         try:
+             # Some metrics only work on undirected graphs
+             undirected = nx.Graph(self.graph)
+             avg_degree = sum(dict(undirected.degree()).values()) / undirected.number_of_nodes()
+
+             # Only calculate these if the graph is connected
+             if nx.is_connected(undirected):
+                 avg_path_length = nx.average_shortest_path_length(undirected)
+                 diameter = nx.diameter(undirected)
+             else:
+                 # Get the largest connected component
+                 largest_cc = max(nx.connected_components(undirected), key=len)
+                 largest_cc_subgraph = undirected.subgraph(largest_cc)
+
+                 avg_path_length = nx.average_shortest_path_length(largest_cc_subgraph)
+                 diameter = nx.diameter(largest_cc_subgraph)
+
+             # Calculate density
+             density = nx.density(self.graph)
+
+             # Calculate clustering coefficient
+             clustering = nx.average_clustering(undirected)
+         except Exception:
+             avg_degree = 0
+             avg_path_length = 0
+             diameter = 0
+             density = 0
+             clustering = 0
+
+         # Count different entity types
+         class_counts = defaultdict(int)
+         for _, data in self.graph.nodes(data=True):
+             if data.get("type") == "instance":
+                 class_type = data.get("class_type", "unknown")
+                 class_counts[class_type] += 1
+
+         # Get nodes with highest centrality
+         try:
+             betweenness = nx.betweenness_centrality(self.graph)
+             degree = nx.degree_centrality(self.graph)
+
+             # Get top 5 nodes by betweenness and degree centrality
+             top_betweenness = sorted(betweenness.items(), key=lambda x: x[1], reverse=True)[:5]
+             top_degree = sorted(degree.items(), key=lambda x: x[1], reverse=True)[:5]
+
+             central_nodes = {
+                 "betweenness": [{"node": node, "centrality": round(cent, 3)} for node, cent in top_betweenness],
+                 "degree": [{"node": node, "centrality": round(cent, 3)} for node, cent in top_degree]
+             }
+         except Exception:
+             central_nodes = {}
+
+         return {
+             "node_count": self.graph.number_of_nodes(),
+             "edge_count": self.graph.number_of_edges(),
+             "class_count": class_count,
+             "instance_count": instance_count,
+             "property_count": property_count,
+             "relationship_counts": relationship_counts,
+             "class_instance_counts": dict(class_counts),
+             "average_degree": avg_degree,
+             "average_path_length": avg_path_length,
+             "diameter": diameter,
+             "density": density,
+             "clustering_coefficient": clustering,
+             "central_nodes": central_nodes
+         }
+
+     def find_paths_between_entities(
+         self,
+         source_entity: str,
+         target_entity: str,
+         max_length: int = 3
+     ) -> List[List[Dict]]:
+         """
+         Find all paths between two entities up to a maximum length.
+
+         Args:
+             source_entity: Starting entity ID
+             target_entity: Target entity ID
+             max_length: Maximum path length
+
+         Returns:
+             A list of paths, where each path is a list of edge dictionaries
+         """
+         if not self.graph or source_entity not in self.graph or target_entity not in self.graph:
+             return []
+
+         # Use networkx to find simple paths
+         try:
+             simple_paths = list(nx.all_simple_paths(
+                 self.graph, source_entity, target_entity, cutoff=max_length
+             ))
+         except (nx.NetworkXNoPath, nx.NodeNotFound):
+             return []
+
+         # Convert paths to edge sequences
+         paths = []
+         for path in simple_paths:
+             edge_sequence = []
+             for i in range(len(path) - 1):
+                 source = path[i]
+                 target = path[i + 1]
+
+                 # There may be multiple edges between nodes
+                 edges = self.graph.get_edge_data(source, target)
+                 if edges:
+                     for key, data in edges.items():
+                         edge_sequence.append({
+                             "source": source,
+                             "target": target,
+                             "type": data.get("type", "unknown")
+                         })
+
+             # Only include the path if it has meaningful relationships
+             # Filter out paths that only contain structural relationships like subClassOf, instanceOf
+             meaningful_relationships = [edge for edge in edge_sequence
+                                         if edge["type"] not in ["subClassOf", "instanceOf"]]
+
+             if meaningful_relationships:
+                 paths.append(edge_sequence)
+
+         # Sort paths by length (shorter paths first)
+         paths.sort(key=len)
+
+         return paths
+
+     def get_entity_neighborhood(
+         self,
+         entity_id: str,
+         max_distance: int = 1,
+         include_classes: bool = True
+     ) -> Dict[str, Any]:
+         """
+         Get the neighborhood of an entity.
+
+         Args:
+             entity_id: The central entity ID
+             max_distance: Maximum distance from the central entity
+             include_classes: Whether to include class relationships
+
+         Returns:
+             A dictionary containing the neighborhood information
+         """
+         if not self.graph or entity_id not in self.graph:
+             return {}
+
+         # Get nodes within max_distance of entity_id using BFS
+         nodes_at_distance = {0: [entity_id]}
+         visited = set([entity_id])
+
+         for distance in range(1, max_distance + 1):
+             nodes_at_distance[distance] = []
+
+             for node in nodes_at_distance[distance - 1]:
+                 # Get neighbors
+                 neighbors = list(self.graph.successors(node)) + list(self.graph.predecessors(node))
+
+                 for neighbor in neighbors:
+                     # Skip class nodes if not including classes
+                     neighbor_data = self.graph.nodes.get(neighbor, {})
+                     if not include_classes and neighbor_data.get("type") == "class":
+                         continue
+
+                     if neighbor not in visited:
+                         nodes_at_distance[distance].append(neighbor)
+                         visited.add(neighbor)
+
+         # Flatten the nodes
+         all_nodes = [node for nodes in nodes_at_distance.values() for node in nodes]
+
+         # Extract the subgraph
+         subgraph = self.graph.subgraph(all_nodes)
+
+         # Build neighbor information
+         neighbors = []
+         for node in all_nodes:
+             if node == entity_id:
+                 continue
+
+             node_data = self.graph.nodes[node]
+
+             # Determine the relations to central entity
+             relations = []
+
+             # Check direct relationships where the central entity is the source
+             edges_out = self.graph.get_edge_data(entity_id, node)
+             if edges_out:
+                 for key, data in edges_out.items():
+                     rel_type = data.get("type", "unknown")
+
+                     # Skip structural relationships if not including classes
+                     if not include_classes and rel_type in ["subClassOf", "instanceOf"]:
+                         continue
+
+                     relations.append({
+                         "type": rel_type,
+                         "direction": "outgoing"
+                     })
+
+             # Check direct relationships where the central entity is the target
+             edges_in = self.graph.get_edge_data(node, entity_id)
+             if edges_in:
+                 for key, data in edges_in.items():
+                     rel_type = data.get("type", "unknown")
+
+                     # Skip structural relationships if not including classes
+                     if not include_classes and rel_type in ["subClassOf", "instanceOf"]:
+                         continue
+
+                     relations.append({
+                         "type": rel_type,
+                         "direction": "incoming"
+                     })
+
+             # Also find paths through intermediate nodes (indirect relationships)
+             if not relations:  # Only look for indirect if no direct relationships
+                 for path_length in range(2, max_distance + 1):
+                     try:
+                         # Find paths of exactly length path_length
+                         # (all_simple_paths has no minimum-length parameter, so filter by length)
+                         paths = [
+                             p for p in nx.all_simple_paths(
+                                 self.graph, entity_id, node, cutoff=path_length
+                             )
+                             if len(p) - 1 == path_length
+                         ]
+
+                         for path in paths:
+                             if len(path) > 1:  # Path should have at least 2 nodes
+                                 intermediate_nodes = path[1:-1]  # Skip source and target
+
+                                 # Format the path as a relation
+                                 path_relation = {
+                                     "type": "indirect_connection",
+                                     "direction": "outgoing",
+                                     "path_length": len(path) - 1,
+                                     "intermediates": intermediate_nodes
+                                 }
+
+                                 relations.append(path_relation)
+
+                                 # Only need one example of an indirect path
+                                 break
+                     except (nx.NetworkXNoPath, nx.NodeNotFound):
+                         pass
+
+             # Only include neighbors with relations
+             if relations:
+                 neighbors.append({
+                     "id": node,
+                     "type": node_data.get("type"),
+                     "class_type": node_data.get("class_type"),
+                     "properties": node_data.get("properties", {}),
+                     "relations": relations,
+                     "distance": next(dist for dist, nodes in nodes_at_distance.items() if node in nodes)
+                 })
+
+         # Group neighbors by distance
+         neighbors_by_distance = defaultdict(list)
+         for neighbor in neighbors:
+             neighbors_by_distance[neighbor["distance"]].append(neighbor)
+
+         # Get central entity info
+         central_data = self.graph.nodes[entity_id]
+
+         return {
+             "central_entity": {
+                 "id": entity_id,
+                 "type": central_data.get("type"),
+                 "class_type": central_data.get("class_type", ""),
+                 "properties": central_data.get("properties", {})
+             },
+             "neighbors": neighbors,
+             "neighbors_by_distance": dict(neighbors_by_distance),
+             "total_neighbors": len(neighbors)
+         }
+
+ def find_common_patterns(self) -> List[Dict[str, Any]]:
708
+ """
709
+ Find common patterns and structures in the knowledge graph.
710
+
711
+ Returns:
712
+ A list of pattern dictionaries
713
+ """
714
+ if not self.graph:
715
+ return []
716
+
717
+ patterns = []
718
+
719
+ # Find common relationship patterns
720
+ relationship_patterns = self._find_relationship_patterns()
721
+ if relationship_patterns:
722
+ patterns.extend(relationship_patterns)
723
+
724
+ # Find hub entities (entities with many connections)
725
+ hub_entities = self._find_hub_entities()
726
+ if hub_entities:
727
+ patterns.append({
728
+ "type": "hub_entities",
729
+ "description": "Entities with high connectivity serving as knowledge hubs",
730
+ "entities": hub_entities
731
+ })
732
+
733
+ # Find common property patterns
734
+ property_patterns = self._find_property_patterns()
735
+ if property_patterns:
736
+ patterns.extend(property_patterns)
737
+
738
+ return patterns
739
+
740
+ def _find_relationship_patterns(self) -> List[Dict[str, Any]]:
741
+ """Find common relationship patterns in the graph."""
742
+ # Count relationship triplets (source_type, relation, target_type)
743
+ triplet_counts = defaultdict(int)
744
+
745
+ for source, target, data in self.graph.edges(data=True):
746
+ rel_type = data.get("type", "unknown")
747
+
748
+ # Skip structural relationships
749
+ if rel_type in ["subClassOf", "instanceOf"]:
750
+ continue
751
+
752
+ # Get node types
753
+ source_data = self.graph.nodes[source]
754
+ target_data = self.graph.nodes[target]
755
+
756
+ source_type = (
757
+ source_data.get("class_type")
758
+ if source_data.get("type") == "instance"
759
+ else source_data.get("type")
760
+ )
761
+
762
+ target_type = (
763
+ target_data.get("class_type")
764
+ if target_data.get("type") == "instance"
765
+ else target_data.get("type")
766
+ )
767
+
768
+ if source_type and target_type:
769
+ triplet = (source_type, rel_type, target_type)
770
+ triplet_counts[triplet] += 1
771
+
772
+ # Get patterns with significant frequency (more than 1 occurrence)
773
+ patterns = []
774
+ for triplet, count in triplet_counts.items():
775
+ if count > 1:
776
+ source_type, rel_type, target_type = triplet
777
+
778
+ # Find examples of this pattern
779
+ examples = []
780
+ for source, target, data in self.graph.edges(data=True):
781
+ if len(examples) >= 3: # Limit to 3 examples
782
+ break
783
+
784
+ rel = data.get("type", "unknown")
785
+ if rel != rel_type:
786
+ continue
787
+
788
+ source_data = self.graph.nodes[source]
789
+ target_data = self.graph.nodes[target]
790
+
791
+ current_source_type = (
792
+ source_data.get("class_type")
793
+ if source_data.get("type") == "instance"
794
+ else source_data.get("type")
795
+ )
796
+
797
+ current_target_type = (
798
+ target_data.get("class_type")
799
+ if target_data.get("type") == "instance"
800
+ else target_data.get("type")
801
+ )
802
+
803
+ if current_source_type == source_type and current_target_type == target_type:
804
+ # Get readable names if available
805
+ source_name = source
806
+ if source_data.get("type") == "instance" and "properties" in source_data:
807
+ properties = source_data["properties"]
808
+ if "name" in properties:
809
+ source_name = properties["name"]
810
+
811
+ target_name = target
812
+ if target_data.get("type") == "instance" and "properties" in target_data:
813
+ properties = target_data["properties"]
814
+ if "name" in properties:
815
+ target_name = properties["name"]
816
+
817
+ examples.append({
818
+ "source": source,
819
+ "source_name": source_name,
820
+ "target": target,
821
+ "target_name": target_name,
822
+ "relationship": rel_type
823
+ })
824
+
825
+ patterns.append({
826
+ "type": "relationship_pattern",
827
+ "description": f"{source_type} {rel_type} {target_type}",
828
+ "source_type": source_type,
829
+ "relationship": rel_type,
830
+ "target_type": target_type,
831
+ "count": count,
832
+ "examples": examples
833
+ })
834
+
835
+ # Sort by frequency
836
+ patterns.sort(key=lambda x: x["count"], reverse=True)
837
+
838
+ return patterns
839
+
840
+ def _find_hub_entities(self) -> List[Dict[str, Any]]:
841
+ """Find entities that serve as hubs (many connections)."""
842
+ # Calculate degree centrality
843
+ degree = nx.degree_centrality(self.graph)
844
+
845
+ # Get top entities by degree
846
+ top_entities = sorted(degree.items(), key=lambda x: x[1], reverse=True)[:10]
847
+
848
+ hub_entities = []
849
+ for node, centrality in top_entities:
850
+ node_data = self.graph.nodes[node]
851
+ node_type = node_data.get("type")
852
+
853
+ # Only consider instance nodes
854
+ if node_type == "instance":
855
+ # Get class type
856
+ class_type = node_data.get("class_type", "unknown")
857
+
858
+ # Get name if available
859
+ name = node
860
+ if "properties" in node_data and "name" in node_data["properties"]:
861
+ name = node_data["properties"]["name"]
862
+
863
+ # Count relationships by type
864
+ relationships = defaultdict(int)
865
+ for _, _, data in self.graph.edges(data=True, nbunch=[node]):
866
+ rel_type = data.get("type", "unknown")
867
+ if rel_type not in ["subClassOf", "instanceOf"]:
868
+ relationships[rel_type] += 1
869
+
870
+ hub_entities.append({
871
+ "id": node,
872
+ "name": name,
873
+ "type": class_type,
874
+ "centrality": centrality,
875
+ "relationships": dict(relationships),
876
+ "total_connections": sum(relationships.values())
877
+ })
878
+
879
+ # Sort by total connections
880
+ hub_entities.sort(key=lambda x: x["total_connections"], reverse=True)
881
+
882
+ return hub_entities
883
+
884
+ def _find_property_patterns(self) -> List[Dict[str, Any]]:
885
+ """Find common property patterns in instance data."""
886
+ # Track properties by class type
887
+ properties_by_class = defaultdict(lambda: defaultdict(int))
888
+
889
+ for node, data in self.graph.nodes(data=True):
890
+ if data.get("type") == "instance":
891
+ class_type = data.get("class_type", "unknown")
892
+
893
+ if "properties" in data:
894
+ for prop in data["properties"].keys():
895
+ properties_by_class[class_type][prop] += 1
896
+
897
+ # Find common property combinations
898
+ patterns = []
899
+ for class_type, props in properties_by_class.items():
900
+ # Sort properties by frequency
901
+ sorted_props = sorted(props.items(), key=lambda x: x[1], reverse=True)
902
+
903
+ # Only include classes with multiple instances
904
+ class_instances = sum(1 for _, data in self.graph.nodes(data=True)
905
+ if data.get("type") == "instance" and data.get("class_type") == class_type)
906
+
907
+ if class_instances > 1:
908
+ common_props = [prop for prop, count in sorted_props if count > 1]
909
+
910
+ if common_props:
911
+ patterns.append({
912
+ "type": "property_pattern",
913
+ "description": f"Common properties for {class_type} instances",
914
+ "class_type": class_type,
915
+ "instance_count": class_instances,
916
+ "common_properties": common_props,
917
+ "property_frequencies": {prop: count for prop, count in sorted_props}
918
+ })
919
+
920
+ return patterns
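The triplet-counting step in `_find_relationship_patterns` can be illustrated in isolation with a toy edge list; the entity and relation names below are made up for the example:

```python
from collections import defaultdict

# Toy (source_type, relation, target_type) triplets, standing in for the
# typed edges that the graph traversal above would produce.
edges = [
    ("Employee", "worksIn", "Department"),
    ("Employee", "worksIn", "Department"),
    ("Manager", "manages", "Department"),
    ("Employee", "worksIn", "Department"),
]

triplet_counts = defaultdict(int)
for triplet in edges:
    triplet_counts[triplet] += 1

# Keep only patterns that occur more than once, sorted by frequency.
patterns = sorted(
    ((t, c) for t, c in triplet_counts.items() if c > 1),
    key=lambda x: x[1],
    reverse=True,
)
print(patterns)  # [(('Employee', 'worksIn', 'Department'), 3)]
```

The singleton `manages` triplet is dropped, mirroring the `count > 1` significance filter in the method above.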
src/ontology_manager.py ADDED
@@ -0,0 +1,440 @@
+ # src/ontology_manager.py
+
+ import json
+ import networkx as nx
+ from typing import Dict, List, Any, Optional, Union, Set
+
+ class OntologyManager:
+     """
+     Manages the ontology model and provides methods for querying and navigating
+     the ontological structure.
+     """
+
+     def __init__(self, ontology_path: str):
+         """
+         Initialize the ontology manager with a path to the ontology JSON file.
+
+         Args:
+             ontology_path: Path to the JSON file containing the ontology model
+         """
+         self.ontology_path = ontology_path
+         self.ontology_data = self._load_ontology()
+         self.graph = self._build_graph()
+
+     def _load_ontology(self) -> Dict:
+         """Load the ontology from the JSON file."""
+         with open(self.ontology_path, 'r') as f:
+             return json.load(f)
+
+     def _build_graph(self) -> nx.MultiDiGraph:
+         """Construct a directed graph from the ontology data."""
+         G = nx.MultiDiGraph()
+
+         # Add class nodes
+         for class_id, class_data in self.ontology_data["classes"].items():
+             G.add_node(class_id,
+                        type="class",
+                        description=class_data.get("description", ""),
+                        properties=class_data.get("properties", []))
+
+             # Add subclass relationships
+             if "subClassOf" in class_data:
+                 G.add_edge(class_id, class_data["subClassOf"],
+                            type="subClassOf")
+
+         # Add relationship type information
+         self.relationship_info = {r["name"]: r for r in self.ontology_data["relationships"]}
+
+         # Add instance nodes and their relationships
+         for instance in self.ontology_data["instances"]:
+             G.add_node(instance["id"],
+                        type="instance",
+                        class_type=instance["type"],
+                        properties=instance.get("properties", {}))
+
+             # Add instance-of-class relationship
+             G.add_edge(instance["id"], instance["type"], type="instanceOf")
+
+             # Add relationships between instances
+             for rel in instance.get("relationships", []):
+                 G.add_edge(instance["id"], rel["target"],
+                            type=rel["type"])
+
+         return G
+
+     def get_classes(self) -> List[str]:
+         """Return a list of all class names in the ontology."""
+         return list(self.ontology_data["classes"].keys())
+
+     def get_class_hierarchy(self) -> Dict[str, List[str]]:
+         """Return a dictionary mapping each class to its subclasses."""
+         hierarchy = {}
+         for class_id in self.get_classes():
+             hierarchy[class_id] = []
+
+         for class_id, class_data in self.ontology_data["classes"].items():
+             if "subClassOf" in class_data:
+                 parent = class_data["subClassOf"]
+                 if parent in hierarchy:
+                     hierarchy[parent].append(class_id)
+
+         return hierarchy
+
+     def get_instances_of_class(self, class_name: str, include_subclasses: bool = True) -> List[str]:
+         """
+         Get all instances of a given class.
+
+         Args:
+             class_name: The name of the class
+             include_subclasses: Whether to include instances of subclasses
+
+         Returns:
+             A list of instance IDs
+         """
+         if include_subclasses:
+             # Get all subclasses recursively
+             subclasses = set(self._get_all_subclasses(class_name))
+             subclasses.add(class_name)
+
+             # Get instances of all classes
+             instances = []
+             for class_id in subclasses:
+                 instances.extend([
+                     n for n, attr in self.graph.nodes(data=True)
+                     if attr.get("type") == "instance" and attr.get("class_type") == class_id
+                 ])
+             return instances
+         else:
+             # Just get direct instances
+             return [
+                 n for n, attr in self.graph.nodes(data=True)
+                 if attr.get("type") == "instance" and attr.get("class_type") == class_name
+             ]
+
+     def _get_all_subclasses(self, class_name: str) -> List[str]:
+         """Recursively get all subclasses of a given class."""
+         subclasses = []
+         direct_subclasses = [
+             src for src, dst, data in self.graph.edges(data=True)
+             if dst == class_name and data.get("type") == "subClassOf"
+         ]
+
+         for subclass in direct_subclasses:
+             subclasses.append(subclass)
+             subclasses.extend(self._get_all_subclasses(subclass))
+
+         return subclasses
+
+     def get_relationships(self, entity_id: str, relationship_type: Optional[str] = None) -> List[Dict]:
+         """
+         Get all relationships for a given entity, optionally filtered by type.
+
+         Args:
+             entity_id: The ID of the entity
+             relationship_type: Optional relationship type to filter by
+
+         Returns:
+             A list of dictionaries containing relationship information
+         """
+         relationships = []
+
+         # Look at outgoing edges
+         for _, target, data in self.graph.out_edges(entity_id, data=True):
+             rel_type = data.get("type")
+             if rel_type != "instanceOf" and rel_type != "subClassOf":
+                 if relationship_type is None or rel_type == relationship_type:
+                     relationships.append({
+                         "type": rel_type,
+                         "target": target,
+                         "direction": "outgoing"
+                     })
+
+         # Look at incoming edges
+         for source, _, data in self.graph.in_edges(entity_id, data=True):
+             rel_type = data.get("type")
+             if rel_type != "instanceOf" and rel_type != "subClassOf":
+                 if relationship_type is None or rel_type == relationship_type:
+                     relationships.append({
+                         "type": rel_type,
+                         "source": source,
+                         "direction": "incoming"
+                     })
+
+         return relationships
+
+     def find_paths(self, source_id: str, target_id: str, max_length: int = 3) -> List[List[Dict]]:
+         """
+         Find all paths between two entities up to a maximum length.
+
+         Args:
+             source_id: Starting entity ID
+             target_id: Target entity ID
+             max_length: Maximum path length
+
+         Returns:
+             A list of paths, where each path is a list of relationship dictionaries
+         """
+         paths = []
+
+         # Use networkx to find simple paths
+         simple_paths = nx.all_simple_paths(self.graph, source_id, target_id, cutoff=max_length)
+
+         for path in simple_paths:
+             path_with_edges = []
+             for i in range(len(path) - 1):
+                 source = path[i]
+                 target = path[i + 1]
+                 # There may be multiple edges between nodes
+                 edges = self.graph.get_edge_data(source, target)
+                 if edges:
+                     for key, data in edges.items():
+                         path_with_edges.append({
+                             "source": source,
+                             "target": target,
+                             "type": data.get("type", "unknown")
+                         })
+             paths.append(path_with_edges)
+
+         return paths
+
+     def get_entity_info(self, entity_id: str) -> Dict:
+         """
+         Get detailed information about an entity.
+
+         Args:
+             entity_id: The ID of the entity
+
+         Returns:
+             A dictionary with entity information
+         """
+         if entity_id not in self.graph:
+             return {}
+
+         node_data = self.graph.nodes[entity_id]
+         entity_type = node_data.get("type")
+
+         if entity_type == "instance":
+             # Get class information
+             class_type = node_data.get("class_type")
+             class_info = self.ontology_data["classes"].get(class_type, {})
+
+             return {
+                 "id": entity_id,
+                 "type": entity_type,
+                 "class": class_type,
+                 "class_description": class_info.get("description", ""),
+                 "properties": node_data.get("properties", {}),
+                 "relationships": self.get_relationships(entity_id)
+             }
+         elif entity_type == "class":
+             return {
+                 "id": entity_id,
+                 "type": entity_type,
+                 "description": node_data.get("description", ""),
+                 "properties": node_data.get("properties", []),
+                 "subclasses": self._get_all_subclasses(entity_id),
+                 "instances": self.get_instances_of_class(entity_id)
+             }
+
+         return node_data
+
+     def get_text_representation(self) -> str:
+         """
+         Generate a text representation of the ontology for embedding.
+
+         Returns:
+             A string containing the textual representation of the ontology
+         """
+         text_chunks = []
+
+         # Class definitions
+         for class_id, class_data in self.ontology_data["classes"].items():
+             chunk = f"Class: {class_id}\n"
+             chunk += f"Description: {class_data.get('description', '')}\n"
+
+             if "subClassOf" in class_data:
+                 chunk += f"{class_id} is a subclass of {class_data['subClassOf']}.\n"
+
+             if "properties" in class_data:
+                 chunk += f"{class_id} has properties: {', '.join(class_data['properties'])}.\n"
+
+             text_chunks.append(chunk)
+
+         # Relationship definitions
+         for rel in self.ontology_data["relationships"]:
+             chunk = f"Relationship: {rel['name']}\n"
+             chunk += f"Domain: {rel['domain']}, Range: {rel['range']}\n"
+             chunk += f"Description: {rel.get('description', '')}\n"
+             chunk += f"Cardinality: {rel.get('cardinality', 'many-to-many')}\n"
+
+             if "inverse" in rel:
+                 chunk += f"The inverse relationship is {rel['inverse']}.\n"
+
+             text_chunks.append(chunk)
+
+         # Rules
+         for rule in self.ontology_data.get("rules", []):
+             chunk = f"Rule: {rule.get('id', '')}\n"
+             chunk += f"Description: {rule.get('description', '')}\n"
+             text_chunks.append(chunk)
+
+         # Instance data
+         for instance in self.ontology_data["instances"]:
+             chunk = f"Instance: {instance['id']}\n"
+             chunk += f"Type: {instance['type']}\n"
+
+             # Properties
+             if "properties" in instance:
+                 props = []
+                 for key, value in instance["properties"].items():
+                     if isinstance(value, list):
+                         props.append(f"{key}: {', '.join(str(v) for v in value)}")
+                     else:
+                         props.append(f"{key}: {value}")
+
+                 if props:
+                     chunk += "Properties:\n- " + "\n- ".join(props) + "\n"
+
+             # Relationships
+             if "relationships" in instance:
+                 rels = []
+                 for rel in instance["relationships"]:
+                     rels.append(f"{rel['type']} {rel['target']}")
+
+                 if rels:
+                     chunk += "Relationships:\n- " + "\n- ".join(rels) + "\n"
+
+             text_chunks.append(chunk)
+
+         return "\n\n".join(text_chunks)
+
+     def query_by_relationship(self, source_type: str, relationship: str, target_type: str) -> List[Dict]:
+         """
+         Query for instances connected by a specific relationship.
+
+         Args:
+             source_type: Type of the source entity
+             relationship: Type of relationship
+             target_type: Type of the target entity
+
+         Returns:
+             A list of matching relationship dictionaries
+         """
+         results = []
+
+         # Get all instances of the source type
+         source_instances = self.get_instances_of_class(source_type)
+
+         for source_id in source_instances:
+             # Get relationships of the specified type
+             relationships = self.get_relationships(source_id, relationship)
+
+             for rel in relationships:
+                 if rel["direction"] == "outgoing" and "target" in rel:
+                     target_id = rel["target"]
+                     target_data = self.graph.nodes[target_id]
+
+                     # Check if the target is of the right type
+                     if (target_data.get("type") == "instance" and
+                             target_data.get("class_type") == target_type):
+                         results.append({
+                             "source": source_id,
+                             "source_properties": self.graph.nodes[source_id].get("properties", {}),
+                             "relationship": relationship,
+                             "target": target_id,
+                             "target_properties": target_data.get("properties", {})
+                         })
+
+         return results
+
+     def get_semantic_context(self, query: str) -> List[str]:
+         """
+         Retrieve relevant semantic context from the ontology based on a query.
+
+         This method identifies entities and relationships mentioned in the query
+         and returns contextual information about them from the ontology.
+
+         Args:
+             query: The query string to analyze
+
+         Returns:
+             A list of text chunks providing relevant ontological context
+         """
+         # This is a simple implementation - a more sophisticated one would use
+         # entity recognition and semantic parsing
+
+         query_lower = query.lower()
+         context_chunks = []
+
+         # Check for class mentions
+         for class_id in self.get_classes():
+             if class_id.lower() in query_lower:
+                 # Add class information
+                 class_data = self.ontology_data["classes"][class_id]
+                 chunk = f"Class {class_id}: {class_data.get('description', '')}\n"
+
+                 # Add subclass information
+                 if "subClassOf" in class_data:
+                     parent = class_data["subClassOf"]
+                     chunk += f"{class_id} is a subclass of {parent}.\n"
+
+                 # Add property information
+                 if "properties" in class_data:
+                     chunk += f"{class_id} has properties: {', '.join(class_data['properties'])}.\n"
+
+                 context_chunks.append(chunk)
+
+                 # Also add some instance examples
+                 instances = self.get_instances_of_class(class_id, include_subclasses=False)[:3]
+                 if instances:
+                     instance_chunk = f"Examples of {class_id}:\n"
+                     for inst_id in instances:
+                         props = self.graph.nodes[inst_id].get("properties", {})
+                         if "name" in props:
+                             instance_chunk += f"- {inst_id} ({props['name']})\n"
+                         else:
+                             instance_chunk += f"- {inst_id}\n"
+                     context_chunks.append(instance_chunk)
+
+         # Check for relationship mentions
+         for rel in self.ontology_data["relationships"]:
+             if rel["name"].lower() in query_lower:
+                 chunk = f"Relationship {rel['name']}: {rel.get('description', '')}\n"
+                 chunk += f"This relationship connects {rel['domain']} to {rel['range']}.\n"
+
+                 # Add examples
+                 examples = self.query_by_relationship(rel['domain'], rel['name'], rel['range'])[:3]
+                 if examples:
+                     chunk += "Examples:\n"
+                     for ex in examples:
+                         source_props = ex["source_properties"]
+                         target_props = ex["target_properties"]
+
+                         source_name = source_props.get("name", ex["source"])
+                         target_name = target_props.get("name", ex["target"])
+
+                         chunk += f"- {source_name} {rel['name']} {target_name}\n"
+
+                 context_chunks.append(chunk)
+
+         # If we found nothing specific, add general ontology info
+         if not context_chunks:
+             # Add information about top-level classes
+             top_classes = [c for c, data in self.ontology_data["classes"].items()
+                            if "subClassOf" not in data or data["subClassOf"] == "Entity"]
+
+             if top_classes:
+                 chunk = "Main classes in the ontology:\n"
+                 for cls in top_classes:
+                     desc = self.ontology_data["classes"][cls].get("description", "")
+                     chunk += f"- {cls}: {desc}\n"
+                 context_chunks.append(chunk)
+
+             # Add information about key relationships
+             if self.ontology_data["relationships"]:
+                 chunk = "Key relationships in the ontology:\n"
+                 for rel in self.ontology_data["relationships"][:5]:  # Top 5 relationships
+                     chunk += f"- {rel['name']}: {rel.get('description', '')}\n"
+                 context_chunks.append(chunk)
+
+         return context_chunks
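The recursive subclass lookup behind `get_instances_of_class` can be sketched independently of the graph, using a plain child map like the one `get_class_hierarchy` returns; the class names below are illustrative:

```python
# Map each class to its direct subclasses, in the shape get_class_hierarchy builds.
hierarchy = {
    "Entity": ["Person", "Product"],
    "Person": ["Employee", "Customer"],
    "Employee": ["Manager"],
    "Product": [],
    "Customer": [],
    "Manager": [],
}

def all_subclasses(class_name):
    """Recursively collect every descendant class, depth-first."""
    result = []
    for sub in hierarchy.get(class_name, []):
        result.append(sub)
        result.extend(all_subclasses(sub))
    return result

print(all_subclasses("Person"))  # ['Employee', 'Manager', 'Customer']
```

`get_instances_of_class` then unions this descendant set with the class itself before collecting instance nodes, which is why querying a parent class also returns instances of its subclasses.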
src/semantic_retriever.py ADDED
@@ -0,0 +1,233 @@
+ # src/semantic_retriever.py
+
+ from typing import List, Dict, Any, Tuple, Optional
+ import numpy as np
+ from langchain_community.embeddings import OpenAIEmbeddings
+ from langchain_community.vectorstores import FAISS
+ from langchain.schema import Document
+ from src.ontology_manager import OntologyManager
+
+ class SemanticRetriever:
+     """
+     Enhanced retrieval system that combines vector search with ontology awareness.
+     """
+
+     def __init__(
+         self,
+         ontology_manager: OntologyManager,
+         embeddings_model = None,
+         text_chunks: Optional[List[str]] = None
+     ):
+         """
+         Initialize the semantic retriever.
+
+         Args:
+             ontology_manager: The ontology manager instance
+             embeddings_model: The embeddings model to use (defaults to OpenAIEmbeddings)
+             text_chunks: Optional list of text chunks to add to the vector store
+         """
+         self.ontology_manager = ontology_manager
+         self.embeddings = embeddings_model or OpenAIEmbeddings()
+
+         # Create a vector store with the text representation of the ontology
+         ontology_text = ontology_manager.get_text_representation()
+         self.ontology_chunks = self._split_text(ontology_text)
+
+         # Add additional text chunks if provided
+         if text_chunks:
+             self.text_chunks = text_chunks
+             all_chunks = self.ontology_chunks + text_chunks
+         else:
+             self.text_chunks = []
+             all_chunks = self.ontology_chunks
+
+         # Convert to Document objects for FAISS
+         documents = [
+             Document(
+                 page_content=chunk,
+                 metadata={"source": "ontology" if i < len(self.ontology_chunks) else "text"}
+             )
+             for i, chunk in enumerate(all_chunks)
+         ]
+
+         # Create the vector store
+         self.vector_store = FAISS.from_documents(documents, self.embeddings)
+
+     def _split_text(self, text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
+         """Split text into chunks for embedding."""
+         chunks = []
+         text_length = len(text)
+
+         for i in range(0, text_length, chunk_size - overlap):
+             chunk = text[i:i + chunk_size]
+             if len(chunk) < 50:  # Skip very small chunks
+                 continue
+             chunks.append(chunk)
+
+         return chunks
+
+     def retrieve(self, query: str, k: int = 4, include_ontology_context: bool = True) -> List[Document]:
+         """
+         Retrieve relevant documents using a hybrid approach.
+
+         Args:
+             query: The query string
+             k: Number of documents to retrieve
+             include_ontology_context: Whether to include additional ontology context
+
+         Returns:
+             A list of retrieved documents
+         """
+         # Get semantic context from the ontology
+         if include_ontology_context:
+             ontology_context = self.ontology_manager.get_semantic_context(query)
+         else:
+             ontology_context = []
+
+         # Perform vector similarity search
+         vector_results = self.vector_store.similarity_search(query, k=k)
+
+         # Combine results (copy so the search result list is not mutated)
+         combined_results = list(vector_results)
+
+         # Add ontology context as additional documents
+         for i, context in enumerate(ontology_context):
+             combined_results.append(Document(
+                 page_content=context,
+                 metadata={"source": "ontology_context", "context_id": i}
+             ))
+
+         return combined_results
+
+     def retrieve_with_paths(self, query: str, k: int = 4) -> Dict[str, Any]:
+         """
+         Enhanced retrieval that includes semantic paths between entities.
+
+         Args:
+             query: The query string
+             k: Number of documents to retrieve
+
+         Returns:
+             A dictionary containing retrieved documents and semantic paths
+         """
+         # Basic retrieval
+         basic_results = self.retrieve(query, k)
+
+         # Extract potential entities from the query (simplified approach)
+         # A more sophisticated approach would use NER or entity linking
+         entity_types = ["Product", "Department", "Employee", "Manager", "Customer", "Feedback"]
+         query_words = query.lower().split()
+
+         potential_entities = []
+         for entity_type in entity_types:
+             if entity_type.lower() in query_words:
+                 # Get instances of this type
+                 instances = self.ontology_manager.get_instances_of_class(entity_type)
+                 if instances:
+                     # Just take the first few for demonstration
+                     potential_entities.extend(instances[:2])
+
+         # Find paths between potential entities
+         paths = []
+         if len(potential_entities) >= 2:
+             for i in range(len(potential_entities)):
+                 for j in range(i+1, len(potential_entities)):
+                     source = potential_entities[i]
+                     target = potential_entities[j]
+
+                     # Find paths between these entities
+                     entity_paths = self.ontology_manager.find_paths(source, target, max_length=3)
+
+                     if entity_paths:
+                         for path in entity_paths:
+                             # Convert path to text
+                             path_text = self._path_to_text(path)
+                             paths.append({
+                                 "source": source,
+                                 "target": target,
+                                 "path": path,
+                                 "text": path_text
+                             })
+
+         # Convert paths to documents
+         path_documents = []
+         for i, path_info in enumerate(paths):
+             path_documents.append(Document(
+                 page_content=path_info["text"],
+                 metadata={
+                     "source": "semantic_path",
+                     "path_id": i,
+                     "source_entity": path_info["source"],
+                     "target_entity": path_info["target"]
+                 }
+             ))
+
+         return {
+             "documents": basic_results + path_documents,
+             "paths": paths
+         }
+
+     def _path_to_text(self, path: List[Dict]) -> str:
+         """Convert a path to a text description."""
+         if not path:
+             return ""
+
+         text_parts = []
+         for edge in path:
+             source = edge["source"]
+             target = edge["target"]
+             relation = edge["type"]
+
+             # Get entity information
+             source_info = self.ontology_manager.get_entity_info(source)
+             target_info = self.ontology_manager.get_entity_info(target)
+
+             # Get names if available
+             source_name = source
+             if "properties" in source_info and "name" in source_info["properties"]:
+                 source_name = source_info["properties"]["name"]
+
+             target_name = target
+             if "properties" in target_info and "name" in target_info["properties"]:
+                 target_name = target_info["properties"]["name"]
+
+             # Describe the relationship
+             text_parts.append(f"{source_name} {relation} {target_name}")
+
+         return " -> ".join(text_parts)
+
+     def search_by_property(self, class_type: str, property_name: str, property_value: str) -> List[Document]:
+         """
+         Search for instances of a class with a specific property value.
+
+         Args:
+             class_type: The class to search in
+             property_name: The property name to match
+             property_value: The property value to match
+
+         Returns:
+             A list of matched entities as documents
+         """
+         instances = self.ontology_manager.get_instances_of_class(class_type)
+
+         results = []
+         for instance_id in instances:
+             entity_info = self.ontology_manager.get_entity_info(instance_id)
+             if "properties" in entity_info:
+                 properties = entity_info["properties"]
+                 if property_name in properties:
+                     # Simple string matching (could be enhanced with fuzzy matching)
+                     if str(properties[property_name]).lower() == property_value.lower():
+                         # Convert to document
+                         doc_content = f"Instance: {instance_id}\n"
+                         doc_content += f"Type: {class_type}\n"
+                         doc_content += "Properties:\n"
+
+                         for prop_name, prop_value in properties.items():
+                             doc_content += f"- {prop_name}: {prop_value}\n"
+
+                         results.append(Document(
+                             page_content=doc_content,
+                             metadata={
+                                 "source": "property_search",
+                                 "instance_id": instance_id,
+                                 "class_type": class_type
+                             }
+                         ))
+
+         return results
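The sliding-window chunking in `_split_text` (500-character windows, 50-character overlap, fragments under 50 characters dropped) can be demonstrated on its own:

```python
def split_text(text, chunk_size=500, overlap=50):
    """Slide a window of chunk_size over text, stepping chunk_size - overlap."""
    chunks = []
    for i in range(0, len(text), chunk_size - overlap):
        chunk = text[i:i + chunk_size]
        if len(chunk) < 50:  # skip very small trailing fragments
            continue
        chunks.append(chunk)
    return chunks

text = "x" * 1000
chunks = split_text(text)
print([len(c) for c in chunks])  # [500, 500, 100]
```

Because the step is `chunk_size - overlap` (450 here), each chunk repeats the last 50 characters of the previous one, which keeps sentences that straddle a boundary retrievable from at least one chunk.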
src/visualization.py ADDED
@@ -0,0 +1,1564 @@
+ # src/visualization.py
+
+ import streamlit as st
+ import json
+ import networkx as nx
+ import pandas as pd
+ from typing import Dict, List, Any, Optional, Set, Tuple
+ import plotly.graph_objects as go
+ import plotly.express as px
+ import matplotlib.pyplot as plt
+ import matplotlib.colors as mcolors
+ from collections import defaultdict
+ import math
+
+ def render_html_in_streamlit(html_content: str):
+     """Display HTML content in Streamlit using an iframe."""
+     import base64
+
+     # Encode the HTML content so it can be embedded as a data URL
+     encoded_html = base64.b64encode(html_content.encode()).decode()
+
+     # Create an iframe that loads the encoded page. Note: srcdoc expects
+     # raw (escaped) HTML, so base64 content must go through a data URL
+     # in the src attribute instead.
+     iframe_html = f"""
+     <iframe
+         src="data:text/html;base64,{encoded_html}"
+         width="100%"
+         height="600px"
+         frameborder="0"
+         allowfullscreen>
+     </iframe>
+     """
+
+     # Display the iframe
+     st.markdown(iframe_html, unsafe_allow_html=True)
+
+
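The base64 step can be checked without Streamlit at all. A small round-trip sketch of the encode/decode pair (the `html_content` value is just sample data):

```python
import base64

html_content = "<h1>Knowledge Graph</h1>"

# Encode the page the same way render_html_in_streamlit does
encoded = base64.b64encode(html_content.encode()).decode()

# A data URL of this form is what a browser can load directly in an iframe src
data_url = f"data:text/html;base64,{encoded}"

# Decoding the payload recovers the original HTML unchanged
assert base64.b64decode(encoded).decode() == html_content
```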
+ def display_ontology_stats(ontology_manager):
+     """Display statistics and visualizations about the ontology."""
+     st.subheader("📊 Ontology Structure and Statistics")
+
+     # Get basic stats
+     classes = ontology_manager.get_classes()
+     class_hierarchy = ontology_manager.get_class_hierarchy()
+
+     # Count instances per class
+     class_counts = []
+     for class_name in classes:
+         instance_count = len(ontology_manager.get_instances_of_class(class_name, include_subclasses=False))
+         class_counts.append({
+             "Class": class_name,
+             "Instances": instance_count
+         })
+
+     # Display summary metrics
+     col1, col2, col3 = st.columns(3)
+
+     with col1:
+         st.metric("Total Classes", len(classes))
+
+     # Count total instances
+     total_instances = sum(item["Instances"] for item in class_counts)
+     with col2:
+         st.metric("Total Instances", total_instances)
+
+     # Count relationships
+     relationship_count = len(ontology_manager.ontology_data.get("relationships", []))
+     with col3:
+         st.metric("Relationship Types", relationship_count)
+
+     # Visualize class hierarchy
+     st.markdown("### Class Hierarchy")
+
+     # Create tabs for different views
+     tab1, tab2, tab3 = st.tabs(["Tree View", "Class Statistics", "Hierarchy Graph"])
+
+     with tab1:
+         # Create a collapsible tree view of class hierarchy
+         display_class_hierarchy_tree(ontology_manager, class_hierarchy)
+
+     with tab2:
+         # Display class stats and distribution
+         if class_counts:
+             # Filter to only show classes with instances
+             non_empty_classes = [item for item in class_counts if item["Instances"] > 0]
+
+             if non_empty_classes:
+                 df = pd.DataFrame(non_empty_classes)
+                 df = df.sort_values("Instances", ascending=False)
+
+                 # Create horizontal bar chart
+                 fig = px.bar(df,
+                              x="Instances",
+                              y="Class",
+                              orientation='h',
+                              title="Instances per Class",
+                              color="Instances",
+                              color_continuous_scale="viridis")
+
+                 fig.update_layout(yaxis={'categoryorder': 'total ascending'})
+                 st.plotly_chart(fig, use_container_width=True)
+             else:
+                 st.info("No classes with instances found.")
+
+         # Show distribution of classes by inheritance depth
+         display_class_depth_distribution(ontology_manager)
+
+     with tab3:
+         # Display class hierarchy as a graph
+         display_class_hierarchy_graph(ontology_manager)
+
+     # Relationship statistics
+     st.markdown("### Relationship Analysis")
+
+     # Get relationship usage statistics
+     relationship_usage = analyze_relationship_usage(ontology_manager)
+
+     # Display relationship usage in a table and chart
+     if relationship_usage:
+         tab1, tab2 = st.tabs(["Usage Statistics", "Domain/Range Distribution"])
+
+         with tab1:
+             # Create DataFrame for the table
+             df = pd.DataFrame(relationship_usage)
+             df = df.sort_values("Usage Count", ascending=False)
+
+             # Show table
+             st.dataframe(df)
+
+             # Create bar chart for relationship usage
+             fig = px.bar(df,
+                          x="Relationship",
+                          y="Usage Count",
+                          title="Relationship Usage Frequency",
+                          color="Usage Count",
+                          color_continuous_scale="blues")
+
+             st.plotly_chart(fig, use_container_width=True)
+
+         with tab2:
+             # Display domain-range distribution
+             display_domain_range_distribution(ontology_manager)
+
+
+ def display_class_hierarchy_tree(ontology_manager, class_hierarchy):
+     """Display class hierarchy as an interactive tree."""
+     # Find root classes (those that aren't subclasses of anything else)
+     all_subclasses = set()
+     for subclasses in class_hierarchy.values():
+         all_subclasses.update(subclasses)
+
+     root_classes = [cls for cls in ontology_manager.get_classes() if cls not in all_subclasses]
+
+     # Create a recursive function to display the hierarchy
+     def display_subclasses(class_name, indent=0):
+         # Get class info
+         class_info = ontology_manager.ontology_data["classes"].get(class_name, {})
+         description = class_info.get("description", "")
+         instance_count = len(ontology_manager.get_instances_of_class(class_name, include_subclasses=False))
+
+         # Display class with expander for subclasses
+         if indent == 0:
+             # Root level classes are always expanded
+             with st.expander(f"📁 {class_name} ({instance_count} instances)", expanded=True):
+                 st.markdown(f"**Description:** {description}")
+
+                 # Show properties if any
+                 properties = class_info.get("properties", [])
+                 if properties:
+                     st.markdown("**Properties:**")
+                     st.markdown(", ".join(properties))
+
+                 # Display subclasses
+                 subclasses = class_hierarchy.get(class_name, [])
+                 if subclasses:
+                     st.markdown("**Subclasses:**")
+                     for subclass in sorted(subclasses):
+                         display_subclasses(subclass, indent + 1)
+                 else:
+                     st.markdown("*No subclasses*")
+         else:
+             # Nested classes use indentation and only show direct instances
+             if instance_count > 0:
+                 class_label = f"📁 {class_name} ({instance_count} instances)"
+             else:
+                 class_label = f"📁 {class_name}"
+
+             with st.expander(class_label, expanded=False):
+                 st.markdown(f"**Description:** {description}")
+
+                 # Show properties if any
+                 properties = class_info.get("properties", [])
+                 if properties:
+                     st.markdown("**Properties:**")
+                     st.markdown(", ".join(properties))
+
+                 # Display subclasses
+                 subclasses = class_hierarchy.get(class_name, [])
+                 if subclasses:
+                     st.markdown("**Subclasses:**")
+                     for subclass in sorted(subclasses):
+                         display_subclasses(subclass, indent + 1)
+                 else:
+                     st.markdown("*No subclasses*")
+
+     # Display each root class
+     for root_class in sorted(root_classes):
+         display_subclasses(root_class)
+
+
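Root-class detection above reduces to a set difference: any class that never appears as someone's subclass is a root. The same logic on a toy hierarchy (the dict literal is illustrative sample data only):

```python
# Toy hierarchy: parent class -> list of direct subclasses
class_hierarchy = {
    "Agent": ["Person", "Organization"],
    "Person": ["Employee"],
    "Document": [],
}
all_classes = ["Agent", "Person", "Organization", "Employee", "Document"]

# Any class that appears as a subclass of something cannot be a root
all_subclasses = set()
for subclasses in class_hierarchy.values():
    all_subclasses.update(subclasses)

root_classes = [cls for cls in all_classes if cls not in all_subclasses]
print(sorted(root_classes))  # ['Agent', 'Document']
```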
+ def get_class_depths(ontology_manager) -> Dict[str, int]:
+     """Calculate the inheritance depth of each class."""
+     depths = {}
+     class_data = ontology_manager.ontology_data["classes"]
+
+     def get_depth(class_name):
+         # If we've already calculated the depth, return it
+         if class_name in depths:
+             return depths[class_name]
+
+         # Get the class data
+         cls = class_data.get(class_name, {})
+
+         # If no parent, depth is 0
+         if "subClassOf" not in cls:
+             depths[class_name] = 0
+             return 0
+
+         # Otherwise, depth is 1 + parent's depth
+         parent = cls["subClassOf"]
+         parent_depth = get_depth(parent)
+         depths[class_name] = parent_depth + 1
+         return depths[class_name]
+
+     # Calculate depths for all classes
+     for class_name in class_data:
+         get_depth(class_name)
+
+     return depths
+
+
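`get_class_depths` is a memoized walk up the `subClassOf` chain: each class's depth is computed once and cached in `depths`. The same idea on a toy class table (the data is illustrative):

```python
# Toy class table in the same shape the ontology JSON uses
class_data = {
    "Agent": {},
    "Person": {"subClassOf": "Agent"},
    "Employee": {"subClassOf": "Person"},
    "Manager": {"subClassOf": "Employee"},
}

depths = {}

def get_depth(class_name):
    """Depth = number of subClassOf hops up to a root, cached in `depths`."""
    if class_name in depths:
        return depths[class_name]
    cls = class_data.get(class_name, {})
    if "subClassOf" not in cls:
        depths[class_name] = 0
        return 0
    depths[class_name] = get_depth(cls["subClassOf"]) + 1
    return depths[class_name]

for name in class_data:
    get_depth(name)

print(depths["Manager"])  # 3 hops: Manager -> Employee -> Person -> Agent
```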
+ def display_class_depth_distribution(ontology_manager):
+     """Display distribution of classes by inheritance depth."""
+     depths = get_class_depths(ontology_manager)
+
+     # Count classes at each depth
+     depth_counts = defaultdict(int)
+     for depth in depths.values():
+         depth_counts[depth] += 1
+
+     # Create dataframe
+     df = pd.DataFrame([
+         {"Depth": depth, "Count": count}
+         for depth, count in depth_counts.items()
+     ])
+
+     if not df.empty:
+         df = df.sort_values("Depth")
+
+         # Create bar chart
+         fig = px.bar(df,
+                      x="Depth",
+                      y="Count",
+                      title="Class Distribution by Inheritance Depth",
+                      labels={"Depth": "Inheritance Depth", "Count": "Number of Classes"},
+                      color="Count",
+                      text="Count")
+
+         fig.update_traces(texttemplate='%{text}', textposition='outside')
+         fig.update_layout(uniformtext_minsize=8, uniformtext_mode='hide')
+
+         st.plotly_chart(fig, use_container_width=True)
+
+
+ def display_class_hierarchy_graph(ontology_manager):
+     """Display class hierarchy as a directed graph."""
+     # Create a directed graph
+     G = nx.DiGraph()
+
+     # Add nodes for each class
+     for class_name, class_info in ontology_manager.ontology_data["classes"].items():
+         # Count direct instances
+         instance_count = len(ontology_manager.get_instances_of_class(class_name, include_subclasses=False))
+
+         # Add node with attributes
+         G.add_node(class_name,
+                    type="class",
+                    description=class_info.get("description", ""),
+                    instance_count=instance_count)
+
+         # Add edge for subclass relationship
+         if "subClassOf" in class_info:
+             parent = class_info["subClassOf"]
+             G.add_edge(parent, class_name, relationship="subClassOf")
+
+     # Calculate node positions using a hierarchical layout.
+     # Note: graphviz_layout requires the optional pygraphviz package (and a
+     # Graphviz install); fall back to a spring layout when it is unavailable.
+     try:
+         pos = nx.nx_agraph.graphviz_layout(G, prog="dot")
+     except ImportError:
+         pos = nx.spring_layout(G)
+
+     # Compute inheritance depths once, rather than once per node
+     class_depths = get_class_depths(ontology_manager)
+
+     # Convert positions to lists for Plotly
+     node_x = []
+     node_y = []
+     node_text = []
+     node_size = []
+     node_color = []
+
+     for node in G.nodes():
+         x, y = pos[node]
+         node_x.append(x)
+         node_y.append(y)
+
+         # Get node info for hover text
+         description = G.nodes[node].get("description", "")
+         instance_count = G.nodes[node].get("instance_count", 0)
+
+         # Prepare hover text
+         hover_text = f"Class: {node}<br>Description: {description}<br>Instances: {instance_count}"
+         node_text.append(hover_text)
+
+         # Size nodes by instance count (with a minimum size)
+         size = 10 + (instance_count * 2)
+         size = min(40, max(15, size))  # Limit size range
+         node_size.append(size)
+
+         # Color nodes by depth, using a light-to-dark blue scale
+         node_color.append(class_depths.get(node, 0))
+
+     # Create edge traces
+     edge_x = []
+     edge_y = []
+
+     for edge in G.edges():
+         x0, y0 = pos[edge[0]]
+         x1, y1 = pos[edge[1]]
+
+         edge_x.append(x0)
+         edge_x.append(x1)
+         edge_x.append(None)  # None creates a break between edges
+
+         edge_y.append(y0)
+         edge_y.append(y1)
+         edge_y.append(None)
+
+     # Create node trace
+     node_trace = go.Scatter(
+         x=node_x, y=node_y,
+         mode='markers+text',
+         text=[node for node in G.nodes()],
+         textposition="bottom center",
+         hoverinfo='text',
+         hovertext=node_text,
+         marker=dict(
+             showscale=True,
+             colorscale='Blues',
+             color=node_color,
+             size=node_size,
+             line=dict(width=2, color='DarkSlateGrey'),
+             colorbar=dict(
+                 title="Depth",
+                 thickness=15,
+                 tickvals=[0, max(node_color)],
+                 ticktext=["Root", f"Depth {max(node_color)}"]
+             )
+         )
+     )
+
+     # Create edge trace
+     edge_trace = go.Scatter(
+         x=edge_x, y=edge_y,
+         line=dict(width=1, color='#888'),
+         hoverinfo='none',
+         mode='lines'
+     )
+
+     # Create figure
+     fig = go.Figure(data=[edge_trace, node_trace],
+                     layout=go.Layout(
+                         showlegend=False,
+                         hovermode='closest',
+                         margin=dict(b=20, l=5, r=5, t=40),
+                         xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
+                         yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
+                         title="Class Hierarchy Graph",
+                         title_x=0.5
+                     ))
+
+     # Display the figure
+     st.plotly_chart(fig, use_container_width=True)
+
+
+ def analyze_relationship_usage(ontology_manager) -> List[Dict]:
+     """Analyze how relationships are used in the ontology."""
+     relationship_data = ontology_manager.ontology_data.get("relationships", [])
+     instances = ontology_manager.ontology_data.get("instances", [])
+
+     # Initialize counters
+     usage_counts = defaultdict(int)
+
+     # Count relationship usage in instances
+     for instance in instances:
+         for rel in instance.get("relationships", []):
+             usage_counts[rel["type"]] += 1
+
+     # Prepare results
+     results = []
+     for rel in relationship_data:
+         rel_name = rel["name"]
+         domain = rel["domain"]
+         range_class = rel["range"]
+         cardinality = rel.get("cardinality", "many-to-many")
+         count = usage_counts.get(rel_name, 0)
+
+         results.append({
+             "Relationship": rel_name,
+             "Domain": domain,
+             "Range": range_class,
+             "Cardinality": cardinality,
+             "Usage Count": count
+         })
+
+     return results
+
+
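The counting pass in `analyze_relationship_usage` is a flat tally over each instance's relationship list. A self-contained sketch of that pass (the instance records are illustrative sample data):

```python
from collections import defaultdict

# Instance records in the same shape the ontology JSON uses
instances = [
    {"id": "emp-1", "relationships": [{"type": "worksFor", "target": "org-1"}]},
    {"id": "emp-2", "relationships": [{"type": "worksFor", "target": "org-1"},
                                      {"type": "manages", "target": "emp-1"}]},
    {"id": "org-1", "relationships": []},
]

# Tally every relationship occurrence across all instances
usage_counts = defaultdict(int)
for instance in instances:
    for rel in instance.get("relationships", []):
        usage_counts[rel["type"]] += 1

print(dict(usage_counts))  # {'worksFor': 2, 'manages': 1}
```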
+ def display_domain_range_distribution(ontology_manager):
+     """Display domain and range distribution for relationships."""
+     relationship_data = ontology_manager.ontology_data.get("relationships", [])
+
+     # Count domains and ranges
+     domain_counts = defaultdict(int)
+     range_counts = defaultdict(int)
+
+     for rel in relationship_data:
+         domain_counts[rel["domain"]] += 1
+         range_counts[rel["range"]] += 1
+
+     # Create DataFrames
+     domain_df = pd.DataFrame([
+         {"Class": cls, "Count": count, "Type": "Domain"}
+         for cls, count in domain_counts.items()
+     ])
+
+     range_df = pd.DataFrame([
+         {"Class": cls, "Count": count, "Type": "Range"}
+         for cls, count in range_counts.items()
+     ])
+
+     # Combine
+     combined_df = pd.concat([domain_df, range_df])
+
+     # Create plot
+     if not combined_df.empty:
+         fig = px.bar(combined_df,
+                      x="Class",
+                      y="Count",
+                      color="Type",
+                      barmode="group",
+                      title="Classes as Domain vs Range in Relationships",
+                      color_discrete_map={"Domain": "#1f77b4", "Range": "#ff7f0e"})
+
+         fig.update_layout(xaxis={'categoryorder': 'total descending'})
+
+         st.plotly_chart(fig, use_container_width=True)
+
+
+ def display_entity_details(entity_info: Dict[str, Any], ontology_manager):
+     """Display detailed information about an entity."""
+     if not entity_info:
+         st.warning("Entity not found.")
+         return
+
+     st.subheader(f"📝 Entity: {entity_info['id']}")
+
+     # Determine entity type and get class hierarchy
+     entity_type = entity_info.get("type", "")
+     class_type = entity_info.get("class", entity_info.get("class_type", ""))
+
+     class_hierarchy = []
+     if class_type:
+         current_class = class_type
+         while current_class:
+             class_hierarchy.append(current_class)
+             parent_class = ontology_manager.ontology_data["classes"].get(current_class, {}).get("subClassOf", "")
+             if not parent_class or parent_class == current_class:  # Prevent infinite loops
+                 break
+             current_class = parent_class
+
+     # Display entity metadata
+     col1, col2 = st.columns([1, 2])
+
+     with col1:
+         st.markdown("### Basic Information")
+
+         # Basic info metrics
+         st.metric("Entity Type", entity_type)
+
+         if class_type:
+             st.metric("Class", class_type)
+
+         # Display class hierarchy
+         if class_hierarchy and len(class_hierarchy) > 1:
+             st.markdown("**Class Hierarchy:**")
+             hierarchy_str = " → ".join(reversed(class_hierarchy))
+             st.markdown(f"```\n{hierarchy_str}\n```")
+
+     with col2:
+         # Display class description if available
+         if "class_description" in entity_info:
+             st.markdown("### Description")
+             st.markdown(entity_info.get("class_description", "No description available."))
+
+     # Properties
+     if "properties" in entity_info and entity_info["properties"]:
+         st.markdown("### Properties")
+
+         # Create a more structured property display
+         properties = []
+         for key, value in entity_info["properties"].items():
+             # Handle different value types
+             if isinstance(value, list):
+                 value_str = ", ".join(str(v) for v in value)
+             else:
+                 value_str = str(value)
+
+             properties.append({"Property": key, "Value": value_str})
+
+         # Display as table with highlighting
+         property_df = pd.DataFrame(properties)
+         st.dataframe(
+             property_df,
+             column_config={
+                 "Property": st.column_config.TextColumn("Property", width="medium"),
+                 "Value": st.column_config.TextColumn("Value", width="large")
+             },
+             hide_index=True
+         )
+
+     # Relationships with visual enhancements
+     if "relationships" in entity_info and entity_info["relationships"]:
+         st.markdown("### Relationships")
+
+         # Group relationships by direction
+         outgoing = []
+         incoming = []
+
+         for rel in entity_info["relationships"]:
+             if "direction" in rel and rel["direction"] == "outgoing":
+                 outgoing.append({
+                     "Relationship": rel["type"],
+                     "Direction": "→",
+                     "Related Entity": rel["target"]
+                 })
+             elif "direction" in rel and rel["direction"] == "incoming":
+                 incoming.append({
+                     "Relationship": rel["type"],
+                     "Direction": "←",
+                     "Related Entity": rel["source"]
+                 })
+
+         # Create tabs for outgoing and incoming
+         if outgoing or incoming:
+             tab1, tab2 = st.tabs(["Outgoing Relationships", "Incoming Relationships"])
+
+             with tab1:
+                 if outgoing:
+                     st.dataframe(
+                         pd.DataFrame(outgoing),
+                         column_config={
+                             "Relationship": st.column_config.TextColumn("Relationship Type", width="medium"),
+                             "Direction": st.column_config.TextColumn("Direction", width="small"),
+                             "Related Entity": st.column_config.TextColumn("Target Entity", width="medium")
+                         },
+                         hide_index=True
+                     )
+                 else:
+                     st.info("No outgoing relationships.")
+
+             with tab2:
+                 if incoming:
+                     st.dataframe(
+                         pd.DataFrame(incoming),
+                         column_config={
+                             "Relationship": st.column_config.TextColumn("Relationship Type", width="medium"),
+                             "Direction": st.column_config.TextColumn("Direction", width="small"),
+                             "Related Entity": st.column_config.TextColumn("Source Entity", width="medium")
+                         },
+                         hide_index=True
+                     )
+                 else:
+                     st.info("No incoming relationships.")
+
+         # Visual relationship graph
+         st.markdown("#### Relationship Graph")
+         display_entity_relationship_graph(entity_info, ontology_manager)
+
+
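The hierarchy walk in `display_entity_details` climbs `subClassOf` links and breaks on a missing or self-referencing parent to guard against cycles. The same loop on toy data (the `classes` dict is illustrative):

```python
# Toy class table: class -> record with an optional subClassOf link
classes = {
    "Manager": {"subClassOf": "Employee"},
    "Employee": {"subClassOf": "Person"},
    "Person": {},
}

class_hierarchy = []
current_class = "Manager"
while current_class:
    class_hierarchy.append(current_class)
    parent_class = classes.get(current_class, {}).get("subClassOf", "")
    if not parent_class or parent_class == current_class:  # prevent infinite loops
        break
    current_class = parent_class

# Rendered root-first, as the UI does with reversed(...)
print(" → ".join(reversed(class_hierarchy)))  # Person → Employee → Manager
```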
+ def display_entity_relationship_graph(entity_info: Dict[str, Any], ontology_manager):
+     """Display a graph of an entity's relationships."""
+     entity_id = entity_info["id"]
+
+     # Create graph
+     G = nx.DiGraph()
+
+     # Add central entity
+     G.add_node(entity_id, type="central")
+
+     # Add related entities and relationships
+     for rel in entity_info.get("relationships", []):
+         if "direction" in rel and rel["direction"] == "outgoing":
+             target = rel["target"]
+             rel_type = rel["type"]
+
+             # Add target node if not exists
+             if target not in G:
+                 target_info = ontology_manager.get_entity_info(target)
+                 node_type = target_info.get("type", "unknown")
+                 G.add_node(target, type=node_type)
+
+             # Add edge
+             G.add_edge(entity_id, target, type=rel_type)
+
+         elif "direction" in rel and rel["direction"] == "incoming":
+             source = rel["source"]
+             rel_type = rel["type"]
+
+             # Add source node if not exists
+             if source not in G:
+                 source_info = ontology_manager.get_entity_info(source)
+                 node_type = source_info.get("type", "unknown")
+                 G.add_node(source, type=node_type)
+
+             # Add edge
+             G.add_edge(source, entity_id, type=rel_type)
+
+     # Use a force-directed layout
+     pos = nx.spring_layout(G, k=0.5, iterations=50)
+
+     # Create Plotly figure
+     fig = go.Figure()
+
+     # Add edges, labelling each at its midpoint
+     for source, target, data in G.edges(data=True):
+         x0, y0 = pos[source]
+         x1, y1 = pos[target]
+         rel_type = data.get("type", "unknown")
+
+         # Calculate edge midpoint for label
+         mid_x = (x0 + x1) / 2
+         mid_y = (y0 + y1) / 2
+
+         # Draw edge
+         fig.add_trace(go.Scatter(
+             x=[x0, x1],
+             y=[y0, y1],
+             mode="lines",
+             line=dict(width=1, color="#888"),
+             hoverinfo="text",
+             hovertext=f"Relationship: {rel_type}",
+             showlegend=False
+         ))
+
+         # Add relationship label
+         fig.add_trace(go.Scatter(
+             x=[mid_x],
+             y=[mid_y],
+             mode="text",
+             text=[rel_type],
+             textposition="middle center",
+             textfont=dict(size=10, color="#555"),
+             hoverinfo="none",
+             showlegend=False
+         ))
+
+     # Group nodes by type so each type can be styled separately
+     node_groups = defaultdict(list)
+
+     for node, data in G.nodes(data=True):
+         node_type = data.get("type", "unknown")
+         node_info = ontology_manager.get_entity_info(node)
+
+         # Get friendly name if available
+         name = node
+         if "properties" in node_info and "name" in node_info["properties"]:
+             name = node_info["properties"]["name"]
+
+         node_groups[node_type].append({
+             "id": node,
+             "name": name,
+             "x": pos[node][0],
+             "y": pos[node][1],
+             "info": node_info
+         })
+
+     # Define colors for different node types
+     colors = {
+         "central": "#ff7f0e",  # Highlighted color for central entity
+         "instance": "#1f77b4",
+         "class": "#2ca02c",
+         "unknown": "#d62728"
+     }
+
+     # Add each node group with appropriate styling
+     for node_type, nodes in node_groups.items():
+         # Default to unknown color if type not in map
+         color = colors.get(node_type, colors["unknown"])
+
+         x = [node["x"] for node in nodes]
+         y = [node["y"] for node in nodes]
+         text = [node["name"] for node in nodes]
+
+         # Prepare hover text
+         hover_text = []
+         for node in nodes:
+             info = node["info"]
+             hover = f"ID: {node['id']}<br>Name: {node['name']}"
+
+             if "class_type" in info:
+                 hover += f"<br>Type: {info['class_type']}"
+
+             hover_text.append(hover)
+
+         # Adjust size for central entity
+         size = 20 if node_type == "central" else 15
+
+         fig.add_trace(go.Scatter(
+             x=x,
+             y=y,
+             mode="markers+text",
+             marker=dict(
+                 size=size,
+                 color=color,
+                 line=dict(width=2, color="white")
+             ),
+             text=text,
+             textposition="bottom center",
+             hoverinfo="text",
+             hovertext=hover_text,
+             name=node_type.capitalize()
+         ))
+
+     # Update layout
+     fig.update_layout(
+         title=f"Relationships for {entity_id}",
+         title_x=0.5,
+         showlegend=True,
+         hovermode="closest",
+         margin=dict(b=20, l=5, r=5, t=40),
+         xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
+         yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
+         height=500
+     )
+
+     st.plotly_chart(fig, use_container_width=True)
+
+
+ def display_graph_visualization(knowledge_graph, central_entity=None, max_distance=2):
+     """Display an interactive visualization of the knowledge graph."""
+     st.subheader("🕸️ Knowledge Graph Visualization")
+
+     # Controls for the visualization
+     with st.expander("Visualization Settings", expanded=True):
+         col1, col2, col3 = st.columns(3)
+
+         with col1:
+             include_classes = st.checkbox("Include Classes", value=True)
+
+         with col2:
+             include_instances = st.checkbox("Include Instances", value=True)
+
+         with col3:
+             include_properties = st.checkbox("Include Properties", value=False)
+
+         st.markdown("---")
+
+         col1, col2 = st.columns(2)
+
+         with col1:
+             max_distance = st.slider("Max Relationship Distance", 1, 5, max_distance)
+
+         with col2:
+             layout_algorithm = st.selectbox(
+                 "Layout Algorithm",
+                 ["Force-Directed", "Hierarchical", "Radial", "Circular"],
+                 index=0
+             )
+
+     # Generate HTML visualization
+     html = knowledge_graph.generate_html_visualization(
+         include_classes=include_classes,
+         include_instances=include_instances,
+         central_entity=central_entity,
+         max_distance=max_distance,
+         include_properties=include_properties,
+         layout_algorithm=layout_algorithm.lower()
+     )
+
+     # Render the HTML
+     render_html_in_streamlit(html)
+
+     # Entity filter
+     with st.expander("Focus on Entity", expanded=central_entity is not None):
+         # Get all entities
+         entities = []
+         for class_name in knowledge_graph.ontology_manager.get_classes():
+             entities.extend(knowledge_graph.ontology_manager.get_instances_of_class(class_name))
+
+         # Deduplicate
+         entities = sorted(set(entities))
+
+         # Select entity
+         selected_entity = st.selectbox(
+             "Select Entity to Focus On",
+             ["None"] + entities,
+             index=0 if central_entity is None else entities.index(central_entity) + 1
+         )
+
+         if selected_entity != "None":
+             # Note: st.experimental_rerun is deprecated in recent Streamlit
+             # releases in favor of st.rerun
+             st.button("Focus Graph", on_click=lambda: st.experimental_rerun())
+
+     # Display graph statistics
+     stats = knowledge_graph.get_graph_statistics()
+     if stats:
+         st.markdown("### Graph Statistics")
+
+         col1, col2, col3, col4 = st.columns(4)
+         col1.metric("Nodes", stats.get("node_count", 0))
+         col2.metric("Edges", stats.get("edge_count", 0))
+         col3.metric("Classes", stats.get("class_count", 0))
+         col4.metric("Instances", stats.get("instance_count", 0))
+
+         # Display relationship counts
+         if "relationship_counts" in stats:
+             rel_counts = stats["relationship_counts"]
+             rel_data = [{"Relationship": rel, "Count": count} for rel, count in rel_counts.items()
+                         if rel not in ["subClassOf", "instanceOf"]]  # Filter out structural relationships
+
+             if rel_data:
+                 df = pd.DataFrame(rel_data)
+                 fig = px.bar(df,
+                              x="Relationship",
+                              y="Count",
+                              title="Relationship Distribution",
+                              color="Count",
+                              color_continuous_scale="viridis")
+
+                 st.plotly_chart(fig, use_container_width=True)
+
+
+ def visualize_path(path_info, ontology_manager):
+     """Visualize a semantic path between entities with enhanced graphics and details."""
+     if not path_info or "path" not in path_info:
+         st.warning("No path information available.")
+         return
+
+     st.subheader("🔄 Semantic Path Visualization")
+
+     path = path_info["path"]
+
+     # Get entity information for each node in the path
+     entities = {}
+     all_nodes = set()
+
+     # Add source and target
+     if "source" in path_info:
+         source_id = path_info["source"]
+         all_nodes.add(source_id)
+         entities[source_id] = ontology_manager.get_entity_info(source_id)
+
+     if "target" in path_info:
+         target_id = path_info["target"]
+         all_nodes.add(target_id)
+         entities[target_id] = ontology_manager.get_entity_info(target_id)
+
+     # Add all entities in the path
+     for edge in path:
+         source_id = edge["source"]
+         target_id = edge["target"]
+         all_nodes.add(source_id)
+         all_nodes.add(target_id)
+
+         if source_id not in entities:
+             entities[source_id] = ontology_manager.get_entity_info(source_id)
+
+         if target_id not in entities:
+             entities[target_id] = ontology_manager.get_entity_info(target_id)
+
+     # Create tabs for different views
+     tab1, tab2, tab3 = st.tabs(["Path Visualization", "Entity Details", "Path Summary"])
+
+     with tab1:
+         # Display path as a sequence diagram
+         display_path_visualization(path, entities)
+
+     with tab2:
+         # Display details of entities in the path
+         st.markdown("### Entities in Path")
+
+         # Group entities by type
+         entities_by_type = defaultdict(list)
+         for entity_id in all_nodes:
+             entity_info = entities.get(entity_id, {})
+             entity_type = entity_info.get("class_type", entity_info.get("class", "Unknown"))
+             entities_by_type[entity_type].append((entity_id, entity_info))
+
+         # Create an expander for each entity type
+         for entity_type, entity_list in entities_by_type.items():
+             with st.expander(f"{entity_type} ({len(entity_list)})", expanded=True):
+                 for entity_id, entity_info in entity_list:
+                     st.markdown(f"**{entity_id}**")
+
+                     # Display properties if available
+                     if "properties" in entity_info and entity_info["properties"]:
+                         props_markdown = ", ".join([f"**{k}**: {v}" for k, v in entity_info["properties"].items()])
+                         st.markdown(props_markdown)
+
+                     st.markdown("---")
+
+     with tab3:
+         # Display textual summary of the path
+         st.markdown("### Path Description")
+
+         # If path_info has text, use it
+         if "text" in path_info and path_info["text"]:
+             st.markdown(f"**Path:** {path_info['text']}")
+         else:
+             # Otherwise, generate a description
+             path_steps = []
+             for edge in path:
+                 source_id = edge["source"]
+                 target_id = edge["target"]
+                 relation = edge["type"]
+
+                 # Get readable names if available
+                 source_name = source_id
+                 target_name = target_id
+
+                 if source_id in entities and "properties" in entities[source_id]:
+                     props = entities[source_id]["properties"]
+                     if "name" in props:
+                         source_name = props["name"]
+
+                 if target_id in entities and "properties" in entities[target_id]:
+                     props = entities[target_id]["properties"]
+                     if "name" in props:
+                         target_name = props["name"]
+
+                 path_steps.append(f"{source_name} **{relation}** {target_name}")
+
+             st.markdown(" → ".join(path_steps))
+
+         # Display relevant business rules
+         relevant_rules = find_relevant_rules_for_path(path, ontology_manager)
+         if relevant_rules:
+             st.markdown("### Relevant Business Rules")
+             for rule in relevant_rules:
+                 st.markdown(f"- **{rule['id']}**: {rule['description']}")
+
+
+ def display_path_visualization(path, entities):
960
+ """Create an enhanced visual representation of the path."""
961
+ if not path:
962
+ st.info("Path is empty.")
963
+ return
964
+
965
+ # Create nodes and positions
966
+ nodes = []
967
+ x_positions = {}
968
+
969
+ # Collect all unique nodes in the path
970
+ unique_nodes = set()
971
+ for edge in path:
972
+ unique_nodes.add(edge["source"])
973
+ unique_nodes.add(edge["target"])
974
+
975
+ # Create ordered list of nodes
976
+ path_nodes = []
977
+ if path:
978
+ # Start with the first source
979
+ current_node = path[0]["source"]
980
+ path_nodes.append(current_node)
981
+
982
+ # Follow the path
983
+ for edge in path:
984
+ target = edge["target"]
985
+ path_nodes.append(target)
986
+ current_node = target
987
+ else:
988
+ # If no path, just use the unique nodes
989
+ path_nodes = list(unique_nodes)
990
+
991
+ # Assign positions along a line
992
+ for i, node_id in enumerate(path_nodes):
993
+ x_positions[node_id] = i
994
+
995
+ # Get node info
996
+ entity_info = entities.get(node_id, {})
997
+ properties = entity_info.get("properties", {})
998
+ entity_type = entity_info.get("class_type", entity_info.get("class", "Unknown"))
999
+
1000
+ # Get display name
1001
+ name = properties.get("name", node_id)
1002
+
1003
+ nodes.append({
1004
+ "id": node_id,
1005
+ "name": name,
1006
+ "type": entity_type,
1007
+ "properties": properties
1008
+ })
1009
+
1010
+ # Create Plotly figure for horizontal path
1011
+ fig = go.Figure()
1012
+
1013
+ # Add nodes
1014
+ node_x = []
1015
+ node_y = []
1016
+ node_text = []
1017
+ node_hover = []
1018
+ node_colors = []
1019
+
1020
+ # Color mapping for entity types
1021
+ color_map = {}
1022
+ for node in nodes:
1023
+ node_type = node["type"]
1024
+ if node_type not in color_map:
1025
+ # Assign colors from a categorical colorscale
1026
+ idx = len(color_map) % len(px.colors.qualitative.Plotly)
1027
+ color_map[node_type] = px.colors.qualitative.Plotly[idx]
1028
+
1029
+ for node in nodes:
1030
+ node_x.append(x_positions[node["id"]])
1031
+ node_y.append(0) # All nodes at y=0 for a horizontal path
1032
+ node_text.append(node["name"])
1033
+
1034
+ # Create detailed hover text
1035
+ hover = f"{node['id']}<br>{node['type']}"
1036
+ for k, v in node["properties"].items():
1037
+ hover += f"<br>{k}: {v}"
1038
+ node_hover.append(hover)
1039
+
1040
+ # Set node color by type
1041
+ node_colors.append(color_map.get(node["type"], "#7f7f7f"))
1042
+
1043
+ # Add node trace
1044
+ fig.add_trace(go.Scatter(
1045
+ x=node_x,
1046
+ y=node_y,
1047
+ mode="markers+text",
1048
+ marker=dict(
1049
+ size=30,
1050
+ color=node_colors,
1051
+ line=dict(width=2, color="DarkSlateGrey")
1052
+ ),
1053
+ text=node_text,
1054
+ textposition="bottom center",
1055
+ hovertext=node_hover,
1056
+ hoverinfo="text",
1057
+ name="Entities"
1058
+ ))
1059
+
1060
+ # Add edges with relationship labels
1061
+ for edge in path:
1062
+ source = edge["source"]
1063
+ target = edge["target"]
1064
+ edge_type = edge["type"]
1065
+
1066
+ source_pos = x_positions[source]
1067
+ target_pos = x_positions[target]
1068
+
1069
+ # Add edge line
1070
+ fig.add_trace(go.Scatter(
1071
+ x=[source_pos, target_pos],
1072
+ y=[0, 0],
1073
+ mode="lines",
1074
+ line=dict(width=2, color="#888"),
1075
+ hoverinfo="none",
1076
+ showlegend=False
1077
+ ))
1078
+
1079
+ # Add relationship label above the line
1080
+ fig.add_trace(go.Scatter(
1081
+ x=[(source_pos + target_pos) / 2],
1082
+ y=[0.1], # Slightly above the line
1083
+ mode="text",
1084
+ text=[edge_type],
1085
+ textposition="top center",
1086
+ hoverinfo="none",
1087
+ showlegend=False
1088
+ ))
1089
+
1090
+ # Update layout
1091
+ fig.update_layout(
1092
+ title="Path Visualization",
1093
+ showlegend=False,
1094
+ hovermode="closest",
1095
+ margin=dict(b=40, l=20, r=20, t=40),
1096
+ xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
1097
+ yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
1098
+ height=300,
1099
+ plot_bgcolor="white"
1100
+ )
1101
+
1102
+ # Add a legend for entity types
1103
+ for entity_type, color in color_map.items():
1104
+ fig.add_trace(go.Scatter(
1105
+ x=[None],
1106
+ y=[None],
1107
+ mode="markers",
1108
+ marker=dict(size=10, color=color),
1109
+ name=entity_type,
1110
+ showlegend=True
1111
+ ))
1112
+
1113
+ fig.update_layout(legend=dict(
1114
+ orientation="h",
1115
+ yanchor="bottom",
1116
+ y=-0.3,
1117
+ xanchor="center",
1118
+ x=0.5
1119
+ ))
1120
+
1121
+ st.plotly_chart(fig, use_container_width=True)
1122
+
1123
+ # Add step-by-step description
1124
+ st.markdown("### Step-by-Step Path")
1125
+ for i, edge in enumerate(path):
1126
+ source = edge["source"]
1127
+ target = edge["target"]
1128
+ relation = edge["type"]
1129
+
1130
+ # Get display names
1131
+ source_info = entities.get(source, {})
1132
+ target_info = entities.get(target, {})
1133
+
1134
+ source_name = source
1135
+ if "properties" in source_info and "name" in source_info["properties"]:
1136
+ source_name = source_info["properties"]["name"]
1137
+
1138
+ target_name = target
1139
+ if "properties" in target_info and "name" in target_info["properties"]:
1140
+ target_name = target_info["properties"]["name"]
1141
+
1142
+ st.markdown(f"**Step {i+1}:** {source_name} ({source}) **{relation}** {target_name} ({target})")
1143
+
1144
+
1145
+ def find_relevant_rules_for_path(path, ontology_manager):
1146
+ """Find business rules relevant to the entities and relationships in a path."""
1147
+ rules = ontology_manager.ontology_data.get("rules", [])
1148
+ if not rules:
1149
+ return []
1150
+
1151
+ # Extract entities and relationships from the path
1152
+ entity_types = set()
1153
+ relationship_types = set()
1154
+
1155
+ for edge in path:
1156
+ source = edge["source"]
1157
+ target = edge["target"]
1158
+ relation = edge["type"]
1159
+
1160
+ # Get entity info
1161
+ source_info = ontology_manager.get_entity_info(source)
1162
+ target_info = ontology_manager.get_entity_info(target)
1163
+
1164
+ # Add entity types
1165
+ if "class_type" in source_info:
1166
+ entity_types.add(source_info["class_type"])
1167
+
1168
+ if "class_type" in target_info:
1169
+ entity_types.add(target_info["class_type"])
1170
+
1171
+ # Add relationship type
1172
+ relationship_types.add(relation)
1173
+
1174
+ # Find rules that mention these entities or relationships
1175
+ relevant_rules = []
1176
+
1177
+ for rule in rules:
1178
+ rule_text = json.dumps(rule).lower()
1179
+
1180
+ # Check if rule mentions any of the entity types or relationships
1181
+ is_relevant = False
1182
+
1183
+ for entity_type in entity_types:
1184
+ if entity_type.lower() in rule_text:
1185
+ is_relevant = True
1186
+ break
1187
+
1188
+ if not is_relevant:
1189
+ for rel_type in relationship_types:
1190
+ if rel_type.lower() in rule_text:
1191
+ is_relevant = True
1192
+ break
1193
+
1194
+ if is_relevant:
1195
+ relevant_rules.append(rule)
1196
+
1197
+ return relevant_rules
1198
+
1199
+
1200
+ def display_reasoning_trace(query: str, retrieved_docs: List[Dict], answer: str, ontology_manager):
1201
+ """Display an enhanced trace of how ontological reasoning was used to answer the query."""
1202
+ st.subheader("🧠 Ontology-Enhanced Reasoning")
1203
+
1204
+ # Create a multi-tab interface for different aspects of reasoning
1205
+ tab1, tab2, tab3 = st.tabs(["Query Analysis", "Knowledge Retrieval", "Reasoning Path"])
1206
+
1207
+ with tab1:
1208
+ # Extract entity and relationship mentions with confidence
1209
+ entity_mentions, relationship_mentions = analyze_query_ontology_concepts(query, ontology_manager)
1210
+
1211
+ # Display detected entities with confidence scores
1212
+ if entity_mentions:
1213
+ st.markdown("### Entities Detected in Query")
1214
+
1215
+ # Convert to DataFrame for visualization
1216
+ entity_df = pd.DataFrame([{
1217
+ "Entity Type": e["type"],
1218
+ "Confidence": e["confidence"],
1219
+ "Description": e["description"]
1220
+ } for e in entity_mentions])
1221
+
1222
+ # Sort by confidence
1223
+ entity_df = entity_df.sort_values("Confidence", ascending=False)
1224
+
1225
+ # Create a horizontal bar chart
1226
+ fig = px.bar(entity_df,
1227
+ x="Confidence",
1228
+ y="Entity Type",
1229
+ orientation='h',
1230
+ title="Entity Type Detection Confidence",
1231
+ color="Confidence",
1232
+ color_continuous_scale="Blues",
1233
+ text="Confidence")
1234
+
1235
+ fig.update_traces(texttemplate='%{text:.0%}', textposition='outside')
1236
+ fig.update_layout(xaxis_tickformat=".0%")
1237
+
1238
+ st.plotly_chart(fig, use_container_width=True)
1239
+
1240
+ # Display descriptions
1241
+ st.subheader("Entity Type Descriptions")
1242
+ st.dataframe(
1243
+ entity_df[["Entity Type", "Description"]],
1244
+ hide_index=True
1245
+ )
1246
+
1247
+ # Display detected relationships
1248
+ if relationship_mentions:
1249
+ st.markdown("### Relationships Detected in Query")
1250
+
1251
+ # Convert to DataFrame
1252
+ rel_df = pd.DataFrame([{
1253
+ "Relationship": r["name"],
1254
+ "From": r["domain"],
1255
+ "To": r["range"],
1256
+ "Confidence": r["confidence"],
1257
+ "Description": r["description"]
1258
+ } for r in relationship_mentions])
1259
+
1260
+ # Sort by confidence
1261
+ rel_df = rel_df.sort_values("Confidence", ascending=False)
1262
+
1263
+ # Create visualization
1264
+ fig = px.bar(rel_df,
1265
+ x="Confidence",
1266
+ y="Relationship",
1267
+ orientation='h',
1268
+ title="Relationship Detection Confidence",
1269
+ color="Confidence",
1270
+ color_continuous_scale="Reds",
1271
+ text="Confidence")
1272
+
1273
+ fig.update_traces(texttemplate='%{text:.0%}', textposition='outside')
1274
+ fig.update_layout(xaxis_tickformat=".0%")
1275
+
1276
+ st.plotly_chart(fig, use_container_width=True)
1277
+
1278
+ # Display relationship details
1279
+ st.subheader("Relationship Details")
1280
+ st.dataframe(
1281
+ rel_df[["Relationship", "From", "To", "Description"]],
1282
+ hide_index=True
1283
+ )
1284
+
1285
+ with tab2:
1286
+ # Create an enhanced visualization of the retrieval process
1287
+ st.markdown("### Knowledge Retrieval Process")
1288
+
1289
+ # Group retrieved documents by source
1290
+ docs_by_source = defaultdict(list)
1291
+ for doc in retrieved_docs:
1292
+ if hasattr(doc, 'metadata'):
1293
+ source = doc.metadata.get('source', 'unknown')
1294
+ docs_by_source[source].append(doc)
1295
+ else:
1296
+ docs_by_source['unknown'].append(doc)
1297
+
1298
+ # Display retrieval visualization
1299
+ col1, col2 = st.columns([2, 1])
1300
+
1301
+ with col1:
1302
+ # Create a Sankey diagram to show flow from query to sources to answer
1303
+ display_retrieval_flow(query, docs_by_source)
1304
+
1305
+ with col2:
1306
+ # Display source distribution
1307
+ source_counts = {source: len(docs) for source, docs in docs_by_source.items()}
1308
+
1309
+ # Create a pie chart
1310
+ fig = px.pie(
1311
+ values=list(source_counts.values()),
1312
+ names=list(source_counts.keys()),
1313
+ title="Retrieved Context Sources",
1314
+ color_discrete_sequence=px.colors.qualitative.Plotly
1315
+ )
1316
+
1317
+ st.plotly_chart(fig, use_container_width=True)
1318
+
1319
+ # Display retrieved document details in expandable sections
1320
+ for source, docs in docs_by_source.items():
1321
+ with st.expander(f"{source.capitalize()} ({len(docs)})", expanded=source == "ontology_context"):
1322
+ for i, doc in enumerate(docs):
1323
+ # Add separator between documents
1324
+ if i > 0:
1325
+ st.markdown("---")
1326
+
1327
+ # Display document content
1328
+ if hasattr(doc, 'page_content'):
1329
+ st.markdown(f"**Content:**")
1330
+
1331
+ # Format depending on source
1332
+ if source in ["ontology", "ontology_context"]:
1333
+ st.markdown(doc.page_content)
1334
+ else:
1335
+ st.code(doc.page_content)
1336
+
1337
+ # Display metadata if present
1338
+ if hasattr(doc, 'metadata') and doc.metadata:
1339
+ st.markdown("**Metadata:**")
1340
+ for key, value in doc.metadata.items():
1341
+ if key != 'source': # Already shown in section title
1342
+ st.markdown(f"- **{key}**: {value}")
1343
+
1344
+ with tab3:
1345
+ # Show the reasoning flow from query to answer
1346
+ st.markdown("### Ontological Reasoning Process")
1347
+
1348
+ # Display reasoning steps
1349
+ reasoning_steps = generate_reasoning_steps(query, entity_mentions, relationship_mentions, retrieved_docs, answer)
1350
+
1351
+ for i, step in enumerate(reasoning_steps):
1352
+ with st.expander(f"Step {i+1}: {step['title']}", expanded=i == 0):
1353
+ st.markdown(step["description"])
1354
+
1355
+ # Visualization of how ontological structure influenced the answer
1356
+ st.markdown("### How Ontology Enhanced the Answer")
1357
+
1358
+ # Display ontology advantage explanation
1359
+ advantages = explain_ontology_advantages(entity_mentions, relationship_mentions)
1360
+
1361
+ for adv in advantages:
1362
+ st.markdown(f"**{adv['title']}**")
1363
+ st.markdown(adv["description"])
1364
+
1365
+
1366
+ def analyze_query_ontology_concepts(query: str, ontology_manager) -> Tuple[List[Dict], List[Dict]]:
1367
+ """
1368
+ Analyze the query to identify ontology concepts with confidence scores.
1369
+ This is a simplified implementation that would be replaced with NLP in production.
1370
+ """
1371
+ query_lower = query.lower().split()
1372
+
1373
+ # Entity detection
1374
+ entity_mentions = []
1375
+ classes = ontology_manager.get_classes()
1376
+
1377
+ for class_name in classes:
1378
+ # Simple token matching (would use NER in production)
1379
+ if class_name.lower() in query_lower:
1380
+ # Get class info
1381
+ class_info = ontology_manager.ontology_data["classes"].get(class_name, {})
1382
+
1383
+ # Assign a confidence score (this would be from an ML model in production)
1384
+ # Here we use a simple heuristic based on word length and specificity
1385
+ confidence = min(0.95, 0.5 + (len(class_name) / 20))
1386
+
1387
+ entity_mentions.append({
1388
+ "type": class_name,
1389
+ "confidence": confidence,
1390
+ "description": class_info.get("description", "")
1391
+ })
1392
+
1393
+ # Relationship detection
1394
+ relationship_mentions = []
1395
+ relationships = ontology_manager.ontology_data.get("relationships", [])
1396
+
1397
+ for rel in relationships:
1398
+ rel_name = rel["name"]
1399
+
1400
+ # Simple token matching
1401
+ if rel_name.lower() in query_lower:
1402
+ # Assign confidence
1403
+ confidence = min(0.9, 0.5 + (len(rel_name) / 20))
1404
+
1405
+ relationship_mentions.append({
1406
+ "name": rel_name,
1407
+ "domain": rel["domain"],
1408
+ "range": rel["range"],
1409
+ "confidence": confidence,
1410
+ "description": rel.get("description", "")
1411
+ })
1412
+
1413
+ return entity_mentions, relationship_mentions
1414
+
1415
+
1416
+ def display_retrieval_flow(query: str, docs_by_source: Dict[str, List]):
1417
+ """Create a Sankey diagram showing the flow from query to sources to answer."""
1418
+ # Define node labels
1419
+ nodes = ["Query"]
1420
+
1421
+ # Add source nodes
1422
+ for source in docs_by_source.keys():
1423
+ nodes.append(f"Source: {source.capitalize()}")
1424
+
1425
+ nodes.append("Answer")
1426
+
1427
+ # Define links
1428
+ source_indices = []
1429
+ target_indices = []
1430
+ values = []
1431
+
1432
+ # Links from query to sources
1433
+ for i, (source, docs) in enumerate(docs_by_source.items()):
1434
+ source_indices.append(0) # Query is index 0
1435
+ target_indices.append(i + 1) # Source indices start at 1
1436
+ values.append(len(docs)) # Width based on number of docs
1437
+
1438
+ # Links from sources to answer
1439
+ for i in range(len(docs_by_source)):
1440
+ source_indices.append(i + 1) # Source index
1441
+ target_indices.append(len(nodes) - 1) # Answer is last node
1442
+ values.append(values[i]) # Same width as query to source
1443
+
1444
+ # Create Sankey diagram
1445
+ fig = go.Figure(data=[go.Sankey(
1446
+ node=dict(
1447
+ pad=15,
1448
+ thickness=20,
1449
+ line=dict(color="black", width=0.5),
1450
+ label=nodes,
1451
+ color=["#1f77b4"] + [px.colors.qualitative.Plotly[i % len(px.colors.qualitative.Plotly)]
1452
+ for i in range(len(docs_by_source))] + ["#2ca02c"]
1453
+ ),
1454
+ link=dict(
1455
+ source=source_indices,
1456
+ target=target_indices,
1457
+ value=values
1458
+ )
1459
+ )])
1460
+
1461
+ fig.update_layout(
1462
+ title="Information Flow in RAG Process",
1463
+ font=dict(size=12)
1464
+ )
1465
+
1466
+ st.plotly_chart(fig, use_container_width=True)
1467
+
1468
+
1469
+ def generate_reasoning_steps(query: str, entity_mentions: List[Dict], relationship_mentions: List[Dict],
1470
+ retrieved_docs: List[Dict], answer: str) -> List[Dict]:
1471
+ """Generate reasoning steps to explain how the system arrived at the answer."""
1472
+ steps = []
1473
+
1474
+ # Step 1: Query Understanding
1475
+ steps.append({
1476
+ "title": "Query Understanding",
1477
+ "description": f"""The system analyzes the query "{query}" and identifies key concepts from the ontology.
1478
+ {len(entity_mentions)} entity types and {len(relationship_mentions)} relationship types are recognized, allowing
1479
+ the system to understand the semantic context of the question."""
1480
+ })
1481
+
1482
+ # Step 2: Knowledge Retrieval
1483
+ if retrieved_docs:
1484
+ doc_count = len(retrieved_docs)
1485
+ ontology_count = sum(1 for doc in retrieved_docs if hasattr(doc, 'metadata') and
1486
+ doc.metadata.get('source', '') in ['ontology', 'ontology_context'])
1487
+
1488
+ steps.append({
1489
+ "title": "Knowledge Retrieval",
1490
+ "description": f"""Based on the identified concepts, the system retrieves {doc_count} relevant pieces of information,
1491
+ including {ontology_count} from the structured ontology. This hybrid approach combines traditional vector retrieval
1492
+ with ontology-aware semantic retrieval, enabling access to both explicit and implicit knowledge."""
1493
+ })
1494
+
1495
+ # Step 3: Relationship Traversal
1496
+ if relationship_mentions:
1497
+ rel_names = [r["name"] for r in relationship_mentions]
1498
+ steps.append({
1499
+ "title": "Relationship Traversal",
1500
+ "description": f"""The system identifies key relationships in the ontology: {', '.join(rel_names)}.
1501
+ By traversing these relationships, the system can connect concepts that might not appear together in the same text,
1502
+ allowing for multi-hop reasoning across the knowledge graph."""
1503
+ })
1504
+
1505
+ # Step 4: Ontological Inference
1506
+ if entity_mentions:
1507
+ entity_types = [e["type"] for e in entity_mentions]
1508
+ steps.append({
1509
+ "title": "Ontological Inference",
1510
+ "description": f"""Using the hierarchical structure of entities like {', '.join(entity_types)},
1511
+ the system makes inferences based on class inheritance and relationship constraints defined in the ontology.
1512
+ This allows it to reason about properties and relationships that might not be explicitly stated."""
1513
+ })
1514
+
1515
+ # Step 5: Answer Generation
1516
+ steps.append({
1517
+ "title": "Answer Synthesis",
1518
+ "description": f"""Finally, the system synthesizes the retrieved information and ontological knowledge to generate a comprehensive answer.
1519
+ The structured nature of the ontology ensures that the answer accurately reflects the relationships between concepts
1520
+ and respects the business rules defined in the knowledge model."""
1521
+ })
1522
+
1523
+ return steps
1524
+
1525
+
1526
+ def explain_ontology_advantages(entity_mentions: List[Dict], relationship_mentions: List[Dict]) -> List[Dict]:
1527
+ """Explain how ontology enhanced the RAG process."""
1528
+ advantages = []
1529
+
1530
+ if entity_mentions:
1531
+ advantages.append({
1532
+ "title": "Hierarchical Knowledge Representation",
1533
+ "description": """The ontology provides a hierarchical class structure that enables the system to understand
1534
+ that concepts are related through is-a relationships. For instance, knowing that a Manager is an Employee
1535
+ allows the system to apply Employee-related knowledge when answering questions about Managers, even if
1536
+ the specific information was only stated for Employees in general."""
1537
+ })
1538
+
1539
+ if relationship_mentions:
1540
+ advantages.append({
1541
+ "title": "Explicit Relationship Semantics",
1542
+ "description": """The ontology defines explicit relationships between concepts with clear semantics.
1543
+ This allows the system to understand how entities are connected beyond simple co-occurrence in text.
1544
+ For example, understanding that 'ownedBy' connects Products to Departments helps answer questions
1545
+ about product ownership and departmental responsibilities."""
1546
+ })
1547
+
1548
+ advantages.append({
1549
+ "title": "Constraint-Based Reasoning",
1550
+ "description": """Business rules in the ontology provide constraints that guide the reasoning process.
1551
+ These rules ensure the system's answers are consistent with the organization's policies and practices.
1552
+ For instance, rules about approval workflows or data classification requirements can inform answers
1553
+ about process-related questions."""
1554
+ })
1555
+
1556
+ advantages.append({
1557
+ "title": "Cross-Domain Knowledge Integration",
1558
+ "description": """The ontology connects concepts across different domains of the enterprise, enabling
1559
+ integrated reasoning that traditional document-based retrieval might miss. This allows the system to
1560
+ answer questions that span organizational boundaries, such as how marketing decisions affect product
1561
+ development or how customer feedback influences business strategy."""
1562
+ })
1563
+
1564
+ return advantages
static/css/styles.css ADDED
@@ -0,0 +1,83 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ /* Custom styling for ontology-RAG application */
2
+
3
+ /* Main container styles */
4
+ .main-container {
5
+ padding: 20px;
6
+ max-width: 1200px;
7
+ margin: 0 auto;
8
+ }
9
+
10
+ /* Enhance visualization elements */
11
+ .vis-network {
12
+ border: 1px solid #ddd;
13
+ border-radius: 8px;
14
+ box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
15
+ }
16
+
17
+ /* Custom tooltip styling */
18
+ .vis-tooltip {
19
+ position: absolute;
20
+ background-color: rgba(255, 255, 255, 0.95);
21
+ border: 1px solid #ccc;
22
+ border-radius: 5px;
23
+ padding: 12px;
24
+ font-family: Arial, sans-serif;
25
+ font-size: 13px;
26
+ color: #333;
27
+ max-width: 350px;
28
+ z-index: 9999;
29
+ box-shadow: 0 4px 8px rgba(0, 0, 0, 0.15);
30
+ }
31
+
32
+ /* Enhance legend appearance */
33
+ .graph-legend {
34
+ background-color: rgba(255, 255, 255, 0.9) !important;
35
+ border: 1px solid #eee !important;
36
+ border-radius: 8px !important;
37
+ box-shadow: 0 2px 6px rgba(0, 0, 0, 0.1) !important;
38
+ }
39
+
40
+ /* Styling for entity detail cards */
41
+ .entity-detail-card {
42
+ border: 1px solid #eee;
43
+ border-radius: 5px;
44
+ padding: 15px;
45
+ margin-bottom: 15px;
46
+ box-shadow: 0 2px 4px rgba(0, 0, 0, 0.05);
47
+ }
48
+
49
+ /* Highlight for central entities */
50
+ .central-entity {
51
+ border-left: 4px solid #ff7f0e;
52
+ padding-left: 12px;
53
+ }
54
+
55
+ /* Enhanced path visualization */
56
+ .path-step {
57
+ padding: 8px;
58
+ margin: 8px 0;
59
+ border-left: 3px solid #1f77b4;
60
+ background-color: #f8f9fa;
61
+ }
62
+
63
+ /* Customization for Streamlit components */
64
+ .stButton button {
65
+ border-radius: 20px;
66
+ padding: 5px 15px;
67
+ }
68
+
69
+ .stSelectbox label {
70
+ font-weight: 500;
71
+ }
72
+
73
+ /* Tabs customization */
74
+ .streamlit-tabs .stTabs [role="tab"] {
75
+ font-size: 15px;
76
+ padding: 8px 16px;
77
+ }
78
+
79
+ /* Expander customization */
80
+ .streamlit-expanderContent {
81
+ border-left: 1px solid #ddd;
82
+ padding-left: 10px;
83
+ }