Upload 14 files
- .streamlit/config.toml +13 -0
- DEPLOYMENT_GUIDE.md +139 -0
- README.md +188 -13
- app.py +683 -0
- data/enterprise_ontology.json +771 -0
- data/enterprise_ontology.txt +10 -0
- huggingface.yml +8 -0
- requirements.txt +14 -0
- src/__init__.py +1 -0
- src/knowledge_graph.py +920 -0
- src/ontology_manager.py +440 -0
- src/semantic_retriever.py +233 -0
- src/visualization.py +1564 -0
- static/css/styles.css +83 -0
.streamlit/config.toml
ADDED
[server]
headless = true
enableCORS = false

[browser]
gatherUsageStats = false

[theme]
primaryColor = "#4B6BFF"
backgroundColor = "#FAFAFA"
secondaryBackgroundColor = "#F0F2F6"
textColor = "#262730"
font = "sans serif"
DEPLOYMENT_GUIDE.md
ADDED
# Deployment Guide for Ontology-Enhanced RAG System

This guide will help you deploy the Ontology-Enhanced RAG demonstration to Hugging Face Spaces.

## Prerequisites

1. **Hugging Face Account**: You need a Hugging Face account.
2. **OpenAI API Key**: You need a valid OpenAI API key.

## Deployment Steps

### 1. Prepare Your Repository

Ensure your repository contains the following files and directories:

- `app.py`: Main Streamlit application
- `src/`: Directory containing all source code
- `data/`: Directory containing the ontology JSON and other data
- `.streamlit/`: Directory containing Streamlit configuration
- `static/`: Directory containing CSS and other static assets
- `requirements.txt`: List of all dependencies
- `huggingface.yml`: Hugging Face Space configuration

### 2. Set Up Hugging Face Space

1. Visit [Hugging Face](https://huggingface.co/) and log in
2. Click "New" → "Space" in the top right corner
3. Fill in the Space settings:
   - **Owner**: Select your username or organization
   - **Space name**: Choose a name for your demo, e.g., "ontology-rag-demo"
   - **License**: Choose MIT or your preferred license
   - **SDK**: Select Streamlit
   - **Space hardware**: Choose according to your needs (minimum requirement: CPU + 4GB RAM)
4. Click "Create Space"

### 3. Configure Space Secrets

You need to add your OpenAI API key as a secret:

1. In your Space page, go to the "Settings" tab
2. Scroll down to the "Repository secrets" section
3. Click "New secret"
4. Add the following secret:
   - **Name**: `OPENAI_API_KEY`
   - **Value**: Your OpenAI API key
5. Click "Add secret"

### 4. Upload Your Code

There are two ways to upload your code:

#### Option A: Upload via Web Interface

1. In your Space page, go to the "Files" tab
2. Use the upload button to upload all necessary files and directories
3. Ensure you maintain the correct directory structure

#### Option B: Upload via Git (Recommended)

1. Clone your Space repository:
   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   ```
2. Copy all your files into the cloned repository
3. Add, commit, and push the changes:
   ```bash
   git add .
   git commit -m "Initial commit"
   git push
   ```

### 5. Verify Deployment

1. Visit your Space URL (in the format `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`)
2. Confirm that the application loads and runs correctly
3. Test all features

## Hardware Recommendations

For optimal performance, consider the following hardware configurations:

- **Minimal**: CPU + 4GB RAM (suitable for demos with limited users)
- **Recommended**: CPU + 16GB RAM (for better performance with knowledge graph visualizations)

## Troubleshooting

If you encounter issues:

1. **Application fails to start**:
   - Check that the Streamlit version is compatible
   - Verify all dependencies are correctly installed
   - Check the Space logs for error messages
2. **OpenAI API errors**:
   - Confirm the API key is correctly set as a secret
   - Verify the API key is valid and has sufficient quota
3. **Display issues**:
   - Try simplifying visualizations, as they might be memory-intensive
   - Check logs for any warnings or errors
4. **NetworkX or visualization issues**:
   - Ensure pygraphviz is properly installed
   - For simpler deployment, you can modify the code to use alternative layout algorithms that don't depend on Graphviz

## Deployment Optimizations

For production deployments, consider these optimizations:

1. **Resource Management**:
   - Choose appropriate hardware (CPU + RAM) to meet your application's needs
   - Consider optimizing large visualizations to reduce memory usage
2. **Performance**:
   - Implement result caching for common queries
   - Consider pre-computing common graph layouts
3. **Security**:
   - Ensure no sensitive data is stored in the codebase
   - Store all credentials using environment variables or secrets

## Memory Optimization Tips

If you encounter memory issues with large ontologies:

1. Limit the maximum number of nodes in visualizations
2. Implement pagination for large result sets
3. Use streaming responses for large text outputs
4. Optimize NetworkX operations for large graphs

## Additional Resources

- [Streamlit Deployment Documentation](https://docs.streamlit.io/streamlit-community-cloud/get-started)
- [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
- [OpenAI API Documentation](https://platform.openai.com/docs/api-reference)
- [NetworkX Documentation](https://networkx.org/documentation/stable/)
- [FAISS Documentation](https://github.com/facebookresearch/faiss/wiki)
README.md
CHANGED
# Enhanced Ontology-RAG System

## Project Overview

This repository contains an advanced Retrieval-Augmented Generation (RAG) system that integrates structured ontologies with language models. The system demonstrates how formal ontological knowledge representation can enhance traditional vector-based retrieval methods to provide more accurate, contextually rich, and logically consistent answers to user queries.

The project implements a sophisticated architecture that combines:

- JSON-based ontology representation with classes, relationships, rules, and instances
- Knowledge graph visualization for exploring entity relationships
- Semantic path finding for multi-hop reasoning between concepts
- Comparative analysis between traditional vector-based RAG and ontology-enhanced RAG

The application is built with **Streamlit** for the frontend interface, uses **FAISS** for vector embeddings, **NetworkX** for graph representation, and integrates with **OpenAI's language models** for generating responses.

## Key Features

1. **RAG Comparison Demo**
   - Side-by-side comparison of traditional and ontology-enhanced RAG
   - Analysis of differences in answers and retrieved context
2. **Knowledge Graph Visualization**
   - Interactive network graph for exploring the ontology structure
   - Multiple layout algorithms (force-directed, hierarchical, radial, circular)
   - Entity relationship exploration with customizable focus
3. **Ontology Structure Analysis**
   - Visualization of class hierarchies and statistics
   - Relationship usage and domain-range distribution analysis
   - Graph statistics including node counts, edge counts, and centrality metrics
4. **Entity Exploration**
   - Detailed entity information cards showing properties and relationships
   - Relationship graphs centered on specific entities
   - Neighborhood exploration for entities
5. **Semantic Path Visualization**
   - Path visualization between entities with step-by-step explanation
   - Visual representation of paths through the knowledge graph
   - Connection to relevant business rules
6. **Reasoning Trace Visualization**
   - Query analysis with entity and relationship detection
   - Sankey diagrams showing information flow in the RAG process
   - Explanation of reasoning steps

## Ontology Structure Example

The `data/enterprise_ontology.json` file contains a rich enterprise ontology that models organizational knowledge. Here's a breakdown of its key components:

### Classes (Entity Types)

The ontology defines a hierarchical class structure with inheritance relationships. For example:

- **Entity** (base class)
  - **FinancialEntity** → Budget, Revenue, Expense
  - **Asset** → PhysicalAsset, DigitalAsset, IntellectualProperty
  - **Person** → InternalPerson → Employee → Manager
  - **Process** → BusinessProcess, DevelopmentProcess, SupportProcess
  - **Market** → GeographicMarket, DemographicMarket, BusinessMarket

Each class has a description and a set of defined properties. For instance, the `Employee` class includes properties like role, hire date, and performance rating.

### Relationships

The ontology defines explicit relationships between entity types, including:

- `ownedBy`: Connects Product to Department
- `managedBy`: Connects Department to Manager
- `worksOn`: Connects Employee to Product
- `purchases`: Connects Customer to Product
- `provides`: Connects Customer to Feedback
- `optimizedBy`: Relates Product to Feedback

Each relationship has metadata such as domain, range, cardinality, and inverse relationship name.

### Business Rules

The ontology contains formal business rules that constrain the knowledge model:

- "Every Product must be owned by exactly one Department"
- "Every Department must be managed by exactly one Manager"
- "Critical support tickets must be assigned to Senior employees or managers"
- "Product Lifecycle stages must follow a predefined sequence"

### Instances

The ontology includes concrete instances of the defined classes, such as:

- `product1`: An "Enterprise Analytics Suite" owned by the Engineering department
- `manager1`: A director named "Jane Smith" who manages the Engineering department
- `customer1`: "Acme Corp" who has purchased product1 and provided feedback

Each instance has properties and relationships to other instances, forming a connected knowledge graph.

This structured knowledge representation allows the system to perform semantic reasoning beyond what would be possible with simple text-based approaches, enabling it to answer complex queries that require understanding of hierarchical relationships, business rules, and multi-step connections between entities.

## Getting Started

### Prerequisites

- Python 3.8+
- OpenAI API key

### Installation

1. Clone this repository
2. Install the required dependencies:
   ```
   pip install -r requirements.txt
   ```
3. Set up your OpenAI API key as an environment variable or in the Streamlit secrets

### Running the Application

To run the application locally:

```
streamlit run app.py
```

For deployment instructions, please refer to the [DEPLOYMENT_GUIDE.md](DEPLOYMENT_GUIDE.md).

## Project Structure

```
ontology-rag/
├── .streamlit/
│   └── config.toml              # Streamlit configuration
├── data/
│   ├── enterprise_ontology.json # Enterprise ontology data
│   └── enterprise_ontology.txt  # Simplified text representation of the ontology
├── src/
│   ├── __init__.py
│   ├── knowledge_graph.py       # Knowledge graph processing
│   ├── ontology_manager.py      # Ontology management
│   ├── semantic_retriever.py    # Semantic retrieval
│   └── visualization.py         # Visualization functions
├── static/
│   └── css/
│       └── styles.css           # Custom styles
├── app.py                       # Main application
├── requirements.txt             # Dependency list
├── huggingface.yml              # Hugging Face Space configuration
├── DEPLOYMENT_GUIDE.md          # Deployment instructions
└── README.md                    # This file
```

## Use Cases

### Enterprise Knowledge Management
The ontology-enhanced RAG system can help organizations effectively organize and access their knowledge assets, connecting information across different departments and systems to provide more comprehensive business insights.

### Product Development Decision Support
By understanding the relationships between customer feedback, product features, and market data, the system can provide more valuable support for product development decisions.

### Complex Compliance Queries
In compliance scenarios where multiple rules and relationships must be considered, the ontology-enhanced RAG can provide rule-based reasoning to ensure recommendations comply with all applicable policies and regulations.

### Diagnostics and Troubleshooting
In technical support and troubleshooting scenarios, the system can connect symptoms, causes, and solutions through multi-hop reasoning to provide more accurate diagnoses.

## Acknowledgments

This project demonstrates the integration of ontological knowledge with RAG systems for enhanced query answering capabilities. It builds upon research in knowledge graphs, semantic web technologies, and large language models.

## License

This project is licensed under the MIT License; see the license file for details.
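The multi-hop semantic path finding described in the README can be illustrated with a tiny, hand-written stand-in for `data/enterprise_ontology.json` and a breadth-first search. The field names in this fragment are illustrative assumptions, not the file's actual schema:

```python
import json
from collections import deque

# A toy ontology fragment in the spirit of data/enterprise_ontology.json
# (illustrative field names; the real schema may differ).
ontology = json.loads("""
{
  "relationships": [
    {"source": "customer1", "type": "purchases", "target": "product1"},
    {"source": "customer1", "type": "provides", "target": "feedback1"},
    {"source": "product1", "type": "ownedBy", "target": "engineering"},
    {"source": "engineering", "type": "managedBy", "target": "manager1"}
  ]
}
""")

# Build an adjacency list from the relationship triples.
adj = {}
for rel in ontology["relationships"]:
    adj.setdefault(rel["source"], []).append((rel["type"], rel["target"]))

def find_path(start, goal):
    """Breadth-first search for a shortest semantic path between entities."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel_type, nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel_type, nxt)]))
    return None

print(find_path("customer1", "manager1"))
# [('customer1', 'purchases', 'product1'), ('product1', 'ownedBy', 'engineering'),
#  ('engineering', 'managedBy', 'manager1')]
```

A path like this is what lets the system answer "who is responsible for the product Acme Corp bought?" by chaining `purchases` → `ownedBy` → `managedBy` rather than relying on textual similarity alone.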
app.py
ADDED
@@ -0,0 +1,683 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
import streamlit as st
|
2 |
+
st.set_page_config(page_title="Ontology RAG Demo", layout="wide")
|
3 |
+
|
4 |
+
import os
|
5 |
+
from src.semantic_retriever import SemanticRetriever
|
6 |
+
from src.ontology_manager import OntologyManager
|
7 |
+
from src.knowledge_graph import KnowledgeGraph
|
8 |
+
from src.visualization import (display_ontology_stats, display_entity_details,
|
9 |
+
display_graph_visualization, visualize_path,
|
10 |
+
display_reasoning_trace, render_html_in_streamlit)
|
11 |
+
import networkx as nx
|
12 |
+
from openai import OpenAI
|
13 |
+
import json
|
14 |
+
|
15 |
+
# Setup
|
16 |
+
llm = OpenAI(api_key=st.secrets["OPENAI_API_KEY"])
|
17 |
+
ontology_manager = OntologyManager("data/enterprise_ontology.json")
|
18 |
+
semantic_retriever = SemanticRetriever(ontology_manager=ontology_manager)
|
19 |
+
knowledge_graph = KnowledgeGraph(ontology_manager=ontology_manager)
|
20 |
+
k_val = st.sidebar.slider("Top K Results", 1, 10, 3)
|
21 |
+
|
22 |
+
def main():
|
23 |
+
# Page Navigation
|
24 |
+
st.sidebar.title("Page Navigation")
|
25 |
+
page = st.sidebar.selectbox(
|
26 |
+
"Select function",
|
27 |
+
["RAG comparison demonstration", "Knowledge graph visualization", "Ontology structure analysis", "Entity exploration", "Semantic path visualization", "Inference tracking", "Detailed comparative analysis"]
|
28 |
+
)
|
29 |
+
|
30 |
+
if page == "RAG Comparison Demo":
|
31 |
+
run_rag_demo()
|
32 |
+
elif page == "Knowledge Graph Visualization":
|
33 |
+
run_knowledge_graph_visualization()
|
34 |
+
elif page == "Ontology Structure Analysis":
|
35 |
+
run_ontology_structure_analysis()
|
36 |
+
elif page == "Entity Exploration":
|
37 |
+
run_entity_exploration()
|
38 |
+
elif page == "Semantic Path Visualization":
|
39 |
+
run_semantic_path_visualization()
|
40 |
+
elif page == "Inference Tracking":
|
41 |
+
run_reasoning_trace()
|
42 |
+
elif page == "Detailed comparative analysis":
|
43 |
+
run_detailed_comparison()
|
44 |
+
|
45 |
+
def run_rag_demo():
|
46 |
+
st.title("Ontology Enhanced RAG Demonstration")
|
47 |
+
|
48 |
+
query = st.text_input(
|
49 |
+
"Enter a question to compare RAG methods:",
|
50 |
+
"How does customer feedback influence product development?"
|
51 |
+
)
|
52 |
+
|
53 |
+
if query:
|
54 |
+
col1, col2 = st.columns(2)
|
55 |
+
|
56 |
+
with st.spinner("Run two RAG methods..."):
|
57 |
+
# Traditional RAG
|
58 |
+
with col1:
|
59 |
+
st.subheader("Traditional RAG")
|
60 |
+
vector_docs = semantic_retriever.vector_store.similarity_search(query, k=k_val)
|
61 |
+
vector_context = "\n\n".join([doc.page_content for doc in vector_docs])
|
62 |
+
vector_messages = [
|
63 |
+
{"role": "system", "content": f"You are an enterprise knowledge assistant...\nContext:\n{vector_context}"},
|
64 |
+
{"role": "user", "content": query}
|
65 |
+
]
|
66 |
+
vector_response = llm.chat.completions.create(
|
67 |
+
model="gpt-3.5-turbo",
|
68 |
+
messages=vector_messages
|
69 |
+
)
|
70 |
+
vector_answer = vector_response.choices[0].message.content
|
71 |
+
|
72 |
+
st.markdown("#### answer")
|
73 |
+
st.write(vector_answer)
|
74 |
+
|
75 |
+
st.markdown("#### retrieval context")
|
76 |
+
for i, doc in enumerate(vector_docs):
|
77 |
+
with st.expander(f"Source {i+1}"):
|
78 |
+
st.code(doc.page_content)
|
79 |
+
|
80 |
+
# # Ontology RAG
|
81 |
+
with col2:
|
82 |
+
st.subheader("Ontology RAG")
|
83 |
+
result = semantic_retriever.retrieve_with_paths(query, k=k_val)
|
84 |
+
retrieved_docs = result["documents"]
|
85 |
+
enhanced_context = "\n\n".join([doc.page_content for doc in retrieved_docs])
|
86 |
+
enhanced_messages = [
|
87 |
+
{"role": "system", "content": f"You are an enterprise knowledge assistant with ontology access rights...\nContext:\n{enhanced_context}"},
|
88 |
+
{"role": "user", "content": query}
|
89 |
+
]
|
90 |
+
enhanced_response = llm.chat.completions.create(
|
91 |
+
model="gpt-3.5-turbo",
|
92 |
+
messages=enhanced_messages
|
93 |
+
)
|
94 |
+
enhanced_answer = enhanced_response.choices[0].message.content
|
95 |
+
|
96 |
+
st.markdown("#### answer")
|
97 |
+
st.write(enhanced_answer)
|
98 |
+
|
99 |
+
st.markdown("#### Search context")
|
100 |
+
for i, doc in enumerate(retrieved_docs):
|
101 |
+
source = doc.metadata.get("source", "unknown")
|
102 |
+
label = {
|
103 |
+
"ontology": "Ontology context",
|
104 |
+
"text": "Text context",
|
105 |
+
"ontology_context": "Semantic context",
|
106 |
+
"semantic_path": "Relationship path"
|
107 |
+
}.get(source, f"source")
|
108 |
+
with st.expander(f"{label} {i+1}"):
|
109 |
+
st.markdown(doc.page_content)
|
110 |
+
|
111 |
+
# Store for reasoning trace visualization
|
112 |
+
st.session_state.query = query
|
113 |
+
st.session_state.retrieved_docs = retrieved_docs
|
114 |
+
st.session_state.answer = enhanced_answer
|
115 |
+
|
116 |
+
# Difference Analysis
|
117 |
+
st.markdown("---")
|
118 |
+
st.subheader("Difference Analysis")
|
119 |
+
|
120 |
+
st.markdown("""
|
121 |
+
The above comparison demonstrates several key advantages of ontology-enhanced RAG:
|
122 |
+
|
123 |
+
1. **Structure-aware**: Ontology-augmented methods understand the relationships between entities, not just their textual similarities.
|
124 |
+
|
125 |
+
2. **Multi-hop reasoning**: By using the knowledge graph structure, the enhancement method can connect information across multiple relational jumps.
|
126 |
+
|
127 |
+
3. **Context enrichment**: Ontologies provide additional context about entity types, attributes, and relationships that are not explicit in the text.
|
128 |
+
|
129 |
+
4. Reasoning ability: Structured knowledge allows for logical reasoning that vector similarity alone cannot achieve.
|
130 |
+
|
131 |
+
Try more complex queries that require understanding of relationships to see the differences more clearly!
|
132 |
+
""")
|
133 |
+
|
134 |
+
def run_knowledge_graph_visualization():
|
135 |
+
st.title("Knowledge Graph Visualization")
|
136 |
+
|
137 |
+
# Check if there is a center entity selected
|
138 |
+
central_entity = st.session_state.get('central_entity', None)
|
139 |
+
|
140 |
+
# Check if there is a center entity selected
|
141 |
+
display_graph_visualization(knowledge_graph, central_entity=central_entity, max_distance=2)
|
142 |
+
|
143 |
+
# Get and display graphical statistics
|
144 |
+
graph_stats = knowledge_graph.get_graph_statistics()
|
145 |
+
if graph_stats:
|
146 |
+
st.subheader("Graphical Statistics")
|
147 |
+
|
148 |
+
col1, col2, col3, col4 = st.columns(4)
|
149 |
+
col1.metric("Total number of nodes", graph_stats.get("node_count", 0))
|
150 |
+
col2.metric("Total number of edges", graph_stats.get("edge_count", 0))
|
151 |
+
col3.metric("total number of classes", graph_stats.get("class_count", 0))
|
152 |
+
col4.metric("Total number of instances", graph_stats.get("instance_count", 0))
|
153 |
+
|
154 |
+
# Display the central node
|
155 |
+
if "central_nodes" in graph_stats and graph_stats["central_nodes"]:
|
156 |
+
st.subheader("Central Nodes (by Betweenness Centrality)")
|
157 |
+
central_nodes = graph_stats["central_nodes"]["betweenness"]
|
158 |
+
nodes_df = []
|
159 |
+
for node_info in central_nodes:
|
160 |
+
node_id = node_info["node"]
|
161 |
+
node_data = knowledge_graph.graph.nodes.get(node_id, {})
|
162 |
+
node_type = node_data.get("type", "unknown")
|
163 |
+
if node_type == "instance":
|
164 |
+
node_class = node_data.get("class_type", "unknown")
|
165 |
+
properties = node_data.get("properties", {})
|
166 |
+
name = properties.get("name", node_id)
|
167 |
+
nodes_df.append({
|
168 |
+
"ID": node_id,
|
169 |
+
"Name": name,
|
170 |
+
"type": node_class,
|
171 |
+
"Centrality": node_info["centrality"]
|
172 |
+
})
|
173 |
+
|
174 |
+
st.table(nodes_df)
|
175 |
+
|
176 |
+
def run_ontology_structure_analysis():
|
177 |
+
st.title("Ontology Structure Analysis")
|
178 |
+
|
179 |
+
# Use the existing ontology statistics display function
|
180 |
+
display_ontology_stats(ontology_manager)
|
181 |
+
|
182 |
+
# Add additional class hierarchy visualization
|
183 |
+
st.subheader("class hierarchy")
|
184 |
+
|
185 |
+
# Get class hierarchy data
|
186 |
+
class_hierarchy = ontology_manager.get_class_hierarchy()
|
187 |
+
|
188 |
+
# Create a NetworkX graph to represent the class hierarchy
|
189 |
+
G = nx.DiGraph()
|
190 |
+
|
191 |
+
# Add nodes and edges
|
192 |
+
for parent, children in class_hierarchy.items():
|
193 |
+
if not G.has_node(parent):
|
194 |
+
G.add_node(parent)
|
195 |
+
for child in children:
|
196 |
+
G.add_node(child)
|
197 |
+
G.add_edge(parent, child)
|
198 |
+
|
199 |
+
# Check if there are enough nodes to create the visualization
|
200 |
+
if len(G.nodes) > 1:
|
201 |
+
# Generate HTML visualization using knowledge graph class
|
202 |
+
kg = KnowledgeGraph(ontology_manager)
|
203 |
+
html = kg.generate_html_visualization(
|
204 |
+
include_classes=True,
|
205 |
+
include_instances=False,
|
206 |
+
max_distance=5,
|
207 |
+
layout_algorithm="hierarchical"
|
208 |
+
)
|
209 |
+
|
210 |
+
# Rendering HTML
|
211 |
+
render_html_in_streamlit(html)
|
212 |
+
|
213 |
+
def run_entity_exploration():
|
214 |
+
st.title("Entity Exploration")
|
215 |
+
|
216 |
+
# Get all entities
|
217 |
+
entities = []
|
218 |
+
for class_name in ontology_manager.get_classes():
|
219 |
+
entities.extend(ontology_manager.get_instances_of_class(class_name))
|
220 |
+
|
221 |
+
# Remove duplicates and sort
|
222 |
+
entities = sorted(set(entities))
|
223 |
+
|
224 |
+
# Create a drop-down selection box
|
225 |
+
selected_entity = st.selectbox("Select entity", entities)
|
226 |
+
|
227 |
+
if selected_entity:
|
228 |
+
# Get entity information
|
229 |
+
entity_info = ontology_manager.get_entity_info(selected_entity)
|
230 |
+
|
231 |
+
# Display detailed information
|
232 |
+
display_entity_details(entity_info, ontology_manager)
|
233 |
+
|
234 |
+
# Set this entity as the central entity (for knowledge graph visualization)
|
235 |
+
if st.button("View this entity in the knowledge graph"):
|
236 |
+
st.session_state.central_entity = selected_entity
|
237 |
+
st.rerun()
|
238 |
+
|
239 |
+
# Get and display entity neighbors
|
240 |
+
st.subheader("Entity Neighborhood")
|
241 |
+
max_distance = st.slider("Maximum neighborhood distance", 1, 3, 1)
|
242 |
+
|
243 |
+
neighborhood = knowledge_graph.get_entity_neighborhood(
|
244 |
+
selected_entity,
|
245 |
+
max_distance=max_distance,
|
246 |
+
include_classes=True
|
247 |
+
)
|
248 |
+
|
249 |
+
if neighborhood and "neighbors" in neighborhood:
|
250 |
+
# Display neighbors grouped by distance
|
251 |
+
for distance in range(1, max_distance+1):
|
252 |
+
neighbors_at_distance = [n for n in neighborhood["neighbors"] if n["distance"] == distance]
|
253 |
+
|
254 |
+
if neighbors_at_distance:
|
255 |
+
with st.expander(f"Neighbors at distance {distance} ({len(neighbors_at_distance)})"):
|
256 |
+
for neighbor in neighbors_at_distance:
|
257 |
+
st.markdown(f"**{neighbor['id']}** ({neighbor.get('class_type', 'unknown')})")
|
258 |
+
|
259 |
+
# Display relations
|
260 |
+
for relation in neighbor.get("relations", []):
|
261 |
+
direction = "→" if relation["direction"] == "outgoing" else "←"
|
262 |
+
st.markdown(f"- {direction} {relation['type']}")
|
263 |
+
|
264 |
+
st.markdown("---")
|
265 |
+
|
266 |
+
def run_semantic_path_visualization():
    st.title("Semantic Path Visualization")

    # Get all entities
    entities = []
    for class_name in ontology_manager.get_classes():
        entities.extend(ontology_manager.get_instances_of_class(class_name))

    # Remove duplicates and sort
    entities = sorted(set(entities))

    # Create two columns for selecting source and target entities
    col1, col2 = st.columns(2)

    with col1:
        source_entity = st.selectbox("Select source entity", entities, key="source")

    with col2:
        target_entity = st.selectbox("Select target entity", entities, key="target")

    if source_entity and target_entity and source_entity != target_entity:
        # Provide a maximum path length option
        max_length = st.slider("Maximum path length", 1, 5, 3)

        # Find paths between the two entities
        paths = knowledge_graph.find_paths_between_entities(
            source_entity,
            target_entity,
            max_length=max_length
        )

        if paths:
            st.success(f"Found {len(paths)} paths!")

            # Create an expander for each path
            for i, path in enumerate(paths):
                # Calculate path length and relationship types
                path_length = len(path)
                rel_types = [edge["type"] for edge in path]

                with st.expander(f"Path {i+1} (length: {path_length}, relations: {', '.join(rel_types)})", expanded=(i == 0)):
                    # Build a text description of the path
                    path_text = []
                    entities_in_path = []

                    for edge in path:
                        source = edge["source"]
                        target = edge["target"]
                        relation = edge["type"]

                        entities_in_path.append(source)
                        entities_in_path.append(target)

                        # Look up entity information to get human-readable names
                        source_info = ontology_manager.get_entity_info(source)
                        target_info = ontology_manager.get_entity_info(target)

                        source_name = source
                        if "properties" in source_info and "name" in source_info["properties"]:
                            source_name = source_info["properties"]["name"]

                        target_name = target
                        if "properties" in target_info and "name" in target_info["properties"]:
                            target_name = target_info["properties"]["name"]

                        path_text.append(f"{source_name} ({source}) **{relation}** {target_name} ({target})")

                    # Display the path description
                    st.markdown(" → ".join(path_text))

                    # Prepare the path visualization
                    path_info = {
                        "source": source_entity,
                        "target": target_entity,
                        "path": path,
                        "text": " → ".join(path_text)
                    }

                    # Display the path visualization
                    visualize_path(path_info, ontology_manager)
        else:
            st.warning(f"No path of length {max_length} or shorter was found between these entities.")

def run_reasoning_trace():
    st.title("Inference Trace Visualization")

    if not st.session_state.get("query") or not st.session_state.get("retrieved_docs") or not st.session_state.get("answer"):
        st.warning("Please run a query on the RAG comparison page first to generate inference trace data.")
        return

    # Get data from session state
    query = st.session_state.query
    retrieved_docs = st.session_state.retrieved_docs
    answer = st.session_state.answer

    # Show the inference trace
    display_reasoning_trace(query, retrieved_docs, answer, ontology_manager)

def run_detailed_comparison():
    st.title("Detailed Comparison of RAG Methods")

    # Add comparison query options
    comparison_queries = [
        "How does customer feedback influence product development?",
        "Which employees work in the Engineering department?",
        "What are the product life cycle stages?",
        "How do managers monitor employee performance?",
        "What are the responsibilities of the marketing department?"
    ]

    selected_query = st.selectbox(
        "Select a comparison query",
        comparison_queries,
        index=0
    )

    custom_query = st.text_input("Or enter a custom query:", "")

    if custom_query:
        query = custom_query
    else:
        query = selected_query

    if st.button("Compare RAG methods"):
        with st.spinner("Running detailed comparison..."):
            # Start timing
            import time
            start_time = time.time()

            # Run traditional RAG
            vector_docs = semantic_retriever.vector_store.similarity_search(query, k=k_val)
            vector_context = "\n\n".join([doc.page_content for doc in vector_docs])
            vector_messages = [
                {"role": "system", "content": f"You are an enterprise knowledge assistant...\nContext:\n{vector_context}"},
                {"role": "user", "content": query}
            ]
            vector_response = llm.chat.completions.create(
                model="gpt-3.5-turbo",
                messages=vector_messages
            )
            vector_answer = vector_response.choices[0].message.content
            vector_time = time.time() - start_time

            # Reset the timer
            start_time = time.time()

            # Run the ontology-enhanced RAG
            result = semantic_retriever.retrieve_with_paths(query, k=k_val)
            retrieved_docs = result["documents"]
            enhanced_context = "\n\n".join([doc.page_content for doc in retrieved_docs])
            enhanced_messages = [
                {"role": "system", "content": f"You are an enterprise knowledge assistant with ontology access...\nContext:\n{enhanced_context}"},
                {"role": "user", "content": query}
            ]
            enhanced_response = llm.chat.completions.create(
                model="gpt-3.5-turbo",
                messages=enhanced_messages
            )
            enhanced_answer = enhanced_response.choices[0].message.content
            enhanced_time = time.time() - start_time

            # Save the results for visualization
            st.session_state.query = query
            st.session_state.retrieved_docs = retrieved_docs
            st.session_state.answer = enhanced_answer

            # Display the comparison results
            st.subheader("Comparison Results")

            # Use tabs to show different aspects of the comparison
            tab1, tab2, tab3, tab4 = st.tabs(["Answer Comparison", "Performance Metrics", "Retrieval Source Comparison", "Context Quality"])

            with tab1:
                col1, col2 = st.columns(2)

                with col1:
                    st.markdown("#### Traditional RAG Answer")
                    st.write(vector_answer)

                with col2:
                    st.markdown("#### Ontology-Enhanced RAG Answer")
                    st.write(enhanced_answer)

            with tab2:
                # Performance metrics
                col1, col2 = st.columns(2)

                with col1:
                    st.metric("Traditional RAG response time", f"{vector_time:.2f}s")

                    # Calculate text-related metrics
                    vector_tokens = len(vector_context.split())
                    st.metric("Retrieved context tokens", vector_tokens)

                    st.metric("Retrieved documents", len(vector_docs))

                with col2:
                    st.metric("Ontology-enhanced RAG response time", f"{enhanced_time:.2f}s")

                    # Calculate text-related metrics
                    enhanced_tokens = len(enhanced_context.split())
                    st.metric("Retrieved context tokens", enhanced_tokens)

                    st.metric("Retrieved documents", len(retrieved_docs))

                # Add a chart
                import pandas as pd
                import plotly.express as px

                # Performance comparison chart
                performance_data = {
                    "Metric": ["Response time (seconds)", "Context tokens", "Retrieved documents"],
                    "Traditional RAG": [vector_time, vector_tokens, len(vector_docs)],
                    "Ontology-Enhanced RAG": [enhanced_time, enhanced_tokens, len(retrieved_docs)]
                }

                df = pd.DataFrame(performance_data)

                # Plotly bar chart
                fig = px.bar(
                    df,
                    x="Metric",
                    y=["Traditional RAG", "Ontology-Enhanced RAG"],
                    barmode="group",
                    title="Performance Metric Comparison",
                    labels={"value": "Value", "variable": "RAG method"}
                )

                st.plotly_chart(fig)

            with tab3:
                # Retrieval source comparison
                traditional_sources = ["Traditional vector retrieval"] * len(vector_docs)

                enhanced_sources = []
                for doc in retrieved_docs:
                    source = doc.metadata.get("source", "unknown")
                    label = {
                        "ontology": "Ontology context",
                        "text": "Text context",
                        "ontology_context": "Semantic context",
                        "semantic_path": "Relationship path"
                    }.get(source, "Unknown source")
                    enhanced_sources.append(label)

                # Create a source distribution chart
                source_counts = {}
                for source in enhanced_sources:
                    if source in source_counts:
                        source_counts[source] += 1
                    else:
                        source_counts[source] = 1

                source_df = pd.DataFrame({
                    "Source type": list(source_counts.keys()),
                    "Documents": list(source_counts.values())
                })

                fig = px.pie(
                    source_df,
                    values="Documents",
                    names="Source type",
                    title="Ontology-Enhanced RAG Retrieval Source Distribution"
                )

                st.plotly_chart(fig)

                # Show the relationship between sources and the answer
                st.subheader("Relationship Between Sources and the Answer")
                st.markdown("""
                Ontology-enhanced methods leverage multiple sources of knowledge to construct more comprehensive answers. The figure above shows the distribution of the different sources.

                In particular, semantic context and relation paths provide knowledge that traditional vector retrieval cannot capture, enabling the system to connect concepts and perform multi-hop reasoning.
                """)

            with tab4:
                # Context quality assessment
                st.subheader("Context Quality Assessment")

                # Create an evaluation function (simplified version)
                def evaluate_context(docs):
                    metrics = {
                        "Direct Relevance": 0,
                        "Semantic Richness": 0,
                        "Structure Information": 0,
                        "Relationship Information": 0
                    }

                    for doc in docs:
                        content = doc.page_content if hasattr(doc, "page_content") else ""

                        # Direct relevance - based on keyword overlap with the query
                        if any(kw in content.lower() for kw in query.lower().split()):
                            metrics["Direct Relevance"] += 1

                        # Semantic richness - based on text length
                        metrics["Semantic Richness"] += min(1, len(content.split()) / 50)

                        # Structure information - from ontology sources
                        if hasattr(doc, "metadata") and doc.metadata.get("source") in ["ontology", "ontology_context"]:
                            metrics["Structure Information"] += 1

                        # Relationship information - from semantic paths
                        if hasattr(doc, "metadata") and doc.metadata.get("source") == "semantic_path":
                            metrics["Relationship Information"] += 1

                    # Normalize: cap each metric at 10
                    for key in metrics:
                        metrics[key] = min(10, metrics[key])

                    return metrics

                # Evaluate both methods
                vector_metrics = evaluate_context(vector_docs)
                enhanced_metrics = evaluate_context(retrieved_docs)

                # Create a comparative radar chart
                metrics_df = pd.DataFrame({
                    "Metric": list(vector_metrics.keys()),
                    "Traditional RAG": list(vector_metrics.values()),
                    "Ontology-Enhanced RAG": list(enhanced_metrics.values())
                })

                # Convert the data to long form for the Plotly radar chart
                melted_df = metrics_df.melt(id_vars="Metric", var_name="RAG method", value_name="Score")
                fig = px.line_polar(
                    melted_df,
                    r="Score",
                    theta="Metric",
                    color="RAG method",
                    line_close=True,
                    range_r=[0, 10],
                    title="Context Quality Comparison"
                )

                st.plotly_chart(fig)

                st.markdown("""
                The figure above compares the two RAG methods on context quality. Ontology-enhanced RAG performs better along several dimensions:

                1. **Direct relevance**: how closely the retrieved content matches the query
                2. **Semantic richness**: the information density of the retrieved context
                3. **Structure information**: structured knowledge of entity types, attributes, and relationships
                4. **Relationship information**: explicit relationships and connection paths between entities

                The advantage of ontology-enhanced RAG is that it retrieves structured knowledge and relational information, both of which are missing from traditional RAG methods.
                """)

    # Display the detailed analysis section
    st.subheader("Method Analysis")

    with st.expander("Strengths and weaknesses", expanded=True):
        col1, col2 = st.columns(2)

        with col1:
            st.markdown("#### Traditional RAG")
            st.markdown("""
            **Strengths**:
            - Simple to implement with low computational overhead
            - Works well with unstructured text
            - Usually faster response times

            **Weaknesses**:
            - Cannot capture relationships between entities
            - Lacks structured knowledge context
            - Difficult to perform multi-hop reasoning
            - Retrieval is based mainly on text similarity
            """)

        with col2:
            st.markdown("#### Ontology-Enhanced RAG")
            st.markdown("""
            **Strengths**:
            - Understands relationships and connections between entities
            - Provides rich structured knowledge context
            - Supports multi-hop reasoning and path discovery
            - Combines vector similarity with semantic relationships

            **Weaknesses**:
            - Higher implementation complexity
            - Requires maintaining the ontology model
            - Relatively high computational overhead
            - Retrieval and inference may take longer
            """)

    # Add usage scenario suggestions
    with st.expander("Applicable scenarios"):
        st.markdown("""
        ### Scenarios suited to traditional RAG

        - Simple fact lookups
        - Unstructured document retrieval
        - Applications with strict response time requirements
        - When the document content is clear and direct

        ### Scenarios suited to ontology-enhanced RAG

        - Complex knowledge-association queries
        - Problems that require understanding relationships between entities
        - Applications that require cross-domain reasoning
        - Enterprise knowledge management systems
        - Reasoning scenarios that demand high accuracy and consistency
        - Applications that require discovering implicit knowledge
        """)

    # Add real-world application examples
    with st.expander("Real-world applications"):
        st.markdown("""
        ### Enterprise knowledge management
        Ontology-enhanced RAG systems help enterprises organize and access their knowledge assets, connecting information across departments and systems to provide more comprehensive business insights.

        ### Product development decision support
        By understanding the relationships between customer feedback, product features, and market data, the system can provide more valuable support for product development decisions.

        ### Complex compliance queries
        For compliance problems that involve multiple rules and relationships, ontology-enhanced RAG can provide rule-based reasoning, ensuring that recommendations comply with all applicable policies and regulations.

        ### Diagnostics and troubleshooting
        In technical support and troubleshooting scenarios, the system can connect symptoms, causes, and solutions, providing more accurate diagnoses through multi-hop reasoning.
        """)
data/enterprise_ontology.json
ADDED
@@ -0,0 +1,771 @@
{
  "rules": [
    {
      "id": "rule9",
      "description": "Critical support tickets must be assigned to Senior employees or managers",
      "constraint": "FORALL ?t WHERE type(?t, SupportTicket) AND property(?t, priority, 'Critical') AND relationship(?t, assignedTo, ?e) MUST type(?e, Manager) OR (type(?e, Employee) AND property(?e, experienceLevel, 'Senior'))"
    },
    {
      "id": "rule10",
      "description": "Project end date must be after its start date",
      "constraint": "FORALL ?p WHERE type(?p, Project) AND property(?p, startDate, ?start) AND property(?p, endDate, ?end) MUST date(?end) > date(?start)"
    }
  ],
  "classes": {
    "FinancialEntity": {
      "description": "An entity related to financial matters",
      "subClassOf": "Entity",
      "properties": ["amount", "currency", "fiscalYear", "quarter", "transactionDate"]
    },

    "Budget": {
      "description": "A financial plan for a specified period",
      "subClassOf": "FinancialEntity",
      "properties": ["budgetId", "period", "departmentId", "plannedAmount", "actualAmount", "variance"]
    },

    "Revenue": {
      "description": "Income generated from business activities",
      "subClassOf": "FinancialEntity",
      "properties": ["revenueId", "source", "productId", "recurring", "oneTime", "revenueType"]
    },

    "Expense": {
      "description": "Cost incurred in business operations",
      "subClassOf": "FinancialEntity",
      "properties": ["expenseId", "category", "department", "approvedBy", "paymentStatus", "receiptUrl"]
    },

    "Asset": {
      "description": "A resource with economic value",
      "subClassOf": "Entity",
      "properties": ["assetId", "acquisitionDate", "value", "depreciationSchedule", "currentValue", "location"]
    },

    "PhysicalAsset": {
      "description": "A tangible asset with physical presence",
      "subClassOf": "Asset",
      "properties": ["serialNumber", "manufacturer", "model", "maintenanceSchedule", "condition"]
    },

    "DigitalAsset": {
      "description": "An intangible digital asset",
      "subClassOf": "Asset",
      "properties": ["fileType", "storageLocation", "accessControl", "backupStatus", "version"]
    },

    "IntellectualProperty": {
      "description": "Legal rights resulting from intellectual activity",
      "subClassOf": "Asset",
      "properties": ["ipType", "filingDate", "grantDate", "jurisdiction", "inventors", "expirationDate"]
    },

    "Location": {
      "description": "A physical or virtual place",
      "subClassOf": "Entity",
      "properties": ["locationId", "address", "city", "state", "country", "postalCode", "geoCoordinates"]
    },

    "Facility": {
      "description": "A physical building or site owned or operated by the organization",
      "subClassOf": "Location",
      "properties": ["facilityType", "squareFootage", "capacity", "operatingHours", "amenities", "securityLevel"]
    },

    "VirtualLocation": {
      "description": "A digital space or environment",
      "subClassOf": "Location",
      "properties": ["url", "accessMethod", "hostingProvider", "virtualEnvironmentType", "availabilityStatus"]
    },

    "Market": {
      "description": "A geographic or demographic target for products and services",
      "subClassOf": "Entity",
      "properties": ["marketId", "name", "geography", "demographics", "size", "growth", "competitiveIntensity"]
    },

    "GeographicMarket": {
      "description": "A market defined by geographic boundaries",
      "subClassOf": "Market",
      "properties": ["region", "countries", "languages", "regulations", "culturalFactors"]
    },

    "DemographicMarket": {
      "description": "A market defined by demographic characteristics",
      "subClassOf": "Market",
      "properties": ["ageRange", "income", "education", "occupation", "familyStatus", "interests"]
    },

    "BusinessMarket": {
      "description": "A market consisting of business customers",
      "subClassOf": "Market",
      "properties": ["industryFocus", "companySize", "businessModel", "decisionMakers", "purchasingCriteria"]
    },

    "Campaign": {
      "description": "A coordinated series of marketing activities",
      "subClassOf": "Entity",
      "properties": ["campaignId", "name", "objective", "startDate", "endDate", "budget", "targetAudience", "channels"]
    },

    "DigitalCampaign": {
      "description": "A marketing campaign conducted through digital channels",
      "subClassOf": "Campaign",
      "properties": ["platforms", "contentTypes", "keywords", "tracking", "analytics", "automationWorkflows"]
    },

    "TraditionalCampaign": {
      "description": "A marketing campaign conducted through traditional media",
      "subClassOf": "Campaign",
      "properties": ["mediaTypes", "adSizes", "placementSchedule", "production", "distributionMethod"]
    },

    "IntegratedCampaign": {
      "description": "A campaign that spans multiple marketing channels",
      "subClassOf": "Campaign",
      "properties": ["channelMix", "messageConsistency", "crossChannelMetrics", "customerJourneyMap"]
    },

    "Process": {
      "description": "A defined set of activities to accomplish a specific objective",
      "subClassOf": "Entity",
      "properties": ["processId", "name", "purpose", "owner", "inputs", "outputs", "steps", "metrics"]
    },

    "BusinessProcess": {
      "description": "A process for conducting business operations",
      "subClassOf": "Process",
      "properties": ["businessFunction", "criticality", "maturityLevel", "automationLevel", "regulatoryRequirements"]
    },

    "DevelopmentProcess": {
      "description": "A process for developing products or services",
      "subClassOf": "Process",
      "properties": ["methodology", "phases", "deliverables", "qualityGates", "tools", "repositories"]
    },

    "SupportProcess": {
      "description": "A process for supporting customers or internal users",
      "subClassOf": "Process",
      "properties": ["serviceLevel", "escalationPath", "knowledgeBase", "ticketingSystem", "supportHours"]
    },

    "Skill": {
      "description": "A learned capacity to perform a task",
      "subClassOf": "Entity",
      "properties": ["skillId", "name", "category", "proficiencyLevels", "certifications", "learningResources"]
    },

    "TechnicalSkill": {
      "description": "A skill related to technology or technical processes",
      "subClassOf": "Skill",
      "properties": ["techCategory", "tools", "languages", "frameworks", "platforms", "compatibility"]
    },

    "SoftSkill": {
      "description": "An interpersonal or non-technical skill",
      "subClassOf": "Skill",
      "properties": ["interpersonalArea", "communicationAspects", "leadershipComponents", "adaptabilityMetrics"]
    },

    "DomainSkill": {
      "description": "Knowledge and expertise in a specific business domain",
      "subClassOf": "Skill",
      "properties": ["domain", "industrySpecific", "regulations", "bestPractices", "domainTerminology"]
    },

    "Objective": {
      "description": "A goal or target to be achieved",
      "subClassOf": "Entity",
      "properties": ["objectiveId", "name", "description", "targetDate", "status", "priority", "owner", "metrics"]
    },

    "StrategicObjective": {
      "description": "A high-level, long-term goal",
      "subClassOf": "Objective",
      "properties": ["strategyAlignment", "timeframe", "impactAreas", "successIndicators", "boardApproval"]
    },

    "TacticalObjective": {
      "description": "A medium-term goal supporting strategic objectives",
      "subClassOf": "Objective",
      "properties": ["parentObjective", "implementationPlan", "resourceRequirements", "dependencies", "milestones"]
    },

    "OperationalObjective": {
      "description": "A short-term, specific goal supporting tactical objectives",
      "subClassOf": "Objective",
      "properties": ["parentTacticalObjective", "assignedTeam", "dailyActivities", "progressTracking", "completionCriteria"]
    },

    "KPI": {
      "description": "Key Performance Indicator for measuring success",
      "subClassOf": "Entity",
      "properties": ["kpiId", "name", "description", "category", "unit", "formula", "target", "actual", "frequency"]
    },

    "FinancialKPI": {
      "description": "KPI measuring financial performance",
      "subClassOf": "KPI",
      "properties": ["financialCategory", "accountingStandard", "auditRequirement", "forecastAccuracy"]
    },

    "CustomerKPI": {
      "description": "KPI measuring customer-related performance",
      "subClassOf": "KPI",
      "properties": ["customerSegment", "touchpoint", "journeyStage", "sentimentConnection", "loyaltyImpact"]
    },

    "OperationalKPI": {
      "description": "KPI measuring operational efficiency",
      "subClassOf": "KPI",
      "properties": ["processArea", "qualityDimension", "productivityFactor", "resourceUtilization"]
    },

    "Risk": {
      "description": "A potential event that could negatively impact objectives",
      "subClassOf": "Entity",
      "properties": ["riskId", "name", "description", "category", "probability", "impact", "status", "mitigationPlan"]
    },

    "FinancialRisk": {
      "description": "Risk related to financial matters",
      "subClassOf": "Risk",
      "properties": ["financialExposure", "currencyFactors", "marketConditions", "hedgingStrategy", "insuranceCoverage"]
    },

    "OperationalRisk": {
      "description": "Risk related to business operations",
      "subClassOf": "Risk",
      "properties": ["operationalArea", "processVulnerabilities", "systemDependencies", "staffingFactors", "recoveryPlan"]
    },

    "ComplianceRisk": {
      "description": "Risk related to regulatory compliance",
      "subClassOf": "Risk",
      "properties": ["regulations", "jurisdictions", "reportingRequirements", "penaltyExposure", "complianceStatus"]
    },

    "Decision": {
      "description": "A choice made between alternatives",
      "subClassOf": "Entity",
      "properties": ["decisionId", "name", "description", "date", "decisionMaker", "alternatives", "selectedOption", "rationale"]
    },

    "StrategicDecision": {
      "description": "A decision affecting long-term direction",
      "subClassOf": "Decision",
      "properties": ["strategicImplications", "marketPosition", "competitiveAdvantage", "boardApproval", "communicationPlan"]
    },

    "TacticalDecision": {
      "description": "A decision affecting medium-term operations",
      "subClassOf": "Decision",
      "properties": ["operationalImpact", "resourceAllocation", "implementationTimeline", "departmentalScope"]
    },

    "OperationalDecision": {
      "description": "A day-to-day decision in business operations",
      "subClassOf": "Decision",
      "properties": ["decisionFrequency", "standardProcedure", "delegationLevel", "auditTrail"]
    },

    "Technology": {
      "description": "A technical capability or system",
      "subClassOf": "Entity",
      "properties": ["technologyId", "name", "category", "version", "vendor", "maturityLevel", "supportStatus"]
    },

    "Hardware": {
      "description": "Physical technological equipment",
      "subClassOf": "Technology",
      "properties": ["specifications", "formFactor", "powerRequirements", "connectivity", "lifecycle", "replacementSchedule"]
    },

    "Software": {
      "description": "Computer programs and applications",
      "subClassOf": "Technology",
      "properties": ["programmingLanguage", "operatingSystem", "architecture", "apiDocumentation", "licensingModel", "updateFrequency"]
    },

    "Infrastructure": {
      "description": "Foundational technology systems",
      "subClassOf": "Technology",
      "properties": ["deploymentModel", "scalability", "redundancy", "securityFeatures", "complianceCertifications", "capacityMetrics"]
    },

    "SecurityEntity": {
      "description": "An entity related to security measures",
      "subClassOf": "Entity",
      "properties": ["securityId", "name", "type", "implementationDate", "lastReview", "responsibleParty", "status"]
    },

    "SecurityControl": {
      "description": "A measure to mitigate security risks",
      "subClassOf": "SecurityEntity",
      "properties": ["controlCategory", "protectedAssets", "implementationLevel", "automationDegree", "verificationMethod", "exceptions"]
|
307 |
+
},
|
308 |
+
|
309 |
+
"SecurityIncident": {
|
310 |
+
"description": "An event that compromises security",
|
311 |
+
"subClassOf": "SecurityEntity",
|
312 |
+
"properties": ["incidentDate", "severity", "affectedSystems", "vector", "remediationSteps", "rootCause", "resolution"]
|
313 |
+
},
|
314 |
+
|
315 |
+
"SecurityPolicy": {
|
316 |
+
"description": "A documented security directive",
|
317 |
+
"subClassOf": "SecurityEntity",
|
318 |
+
"properties": ["policyScope", "requiredControls", "complianceRequirements", "exemptionProcess", "reviewSchedule", "enforcementMechanism"]
|
319 |
+
},
|
320 |
+
|
321 |
+
"Competency": {
|
322 |
+
"description": "A cluster of related abilities, knowledge, and skills",
|
323 |
+
"subClassOf": "Entity",
|
324 |
+
"properties": ["competencyId", "name", "category", "description", "importance", "requiredProficiency", "assessmentMethod"]
|
325 |
+
},
|
326 |
+
|
327 |
+
"ManagerialCompetency": {
|
328 |
+
"description": "Competency related to managing people and resources",
|
329 |
+
"subClassOf": "Competency",
|
330 |
+
"properties": ["leadershipAspects", "teamDevelopment", "decisionMaking", "conflictResolution", "changeManagement", "resourceOptimization"]
|
331 |
+
},
|
332 |
+
|
333 |
+
"TechnicalCompetency": {
|
334 |
+
"description": "Competency related to technical knowledge and skills",
|
335 |
+
"subClassOf": "Competency",
|
336 |
+
"properties": ["technicalDomain", "specializations", "toolProficiency", "problemSolvingApproach", "technicalLeadership", "knowledgeSharing"]
|
337 |
+
},
|
338 |
+
|
339 |
+
"BusinessCompetency": {
|
340 |
+
"description": "Competency related to business acumen and operations",
|
341 |
+
"subClassOf": "Competency",
|
342 |
+
"properties": ["businessAcumen", "industryKnowledge", "stakeholderManagement", "commercialAwareness", "strategicThinking", "resultsOrientation"]
|
343 |
+
},
|
344 |
+
|
345 |
+
"Stakeholder": {
|
346 |
+
"description": "An individual or group with interest in or influence over the organization",
|
347 |
+
"subClassOf": "Entity",
|
348 |
+
"properties": ["stakeholderId", "name", "type", "influence", "interest", "expectations", "engagementLevel", "communicationPreference"]
|
349 |
+
},
|
350 |
+
|
351 |
+
"InternalStakeholder": {
|
352 |
+
"description": "A stakeholder within the organization",
|
353 |
+
"subClassOf": "Stakeholder",
|
354 |
+
"properties": ["department", "role", "decisionAuthority", "projectInvolvement", "changeReadiness", "organizationalTenure"]
|
355 |
+
},
|
356 |
+
|
357 |
+
"ExternalStakeholder": {
|
358 |
+
"description": "A stakeholder outside the organization",
|
359 |
+
"subClassOf": "Stakeholder",
|
360 |
+
"properties": ["organization", "relationship", "contractualAgreements", "marketInfluence", "externalNetworks", "publicProfile"]
|
361 |
+
},
|
362 |
+
|
363 |
+
"RegulatoryStakeholder": {
|
364 |
+
"description": "A regulatory body or authority",
|
365 |
+
"subClassOf": "Stakeholder",
|
366 |
+
"properties": ["jurisdiction", "regulations", "enforcementPowers", "reportingRequirements", "auditFrequency", "complianceDeadlines"]
|
367 |
+
}
|
368 |
+
},
|
369 |
+
"relationships": [
|
370 |
+
{
|
371 |
+
"name": "ownedBy",
|
372 |
+
"domain": "Product",
|
373 |
+
"range": "Department",
|
374 |
+
"inverse": "owns",
|
375 |
+
"cardinality": "many-to-one",
|
376 |
+
"description": "Indicates which department owns a product"
|
377 |
+
},
|
378 |
+
{
|
379 |
+
"name": "managedBy",
|
380 |
+
"domain": "Department",
|
381 |
+
"range": "Manager",
|
382 |
+
"inverse": "manages",
|
383 |
+
"cardinality": "one-to-one",
|
384 |
+
"description": "Indicates which manager heads a department"
|
385 |
+
},
|
386 |
+
{
|
387 |
+
"name": "worksOn",
|
388 |
+
"domain": "Employee",
|
389 |
+
"range": "Product",
|
390 |
+
"inverse": "developedBy",
|
391 |
+
"cardinality": "many-to-many",
|
392 |
+
"description": "Indicates which products an employee works on"
|
393 |
+
},
|
394 |
+
{
|
395 |
+
"name": "purchases",
|
396 |
+
"domain": "Customer",
|
397 |
+
"range": "Product",
|
398 |
+
"inverse": "purchasedBy",
|
399 |
+
"cardinality": "many-to-many",
|
400 |
+
"description": "Indicates which products a customer has purchased"
|
401 |
+
},
|
402 |
+
{
|
403 |
+
"name": "provides",
|
404 |
+
"domain": "Customer",
|
405 |
+
"range": "Feedback",
|
406 |
+
"inverse": "providedBy",
|
407 |
+
"cardinality": "one-to-many",
|
408 |
+
"description": "Connects customers to their feedback submissions"
|
409 |
+
},
|
410 |
+
{
|
411 |
+
"name": "pertainsTo",
|
412 |
+
"domain": "Feedback",
|
413 |
+
"range": "Product",
|
414 |
+
"inverse": "hasFeedback",
|
415 |
+
"cardinality": "many-to-one",
|
416 |
+
"description": "Indicates which product a feedback item is about"
|
417 |
+
},
|
418 |
+
{
|
419 |
+
"name": "supports",
|
420 |
+
"domain": "Platform",
|
421 |
+
"range": "Product",
|
422 |
+
"inverse": "supportedBy",
|
423 |
+
"cardinality": "one-to-many",
|
424 |
+
"description": "Indicates which products are supported by the platform"
|
425 |
+
},
|
426 |
+
{
|
427 |
+
"name": "hasLifecycle",
|
428 |
+
"domain": "Product",
|
429 |
+
"range": "Lifecycle",
|
430 |
+
"inverse": "lifecycleOf",
|
431 |
+
"cardinality": "one-to-one",
|
432 |
+
"description": "Connects a product to its lifecycle information"
|
433 |
+
},
|
434 |
+
{
|
435 |
+
"name": "oversees",
|
436 |
+
"domain": "Manager",
|
437 |
+
"range": "Employee",
|
438 |
+
"inverse": "reportsToDirect",
|
439 |
+
"cardinality": "one-to-many",
|
440 |
+
"description": "Indicates which employees report to a manager"
|
441 |
+
},
|
442 |
+
{
|
443 |
+
"name": "optimizedBy",
|
444 |
+
"domain": "Product",
|
445 |
+
"range": "Feedback",
|
446 |
+
"inverse": "optimizes",
|
447 |
+
"cardinality": "many-to-many",
|
448 |
+
"description": "Indicates how feedback is used to optimize product development"
|
449 |
+
},
|
450 |
+
{
|
451 |
+
"name": "allocatesTo",
|
452 |
+
"domain": "Budget",
|
453 |
+
"range": "Department",
|
454 |
+
"inverse": "fundedBy",
|
455 |
+
"cardinality": "one-to-many",
|
456 |
+
"description": "Indicates which departments receive budget allocations"
|
457 |
+
},
|
458 |
+
{
|
459 |
+
"name": "generatesRevenue",
|
460 |
+
"domain": "Product",
|
461 |
+
"range": "Revenue",
|
462 |
+
"inverse": "generatedFrom",
|
463 |
+
"cardinality": "one-to-many",
|
464 |
+
"description": "Connects products to the revenue they generate"
|
465 |
+
},
|
466 |
+
{
|
467 |
+
"name": "incursExpense",
|
468 |
+
"domain": "Department",
|
469 |
+
"range": "Expense",
|
470 |
+
"inverse": "incurredBy",
|
471 |
+
"cardinality": "one-to-many",
|
472 |
+
"description": "Connects departments to their expenses"
|
473 |
+
},
|
474 |
+
{
|
475 |
+
"name": "locatedAt",
|
476 |
+
"domain": "PhysicalEntity",
|
477 |
+
"range": "Location",
|
478 |
+
"inverse": "houses",
|
479 |
+
"cardinality": "many-to-one",
|
480 |
+
"description": "Indicates where a physical entity is located"
|
481 |
+
},
|
482 |
+
{
|
483 |
+
"name": "targetedAt",
|
484 |
+
"domain": "Campaign",
|
485 |
+
"range": "Market",
|
486 |
+
"inverse": "targetedBy",
|
487 |
+
"cardinality": "many-to-many",
|
488 |
+
"description": "Indicates which markets a campaign targets"
|
489 |
+
},
|
490 |
+
{
|
491 |
+
"name": "follows",
|
492 |
+
"domain": "Project",
|
493 |
+
"range": "Process",
|
494 |
+
"inverse": "implementedBy",
|
495 |
+
"cardinality": "many-to-one",
|
496 |
+
"description": "Indicates which process a project follows"
|
497 |
+
},
|
498 |
+
{
|
499 |
+
"name": "requires",
|
500 |
+
"domain": "Role",
|
501 |
+
"range": "Skill",
|
502 |
+
"inverse": "requiredFor",
|
503 |
+
"cardinality": "many-to-many",
|
504 |
+
"description": "Indicates which skills are required for a role"
|
505 |
+
},
|
506 |
+
{
|
507 |
+
"name": "possesses",
|
508 |
+
"domain": "Employee",
|
509 |
+
"range": "Skill",
|
510 |
+
"inverse": "possessedBy",
|
511 |
+
"cardinality": "many-to-many",
|
512 |
+
"description": "Indicates which skills an employee possesses"
|
513 |
+
},
|
514 |
+
{
|
515 |
+
"name": "measures",
|
516 |
+
"domain": "KPI",
|
517 |
+
"range": "Objective",
|
518 |
+
"inverse": "measuredBy",
|
519 |
+
"cardinality": "many-to-many",
|
520 |
+
"description": "Indicates which objectives a KPI measures"
|
521 |
+
},
|
522 |
+
{
|
523 |
+
"name": "affects",
|
524 |
+
"domain": "Risk",
|
525 |
+
"range": "Entity",
|
526 |
+
"inverse": "affectedBy",
|
527 |
+
"cardinality": "many-to-many",
|
528 |
+
"description": "Indicates which entities are affected by a risk"
|
529 |
+
},
|
530 |
+
{
|
531 |
+
"name": "mitigates",
|
532 |
+
"domain": "SecurityControl",
|
533 |
+
"range": "Risk",
|
534 |
+
"inverse": "mitigatedBy",
|
535 |
+
"cardinality": "many-to-many",
|
536 |
+
"description": "Indicates which risks are mitigated by a security control"
|
537 |
+
},
|
538 |
+
{
|
539 |
+
"name": "demonstrates",
|
540 |
+
"domain": "Employee",
|
541 |
+
"range": "Competency",
|
542 |
+
"inverse": "demonstratedBy",
|
543 |
+
"cardinality": "many-to-many",
|
544 |
+
"description": "Indicates which competencies an employee demonstrates"
|
545 |
+
},
|
546 |
+
{
|
547 |
+
"name": "influencedBy",
|
548 |
+
"domain": "Decision",
|
549 |
+
"range": "Stakeholder",
|
550 |
+
"inverse": "influences",
|
551 |
+
"cardinality": "many-to-many",
|
552 |
+
"description": "Indicates which stakeholders influence a decision"
|
553 |
+
},
|
554 |
+
{
|
555 |
+
"name": "implementedWith",
|
556 |
+
"domain": "Process",
|
557 |
+
"range": "Technology",
|
558 |
+
"inverse": "supports",
|
559 |
+
"cardinality": "many-to-many",
|
560 |
+
"description": "Indicates which technologies support a process"
|
561 |
+
}
|
562 |
+
],
|
563 |
+
"instances": [
|
564 |
+
{
|
565 |
+
"id": "product1",
|
566 |
+
"type": "Product",
|
567 |
+
"properties": {
|
568 |
+
"name": "Enterprise Analytics Suite",
|
569 |
+
"version": "2.1",
|
570 |
+
"status": "Active"
|
571 |
+
},
|
572 |
+
"relationships": [
|
573 |
+
{"type": "ownedBy", "target": "dept1"},
|
574 |
+
{"type": "hasLifecycle", "target": "lifecycle1"},
|
575 |
+
{"type": "optimizedBy", "target": "feedback1"}
|
576 |
+
]
|
577 |
+
},
|
578 |
+
{
|
579 |
+
"id": "product2",
|
580 |
+
"type": "Product",
|
581 |
+
"properties": {
|
582 |
+
"name": "Customer Portal",
|
583 |
+
"version": "1.5",
|
584 |
+
"status": "Active"
|
585 |
+
},
|
586 |
+
"relationships": [
|
587 |
+
{"type": "ownedBy", "target": "dept2"},
|
588 |
+
{"type": "hasLifecycle", "target": "lifecycle2"},
|
589 |
+
{"type": "optimizedBy", "target": "feedback2"}
|
590 |
+
]
|
591 |
+
},
|
592 |
+
{
|
593 |
+
"id": "dept1",
|
594 |
+
"type": "Department",
|
595 |
+
"properties": {
|
596 |
+
"name": "Engineering",
|
597 |
+
"function": "Product Development"
|
598 |
+
},
|
599 |
+
"relationships": [
|
600 |
+
{"type": "managedBy", "target": "manager1"},
|
601 |
+
{"type": "owns", "target": "product1"}
|
602 |
+
]
|
603 |
+
},
|
604 |
+
{
|
605 |
+
"id": "dept2",
|
606 |
+
"type": "Department",
|
607 |
+
"properties": {
|
608 |
+
"name": "Marketing",
|
609 |
+
"function": "Customer Engagement"
|
610 |
+
},
|
611 |
+
"relationships": [
|
612 |
+
{"type": "managedBy", "target": "manager2"},
|
613 |
+
{"type": "owns", "target": "product2"}
|
614 |
+
]
|
615 |
+
},
|
616 |
+
{
|
617 |
+
"id": "manager1",
|
618 |
+
"type": "Manager",
|
619 |
+
"properties": {
|
620 |
+
"name": "Jane Smith",
|
621 |
+
"role": "Engineering Director",
|
622 |
+
"managementLevel": "Director"
|
623 |
+
},
|
624 |
+
"relationships": [
|
625 |
+
{"type": "oversees", "target": "employee1"},
|
626 |
+
{"type": "oversees", "target": "employee2"},
|
627 |
+
{"type": "manages", "target": "dept1"}
|
628 |
+
]
|
629 |
+
},
|
630 |
+
{
|
631 |
+
"id": "manager2",
|
632 |
+
"type": "Manager",
|
633 |
+
"properties": {
|
634 |
+
"name": "Michael Chen",
|
635 |
+
"role": "Marketing Manager",
|
636 |
+
"managementLevel": "Manager"
|
637 |
+
},
|
638 |
+
"relationships": [
|
639 |
+
{"type": "oversees", "target": "employee3"},
|
640 |
+
{"type": "manages", "target": "dept2"}
|
641 |
+
]
|
642 |
+
},
|
643 |
+
{
|
644 |
+
"id": "employee1",
|
645 |
+
"type": "Employee",
|
646 |
+
"properties": {
|
647 |
+
"name": "John Doe",
|
648 |
+
"role": "Senior Developer"
|
649 |
+
},
|
650 |
+
"relationships": [
|
651 |
+
{"type": "worksOn", "target": "product1"},
|
652 |
+
{"type": "reportsToDirect", "target": "manager1"}
|
653 |
+
]
|
654 |
+
},
|
655 |
+
{
|
656 |
+
"id": "employee2",
|
657 |
+
"type": "Employee",
|
658 |
+
"properties": {
|
659 |
+
"name": "Sarah Johnson",
|
660 |
+
"role": "QA Engineer"
|
661 |
+
},
|
662 |
+
"relationships": [
|
663 |
+
{"type": "worksOn", "target": "product1"},
|
664 |
+
{"type": "reportsToDirect", "target": "manager1"}
|
665 |
+
]
|
666 |
+
},
|
667 |
+
{
|
668 |
+
"id": "employee3",
|
669 |
+
"type": "Employee",
|
670 |
+
"properties": {
|
671 |
+
"name": "David Wilson",
|
672 |
+
"role": "Marketing Specialist"
|
673 |
+
},
|
674 |
+
"relationships": [
|
675 |
+
{"type": "worksOn", "target": "product2"},
|
676 |
+
{"type": "reportsToDirect", "target": "manager2"}
|
677 |
+
]
|
678 |
+
},
|
679 |
+
{
|
680 |
+
"id": "customer1",
|
681 |
+
"type": "Customer",
|
682 |
+
"properties": {
|
683 |
+
"name": "Acme Corp",
|
684 |
+
"customerSince": "2020-05-15"
|
685 |
+
},
|
686 |
+
"relationships": [
|
687 |
+
{"type": "purchases", "target": "product1"},
|
688 |
+
{"type": "provides", "target": "feedback1"}
|
689 |
+
]
|
690 |
+
},
|
691 |
+
{
|
692 |
+
"id": "customer2",
|
693 |
+
"type": "Customer",
|
694 |
+
"properties": {
|
695 |
+
"name": "GlobalTech",
|
696 |
+
"customerSince": "2021-03-22"
|
697 |
+
},
|
698 |
+
"relationships": [
|
699 |
+
{"type": "purchases", "target": "product2"},
|
700 |
+
{"type": "provides", "target": "feedback2"}
|
701 |
+
]
|
702 |
+
},
|
703 |
+
{
|
704 |
+
"id": "feedback1",
|
705 |
+
"type": "Feedback",
|
706 |
+
"properties": {
|
707 |
+
"date": "2023-09-10",
|
708 |
+
"sentiment": "Positive",
|
709 |
+
"rating": 4.5,
|
710 |
+
"content": "The analytics dashboard is very intuitive and provides excellent insights.",
|
711 |
+
"suggestions": "Would like to see more export options."
|
712 |
+
},
|
713 |
+
"relationships": [
|
714 |
+
{"type": "providedBy", "target": "customer1"},
|
715 |
+
{"type": "pertainsTo", "target": "product1"},
|
716 |
+
{"type": "optimizes", "target": "product1"}
|
717 |
+
]
|
718 |
+
},
|
719 |
+
{
|
720 |
+
"id": "feedback2",
|
721 |
+
"type": "Feedback",
|
722 |
+
"properties": {
|
723 |
+
"date": "2023-10-05",
|
724 |
+
"sentiment": "Mixed",
|
725 |
+
"rating": 3.0,
|
726 |
+
"content": "The portal is functional but navigation could be improved.",
|
727 |
+
"suggestions": "Add better navigation and mobile support."
|
728 |
+
},
|
729 |
+
"relationships": [
|
730 |
+
{"type": "providedBy", "target": "customer2"},
|
731 |
+
{"type": "pertainsTo", "target": "product2"},
|
732 |
+
{"type": "optimizes", "target": "product2"}
|
733 |
+
]
|
734 |
+
},
|
735 |
+
{
|
736 |
+
"id": "lifecycle1",
|
737 |
+
"type": "Lifecycle",
|
738 |
+
"properties": {
|
739 |
+
"currentStage": "Maintenance",
|
740 |
+
"previousStages": ["Development", "Launch"]
|
741 |
+
},
|
742 |
+
"relationships": [
|
743 |
+
{"type": "lifecycleOf", "target": "product1"}
|
744 |
+
]
|
745 |
+
},
|
746 |
+
{
|
747 |
+
"id": "lifecycle2",
|
748 |
+
"type": "Lifecycle",
|
749 |
+
"properties": {
|
750 |
+
"currentStage": "Growth",
|
751 |
+
"previousStages": ["Development", "Launch"]
|
752 |
+
},
|
753 |
+
"relationships": [
|
754 |
+
{"type": "lifecycleOf", "target": "product2"}
|
755 |
+
]
|
756 |
+
},
|
757 |
+
{
|
758 |
+
"id": "platform1",
|
759 |
+
"type": "Platform",
|
760 |
+
"properties": {
|
761 |
+
"name": "Product Management System",
|
762 |
+
"version": "3.0",
|
763 |
+
"capabilities": ["Tracking", "Versioning", "Ownership Management"]
|
764 |
+
},
|
765 |
+
"relationships": [
|
766 |
+
{"type": "supports", "target": "product1"},
|
767 |
+
{"type": "supports", "target": "product2"}
|
768 |
+
]
|
769 |
+
}
|
770 |
+
]
|
771 |
+
}
|
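The `subClassOf` links above form an inheritance hierarchy (e.g. FinancialRisk → Risk → Entity) that can be resolved with a short traversal. A minimal sketch — the `ancestors` helper is hypothetical, and the inline `classes` dict stands in for the full file loaded via `json.load`:

```python
# Inline fragment mirroring the schema above; in the app this dict
# would come from json.load() on data/enterprise_ontology.json.
classes = {
    "Entity": {"subClassOf": None},
    "Risk": {"subClassOf": "Entity"},
    "FinancialRisk": {"subClassOf": "Risk"},
}

def ancestors(classes, name):
    """Return the inheritance chain from `name` up to the root class."""
    chain = []
    parent = classes[name]["subClassOf"]
    while parent is not None:
        chain.append(parent)
        parent = classes[parent]["subClassOf"]
    return chain

print(ancestors(classes, "FinancialRisk"))  # ['Risk', 'Entity']
```

Resolving this chain is what lets a retriever treat a FinancialRisk instance as a match for queries about risks in general.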
data/enterprise_ontology.txt
ADDED
@@ -0,0 +1,10 @@
Product is owned by Department.
Department is managed by Manager.
Employee works on Product.
Customer purchases Product and provides Feedback.
Platform supports Product tracking, versioning, and ownership.
Each Product has an associated Lifecycle.
Product Lifecycle includes stages like development, launch, maintenance, and retirement.
Manager oversees Employee performance and departmental goals.
Feedback includes sentiment, rating, and suggestions.
Platform uses AI agents to optimize Product development based on Feedback trends.
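The statements in this file follow a simple `Subject relation Object.` pattern with capitalized entity names, so they can be turned into triples with a small parser. A hedged sketch — the `parse_triple` helper and its regex are illustrative assumptions, not part of the app:

```python
import re

# Assumes entities are single capitalized words, which holds for the
# sentences in enterprise_ontology.txt; captures the first triple only.
TRIPLE_RE = re.compile(r"^([A-Z]\w+) ((?:\w+ )+?)([A-Z]\w+)[ .]")

def parse_triple(sentence: str):
    """Extract a (subject, relation, object) triple, or None if no match."""
    m = TRIPLE_RE.match(sentence)
    if not m:
        return None
    subject, relation, obj = m.groups()
    return subject, relation.strip(), obj

print(parse_triple("Product is owned by Department."))
# ('Product', 'is owned by', 'Department')
```

Such triples line up with the formal `relationships` entries in enterprise_ontology.json (e.g. `ownedBy` with domain Product and range Department).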
huggingface.yml
ADDED
@@ -0,0 +1,8 @@
title: Ontology RAG Demo
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.44.0
app_file: app.py
pinned: true
license: mit
requirements.txt
ADDED
@@ -0,0 +1,14 @@
streamlit>=1.44.0
openai>=1.2.0
langchain>=0.1.13
langchain-community>=0.0.21
langchain-openai>=0.0.5
faiss-cpu>=1.7.4
networkx>=3.1
pyvis>=0.3.2
plotly>=5.15.0
pandas>=2.0.0
matplotlib>=3.7.1
numpy>=1.24.3
pygraphviz>=1.10  # May require system dependencies, optional
pydantic>=1.10.8
src/__init__.py
ADDED
@@ -0,0 +1 @@
# Package initialization
src/knowledge_graph.py
ADDED
@@ -0,0 +1,920 @@
# src/knowledge_graph.py

import networkx as nx
from pyvis.network import Network
import json
from typing import Dict, List, Any, Optional, Set, Tuple
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
from collections import defaultdict

class KnowledgeGraph:
    """
    Handles the construction and visualization of knowledge graphs
    based on the ontology data.
    """

    def __init__(self, ontology_manager=None):
        """
        Initialize the knowledge graph handler.

        Args:
            ontology_manager: Optional ontology manager instance
        """
        self.ontology_manager = ontology_manager
        self.graph = None

        if ontology_manager:
            self.graph = ontology_manager.graph

    def build_visualization_graph(
        self,
        include_classes: bool = True,
        include_instances: bool = True,
        central_entity: Optional[str] = None,
        max_distance: int = 2,
        include_properties: bool = False
    ) -> nx.Graph:
        """
        Build a simplified graph for visualization purposes.

        Args:
            include_classes: Whether to include class nodes
            include_instances: Whether to include instance nodes
            central_entity: Optional central entity to focus the graph on
            max_distance: Maximum distance from central entity to include
            include_properties: Whether to include property nodes

        Returns:
            A NetworkX graph suitable for visualization
        """
        if not self.graph:
            return nx.Graph()

        # Create an undirected graph for visualization
        viz_graph = nx.Graph()

        # If we have a central entity, extract a subgraph around it
        if central_entity and central_entity in self.graph:
            # Get nodes within max_distance of central_entity
            nodes_to_include = set([central_entity])
            current_distance = 0
            current_layer = set([central_entity])

            while current_distance < max_distance:
                next_layer = set()
                for node in current_layer:
                    # Get neighbors
                    neighbors = set(self.graph.successors(node)).union(set(self.graph.predecessors(node)))
                    next_layer.update(neighbors)

                nodes_to_include.update(next_layer)
                current_layer = next_layer
                current_distance += 1

            # Create subgraph
            subgraph = self.graph.subgraph(nodes_to_include)
        else:
            subgraph = self.graph

        # Add nodes to the visualization graph
        for node, data in subgraph.nodes(data=True):
            node_type = data.get("type")

            # Skip nodes based on configuration
            if node_type == "class" and not include_classes:
                continue
            if node_type == "instance" and not include_instances:
                continue

            # Get readable name for the node
            if node_type == "instance" and "properties" in data:
                label = data["properties"].get("name", node)
            else:
                label = node

            # Set node attributes for visualization
            viz_attrs = {
                "id": node,
                "label": label,
                "title": self._get_node_tooltip(node, data),
                "group": data.get("class_type", node_type),
                "shape": "dot" if node_type == "instance" else "diamond"
            }

            # Highlight central entity if specified
            if central_entity and node == central_entity:
                viz_attrs["color"] = "#ff7f0e"  # Orange for central entity
                viz_attrs["size"] = 25  # Larger size for central entity

            # Add the node
            viz_graph.add_node(node, **viz_attrs)

            # Add property nodes if configured
            if include_properties and node_type == "instance" and "properties" in data:
                for prop_name, prop_value in data["properties"].items():
                    # Create a property node
                    prop_node_id = f"{node}_{prop_name}"
                    prop_value_str = str(prop_value)
                    if len(prop_value_str) > 20:
                        prop_value_str = prop_value_str[:17] + "..."

                    viz_graph.add_node(
                        prop_node_id,
                        id=prop_node_id,
                        label=f"{prop_name}: {prop_value_str}",
                        title=f"{prop_name}: {prop_value}",
                        group="property",
                        shape="ellipse",
                        size=5
                    )

                    # Connect instance to property
                    viz_graph.add_edge(node, prop_node_id, label="has_property", dashes=True)

        # Add edges to the visualization graph
        for source, target, data in subgraph.edges(data=True):
            # Only include edges between nodes that are in the viz_graph
            if source in viz_graph and target in viz_graph:
                # Skip property-related edges if we're manually creating them
                if include_properties and (
                    source.startswith(target + "_") or target.startswith(source + "_")
                ):
                    continue

                # Set edge attributes
                edge_type = data.get("type", "unknown")

                # Don't show subClassOf and instanceOf relationships if not explicitly requested
                if edge_type in ["subClassOf", "instanceOf"] and not include_classes:
                    continue

                viz_graph.add_edge(source, target, label=edge_type, title=edge_type)

        return viz_graph

    def _get_node_tooltip(self, node_id: str, data: Dict) -> str:
        """Generate a tooltip for a node."""
        tooltip = f"<strong>ID:</strong> {node_id}<br>"

        node_type = data.get("type")
        if node_type:
            tooltip += f"<strong>Type:</strong> {node_type}<br>"

            if node_type == "instance":
                tooltip += f"<strong>Class:</strong> {data.get('class_type', 'unknown')}<br>"

                # Add properties
                if "properties" in data:
                    tooltip += "<strong>Properties:</strong><br>"
                    for key, value in data["properties"].items():
                        tooltip += f"- {key}: {value}<br>"

            elif node_type == "class":
                tooltip += f"<strong>Description:</strong> {data.get('description', '')}<br>"

                # Add properties if available
                if "properties" in data:
                    tooltip += "<strong>Properties:</strong> " + ", ".join(data["properties"]) + "<br>"

        return tooltip

    def generate_html_visualization(
        self,
        include_classes: bool = True,
        include_instances: bool = True,
        central_entity: Optional[str] = None,
        max_distance: int = 2,
        include_properties: bool = False,
        height: str = "600px",
        width: str = "100%",
        bgcolor: str = "#ffffff",
        font_color: str = "#000000",
        layout_algorithm: str = "force-directed"
    ) -> str:
        """
        Generate an HTML visualization of the knowledge graph.

        Args:
            include_classes: Whether to include class nodes
            include_instances: Whether to include instance nodes
            central_entity: Optional central entity to focus the graph on
            max_distance: Maximum distance from central entity to include
            include_properties: Whether to include property nodes
            height: Height of the visualization
            width: Width of the visualization
            bgcolor: Background color
            font_color: Font color
            layout_algorithm: Algorithm for layout ('force-directed', 'hierarchical', 'radial', 'circular')

        Returns:
            HTML string containing the visualization
        """
        # Build the visualization graph
        viz_graph = self.build_visualization_graph(
            include_classes=include_classes,
            include_instances=include_instances,
            central_entity=central_entity,
            max_distance=max_distance,
            include_properties=include_properties
        )

        # Create a PyVis network
        net = Network(height=height, width=width, bgcolor=bgcolor, font_color=font_color, directed=True)

        # Configure physics based on the selected layout algorithm
        if layout_algorithm == "force-directed":
            physics_options = {
                "enabled": True,
                "solver": "forceAtlas2Based",
                "forceAtlas2Based": {
                    "gravitationalConstant": -50,
                    "centralGravity": 0.01,
                    "springLength": 100,
                    "springConstant": 0.08
                },
                "stabilization": {
                    "enabled": True,
                    "iterations": 100
                }
            }
        elif layout_algorithm == "hierarchical":
            physics_options = {
                "enabled": True,
                "hierarchicalRepulsion": {
                    "centralGravity": 0.0,
                    "springLength": 100,
                    "springConstant": 0.01,
                    "nodeDistance": 120
                },
                "solver": "hierarchicalRepulsion",
                "stabilization": {
                    "enabled": True,
                    "iterations": 100
                }
            }

            # Set hierarchical layout
            net.set_options("""
            var options = {
                "layout": {
                    "hierarchical": {
                        "enabled": true,
                        "direction": "UD",
                        "sortMethod": "directed",
                        "nodeSpacing": 150,
                        "treeSpacing": 200
                    }
                }
            }
            """)
        elif layout_algorithm == "radial":
            physics_options = {
                "enabled": True,
                "solver": "repulsion",
                "repulsion": {
                    "nodeDistance": 120,
                    "centralGravity": 0.2,
                    "springLength": 200,
                    "springConstant": 0.05
                },
                "stabilization": {
                    "enabled": True,
                    "iterations": 100
                }
            }
        elif layout_algorithm == "circular":
            physics_options = {
                "enabled": False
            }

            # Compute circular layout and set fixed positions
            pos = nx.circular_layout(viz_graph)
            for node_id, coords in pos.items():
                if node_id in viz_graph.nodes:
                    x, y = coords
                    viz_graph.nodes[node_id]['x'] = float(x) * 500
                    viz_graph.nodes[node_id]['y'] = float(y) * 500
                    viz_graph.nodes[node_id]['physics'] = False
|
299 |
+
|
300 |
+
# Configure other options
|
301 |
+
options = {
|
302 |
+
"nodes": {
|
303 |
+
"font": {"size": 12},
|
304 |
+
"scaling": {"min": 10, "max": 30}
|
305 |
+
},
|
306 |
+
"edges": {
|
307 |
+
"color": {"inherit": True},
|
308 |
+
"smooth": {"enabled": True, "type": "dynamic"},
|
309 |
+
"arrows": {"to": {"enabled": True, "scaleFactor": 0.5}},
|
310 |
+
"font": {"size": 10, "align": "middle"}
|
311 |
+
},
|
312 |
+
"physics": physics_options,
|
313 |
+
"interaction": {
|
314 |
+
"hover": True,
|
315 |
+
"navigationButtons": True,
|
316 |
+
"keyboard": True,
|
317 |
+
"tooltipDelay": 100
|
318 |
+
}
|
319 |
+
}
|
320 |
+
|
321 |
+
# Set options and create the network
|
322 |
+
net.options = options
|
323 |
+
net.from_nx(viz_graph)
|
324 |
+
|
325 |
+
# Add custom CSS for better visualization
|
326 |
+
custom_css = """
|
327 |
+
<style>
|
328 |
+
.vis-network {
|
329 |
+
border: 1px solid #ddd;
|
330 |
+
border-radius: 5px;
|
331 |
+
}
|
332 |
+
.vis-tooltip {
|
333 |
+
position: absolute;
|
334 |
+
background-color: #f5f5f5;
|
335 |
+
border: 1px solid #ccc;
|
336 |
+
border-radius: 4px;
|
337 |
+
padding: 10px;
|
338 |
+
font-family: Arial, sans-serif;
|
339 |
+
font-size: 12px;
|
340 |
+
color: #333;
|
341 |
+
max-width: 300px;
|
342 |
+
z-index: 9999;
|
343 |
+
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
|
344 |
+
}
|
345 |
+
</style>
|
346 |
+
"""
|
347 |
+
|
348 |
+
# Generate the HTML and add custom CSS
|
349 |
+
html = net.generate_html()
|
350 |
+
html = html.replace("<style>", custom_css + "<style>")
|
351 |
+
|
352 |
+
# Add legend
|
353 |
+
legend_html = self._generate_legend_html(viz_graph)
|
354 |
+
html = html.replace("</body>", legend_html + "</body>")
|
355 |
+
|
356 |
+
return html
|
357 |
+
|
358 |
+
    def _generate_legend_html(self, graph: nx.Graph) -> str:
        """Generate a legend for the visualization."""
        # Collect unique groups
        groups = set()
        for _, attrs in graph.nodes(data=True):
            if "group" in attrs:
                groups.add(attrs["group"])

        # Generate HTML for legend
        legend_html = """
        <div id="graph-legend" style="position: absolute; top: 10px; right: 10px; background-color: rgba(255,255,255,0.8);
             padding: 10px; border-radius: 5px; border: 1px solid #ddd; max-width: 200px;">
            <strong>Legend:</strong>
            <ul style="list-style-type: none; padding-left: 0; margin-top: 5px;">
        """

        # Add items for each group
        for group in sorted(groups):
            color = "#97c2fc"  # Default color
            if group == "property":
                color = "#ffcc99"
            elif group == "class":
                color = "#a1d3a2"

            legend_html += f"""
            <li style="margin-bottom: 5px;">
                <span style="display: inline-block; width: 12px; height: 12px; border-radius: 50%;
                      background-color: {color}; margin-right: 5px;"></span>
                {group}
            </li>
            """

        # Close the legend container
        legend_html += """
            </ul>
            <div style="font-size: 10px; margin-top: 5px; color: #666;">
                Double-click to zoom, drag to pan, scroll to zoom in/out
            </div>
        </div>
        """

        return legend_html

    def get_graph_statistics(self) -> Dict[str, Any]:
        """
        Calculate statistics about the knowledge graph.

        Returns:
            A dictionary containing graph statistics
        """
        if not self.graph:
            return {}

        # Count nodes by type
        class_count = 0
        instance_count = 0
        property_count = 0

        for _, data in self.graph.nodes(data=True):
            node_type = data.get("type")
            if node_type == "class":
                class_count += 1
            elif node_type == "instance":
                instance_count += 1
                if "properties" in data:
                    property_count += len(data["properties"])

        # Count edges by type
        relationship_counts = {}
        for _, _, data in self.graph.edges(data=True):
            rel_type = data.get("type", "unknown")
            relationship_counts[rel_type] = relationship_counts.get(rel_type, 0) + 1

        # Calculate graph metrics
        try:
            # Some metrics only work on undirected graphs
            undirected = nx.Graph(self.graph)
            avg_degree = sum(dict(undirected.degree()).values()) / undirected.number_of_nodes()

            # Only calculate these if the graph is connected
            if nx.is_connected(undirected):
                avg_path_length = nx.average_shortest_path_length(undirected)
                diameter = nx.diameter(undirected)
            else:
                # Fall back to the largest connected component
                largest_cc = max(nx.connected_components(undirected), key=len)
                largest_cc_subgraph = undirected.subgraph(largest_cc)

                avg_path_length = nx.average_shortest_path_length(largest_cc_subgraph)
                diameter = nx.diameter(largest_cc_subgraph)

            # Calculate density
            density = nx.density(self.graph)

            # Calculate clustering coefficient
            clustering = nx.average_clustering(undirected)
        except Exception:
            # Degenerate graphs (e.g. empty or single-node) default to zero metrics
            avg_degree = 0
            avg_path_length = 0
            diameter = 0
            density = 0
            clustering = 0

        # Count instances per class type
        class_counts = defaultdict(int)
        for _, data in self.graph.nodes(data=True):
            if data.get("type") == "instance":
                class_type = data.get("class_type", "unknown")
                class_counts[class_type] += 1

        # Get nodes with highest centrality
        try:
            betweenness = nx.betweenness_centrality(self.graph)
            degree = nx.degree_centrality(self.graph)

            # Get top 5 nodes by each centrality measure
            top_betweenness = sorted(betweenness.items(), key=lambda x: x[1], reverse=True)[:5]
            top_degree = sorted(degree.items(), key=lambda x: x[1], reverse=True)[:5]

            central_nodes = {
                "betweenness": [{"node": node, "centrality": round(cent, 3)} for node, cent in top_betweenness],
                "degree": [{"node": node, "centrality": round(cent, 3)} for node, cent in top_degree]
            }
        except Exception:
            central_nodes = {}

        return {
            "node_count": self.graph.number_of_nodes(),
            "edge_count": self.graph.number_of_edges(),
            "class_count": class_count,
            "instance_count": instance_count,
            "property_count": property_count,
            "relationship_counts": relationship_counts,
            "class_instance_counts": dict(class_counts),
            "average_degree": avg_degree,
            "average_path_length": avg_path_length,
            "diameter": diameter,
            "density": density,
            "clustering_coefficient": clustering,
            "central_nodes": central_nodes
        }

    def find_paths_between_entities(
        self,
        source_entity: str,
        target_entity: str,
        max_length: int = 3
    ) -> List[List[Dict]]:
        """
        Find all paths between two entities up to a maximum length.

        Args:
            source_entity: Starting entity ID
            target_entity: Target entity ID
            max_length: Maximum path length

        Returns:
            A list of paths, where each path is a list of edge dictionaries
        """
        if not self.graph or source_entity not in self.graph or target_entity not in self.graph:
            return []

        # Use networkx to find simple paths
        try:
            simple_paths = list(nx.all_simple_paths(
                self.graph, source_entity, target_entity, cutoff=max_length
            ))
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            return []

        # Convert paths to edge sequences
        paths = []
        for path in simple_paths:
            edge_sequence = []
            for i in range(len(path) - 1):
                source = path[i]
                target = path[i + 1]

                # There may be multiple edges between nodes
                edges = self.graph.get_edge_data(source, target)
                if edges:
                    for key, data in edges.items():
                        edge_sequence.append({
                            "source": source,
                            "target": target,
                            "type": data.get("type", "unknown")
                        })

            # Only include the path if it has meaningful relationships:
            # filter out paths that only contain structural relationships (subClassOf, instanceOf)
            meaningful_relationships = [edge for edge in edge_sequence
                                        if edge["type"] not in ["subClassOf", "instanceOf"]]

            if meaningful_relationships:
                paths.append(edge_sequence)

        # Sort paths by length (shorter paths first)
        paths.sort(key=len)

        return paths

    def get_entity_neighborhood(
        self,
        entity_id: str,
        max_distance: int = 1,
        include_classes: bool = True
    ) -> Dict[str, Any]:
        """
        Get the neighborhood of an entity.

        Args:
            entity_id: The central entity ID
            max_distance: Maximum distance from the central entity
            include_classes: Whether to include class relationships

        Returns:
            A dictionary containing the neighborhood information
        """
        if not self.graph or entity_id not in self.graph:
            return {}

        # Get nodes within max_distance of entity_id using BFS
        nodes_at_distance = {0: [entity_id]}
        visited = set([entity_id])

        for distance in range(1, max_distance + 1):
            nodes_at_distance[distance] = []

            for node in nodes_at_distance[distance - 1]:
                # Get neighbors in both directions
                neighbors = list(self.graph.successors(node)) + list(self.graph.predecessors(node))

                for neighbor in neighbors:
                    # Skip class nodes if not including classes
                    neighbor_data = self.graph.nodes.get(neighbor, {})
                    if not include_classes and neighbor_data.get("type") == "class":
                        continue

                    if neighbor not in visited:
                        nodes_at_distance[distance].append(neighbor)
                        visited.add(neighbor)

        # Flatten the nodes
        all_nodes = [node for nodes in nodes_at_distance.values() for node in nodes]

        # Extract the subgraph
        subgraph = self.graph.subgraph(all_nodes)

        # Build neighbor information
        neighbors = []
        for node in all_nodes:
            if node == entity_id:
                continue

            node_data = self.graph.nodes[node]

            # Determine the relations to the central entity
            relations = []

            # Check direct relationships where the central entity is the source
            edges_out = self.graph.get_edge_data(entity_id, node)
            if edges_out:
                for key, data in edges_out.items():
                    rel_type = data.get("type", "unknown")

                    # Skip structural relationships if not including classes
                    if not include_classes and rel_type in ["subClassOf", "instanceOf"]:
                        continue

                    relations.append({
                        "type": rel_type,
                        "direction": "outgoing"
                    })

            # Check direct relationships where the central entity is the target
            edges_in = self.graph.get_edge_data(node, entity_id)
            if edges_in:
                for key, data in edges_in.items():
                    rel_type = data.get("type", "unknown")

                    # Skip structural relationships if not including classes
                    if not include_classes and rel_type in ["subClassOf", "instanceOf"]:
                        continue

                    relations.append({
                        "type": rel_type,
                        "direction": "incoming"
                    })

            # Also find paths through intermediate nodes, but only when
            # no direct relationship exists
            if not relations:
                for path_length in range(2, max_distance + 1):
                    try:
                        # all_simple_paths has no "min_edges" argument, so find paths
                        # up to path_length and keep those of exactly that length
                        paths = [
                            p for p in nx.all_simple_paths(
                                self.graph, entity_id, node, cutoff=path_length
                            )
                            if len(p) - 1 == path_length
                        ]

                        for path in paths:
                            if len(path) > 1:  # Path should have at least 2 nodes
                                intermediate_nodes = path[1:-1]  # Skip source and target

                                # Format the path as a relation
                                path_relation = {
                                    "type": "indirect_connection",
                                    "direction": "outgoing",
                                    "path_length": len(path) - 1,
                                    "intermediates": intermediate_nodes
                                }

                                relations.append(path_relation)

                                # Only need one example of an indirect path
                                break
                    except (nx.NetworkXNoPath, nx.NodeNotFound):
                        pass

            # Only include neighbors with relations
            if relations:
                neighbors.append({
                    "id": node,
                    "type": node_data.get("type"),
                    "class_type": node_data.get("class_type"),
                    "properties": node_data.get("properties", {}),
                    "relations": relations,
                    "distance": next(dist for dist, nodes in nodes_at_distance.items() if node in nodes)
                })

        # Group neighbors by distance
        neighbors_by_distance = defaultdict(list)
        for neighbor in neighbors:
            neighbors_by_distance[neighbor["distance"]].append(neighbor)

        # Get central entity info
        central_data = self.graph.nodes[entity_id]

        return {
            "central_entity": {
                "id": entity_id,
                "type": central_data.get("type"),
                "class_type": central_data.get("class_type", ""),
                "properties": central_data.get("properties", {})
            },
            "neighbors": neighbors,
            "neighbors_by_distance": dict(neighbors_by_distance),
            "total_neighbors": len(neighbors)
        }

    def find_common_patterns(self) -> List[Dict[str, Any]]:
        """
        Find common patterns and structures in the knowledge graph.

        Returns:
            A list of pattern dictionaries
        """
        if not self.graph:
            return []

        patterns = []

        # Find common relationship patterns
        relationship_patterns = self._find_relationship_patterns()
        if relationship_patterns:
            patterns.extend(relationship_patterns)

        # Find hub entities (entities with many connections)
        hub_entities = self._find_hub_entities()
        if hub_entities:
            patterns.append({
                "type": "hub_entities",
                "description": "Entities with high connectivity serving as knowledge hubs",
                "entities": hub_entities
            })

        # Find common property patterns
        property_patterns = self._find_property_patterns()
        if property_patterns:
            patterns.extend(property_patterns)

        return patterns

    def _find_relationship_patterns(self) -> List[Dict[str, Any]]:
        """Find common relationship patterns in the graph."""
        # Count relationship triplets (source_type, relation, target_type)
        triplet_counts = defaultdict(int)

        for source, target, data in self.graph.edges(data=True):
            rel_type = data.get("type", "unknown")

            # Skip structural relationships
            if rel_type in ["subClassOf", "instanceOf"]:
                continue

            # Get node types
            source_data = self.graph.nodes[source]
            target_data = self.graph.nodes[target]

            source_type = (
                source_data.get("class_type")
                if source_data.get("type") == "instance"
                else source_data.get("type")
            )

            target_type = (
                target_data.get("class_type")
                if target_data.get("type") == "instance"
                else target_data.get("type")
            )

            if source_type and target_type:
                triplet = (source_type, rel_type, target_type)
                triplet_counts[triplet] += 1

        # Keep patterns with significant frequency (more than one occurrence)
        patterns = []
        for triplet, count in triplet_counts.items():
            if count > 1:
                source_type, rel_type, target_type = triplet

                # Find examples of this pattern
                examples = []
                for source, target, data in self.graph.edges(data=True):
                    if len(examples) >= 3:  # Limit to 3 examples
                        break

                    rel = data.get("type", "unknown")
                    if rel != rel_type:
                        continue

                    source_data = self.graph.nodes[source]
                    target_data = self.graph.nodes[target]

                    current_source_type = (
                        source_data.get("class_type")
                        if source_data.get("type") == "instance"
                        else source_data.get("type")
                    )

                    current_target_type = (
                        target_data.get("class_type")
                        if target_data.get("type") == "instance"
                        else target_data.get("type")
                    )

                    if current_source_type == source_type and current_target_type == target_type:
                        # Get readable names if available
                        source_name = source
                        if source_data.get("type") == "instance" and "properties" in source_data:
                            properties = source_data["properties"]
                            if "name" in properties:
                                source_name = properties["name"]

                        target_name = target
                        if target_data.get("type") == "instance" and "properties" in target_data:
                            properties = target_data["properties"]
                            if "name" in properties:
                                target_name = properties["name"]

                        examples.append({
                            "source": source,
                            "source_name": source_name,
                            "target": target,
                            "target_name": target_name,
                            "relationship": rel_type
                        })

                patterns.append({
                    "type": "relationship_pattern",
                    "description": f"{source_type} {rel_type} {target_type}",
                    "source_type": source_type,
                    "relationship": rel_type,
                    "target_type": target_type,
                    "count": count,
                    "examples": examples
                })

        # Sort by frequency
        patterns.sort(key=lambda x: x["count"], reverse=True)

        return patterns

    def _find_hub_entities(self) -> List[Dict[str, Any]]:
        """Find entities that serve as hubs (many connections)."""
        # Calculate degree centrality
        degree = nx.degree_centrality(self.graph)

        # Get top entities by degree
        top_entities = sorted(degree.items(), key=lambda x: x[1], reverse=True)[:10]

        hub_entities = []
        for node, centrality in top_entities:
            node_data = self.graph.nodes[node]
            node_type = node_data.get("type")

            # Only consider instance nodes
            if node_type == "instance":
                # Get class type
                class_type = node_data.get("class_type", "unknown")

                # Get name if available
                name = node
                if "properties" in node_data and "name" in node_data["properties"]:
                    name = node_data["properties"]["name"]

                # Count relationships by type
                relationships = defaultdict(int)
                for _, _, data in self.graph.edges([node], data=True):
                    rel_type = data.get("type", "unknown")
                    if rel_type not in ["subClassOf", "instanceOf"]:
                        relationships[rel_type] += 1

                hub_entities.append({
                    "id": node,
                    "name": name,
                    "type": class_type,
                    "centrality": centrality,
                    "relationships": dict(relationships),
                    "total_connections": sum(relationships.values())
                })

        # Sort by total connections
        hub_entities.sort(key=lambda x: x["total_connections"], reverse=True)

        return hub_entities

    def _find_property_patterns(self) -> List[Dict[str, Any]]:
        """Find common property patterns in instance data."""
        # Track properties by class type
        properties_by_class = defaultdict(lambda: defaultdict(int))

        for node, data in self.graph.nodes(data=True):
            if data.get("type") == "instance":
                class_type = data.get("class_type", "unknown")

                if "properties" in data:
                    for prop in data["properties"].keys():
                        properties_by_class[class_type][prop] += 1

        # Find common property combinations
        patterns = []
        for class_type, props in properties_by_class.items():
            # Sort properties by frequency
            sorted_props = sorted(props.items(), key=lambda x: x[1], reverse=True)

            # Only include classes with multiple instances
            class_instances = sum(1 for _, data in self.graph.nodes(data=True)
                                  if data.get("type") == "instance" and data.get("class_type") == class_type)

            if class_instances > 1:
                common_props = [prop for prop, count in sorted_props if count > 1]

                if common_props:
                    patterns.append({
                        "type": "property_pattern",
                        "description": f"Common properties for {class_type} instances",
                        "class_type": class_type,
                        "instance_count": class_instances,
                        "common_properties": common_props,
                        "property_frequencies": {prop: count for prop, count in sorted_props}
                    })

        return patterns
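The connectivity fallback used in `get_graph_statistics` above (path-length metrics computed on the largest connected component when the graph is disconnected) can be exercised in isolation. A minimal sketch with networkx and a toy graph of our own, not data from the app:

```python
import networkx as nx

# A disconnected graph: a triangle plus an isolated edge.
g = nx.Graph()
g.add_edges_from([("a", "b"), ("b", "c"), ("c", "a"), ("x", "y")])

# Average degree over all nodes: degree sum 8 over 5 nodes = 1.6
avg_degree = sum(dict(g.degree()).values()) / g.number_of_nodes()

if nx.is_connected(g):
    diameter = nx.diameter(g)
else:
    # Fall back to the largest connected component, as get_graph_statistics does
    largest_cc = max(nx.connected_components(g), key=len)
    diameter = nx.diameter(g.subgraph(largest_cc))  # triangle -> diameter 1

print(avg_degree, diameter)  # 1.6 1
```

Without the fallback, `nx.diameter` would raise `NetworkXError` on any disconnected graph, which is why the statistics code checks `is_connected` first.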
src/ontology_manager.py
ADDED
@@ -0,0 +1,440 @@
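The `_build_graph` loader below reads three top-level keys from the ontology JSON. A minimal sketch of that shape, inferred from the code (field names `classes`, `relationships`, `instances`, `subClassOf`, `id`, `type`, `properties`, `target`, and `name` come from the loader; the concrete values, and the `domain`/`range` fields, are illustrative assumptions):

```json
{
  "classes": {
    "Person": {"description": "A human being", "properties": ["name"]},
    "Employee": {
      "description": "A person employed by the organization",
      "subClassOf": "Person",
      "properties": ["name", "role"]
    }
  },
  "relationships": [
    {"name": "worksFor", "domain": "Employee", "range": "Organization"}
  ],
  "instances": [
    {
      "id": "emp1",
      "type": "Employee",
      "properties": {"name": "Alice", "role": "Engineer"},
      "relationships": [{"type": "worksFor", "target": "org1"}]
    }
  ]
}
```

Note that `_build_graph` only indexes relationships by `name`; any extra fields on a relationship entry are carried along but not interpreted here.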
# src/ontology_manager.py

import json
import networkx as nx
from typing import Dict, List, Any, Optional, Union, Set


class OntologyManager:
    """
    Manages the ontology model and provides methods for querying and navigating
    the ontological structure.
    """

    def __init__(self, ontology_path: str):
        """
        Initialize the ontology manager with a path to the ontology JSON file.

        Args:
            ontology_path: Path to the JSON file containing the ontology model
        """
        self.ontology_path = ontology_path
        self.ontology_data = self._load_ontology()
        self.graph = self._build_graph()

    def _load_ontology(self) -> Dict:
        """Load the ontology from the JSON file."""
        with open(self.ontology_path, 'r') as f:
            return json.load(f)

    def _build_graph(self) -> nx.MultiDiGraph:
        """Construct a directed graph from the ontology data."""
        G = nx.MultiDiGraph()

        # Add class nodes
        for class_id, class_data in self.ontology_data["classes"].items():
            G.add_node(class_id,
                       type="class",
                       description=class_data.get("description", ""),
                       properties=class_data.get("properties", []))

            # Add subclass relationships
            if "subClassOf" in class_data:
                G.add_edge(class_id, class_data["subClassOf"], type="subClassOf")

        # Index relationship type information by name
        self.relationship_info = {r["name"]: r for r in self.ontology_data["relationships"]}

        # Add instance nodes and their relationships
        for instance in self.ontology_data["instances"]:
            G.add_node(instance["id"],
                       type="instance",
                       class_type=instance["type"],
                       properties=instance.get("properties", {}))

            # Add instance-of-class relationship
            G.add_edge(instance["id"], instance["type"], type="instanceOf")

            # Add relationships between instances
            for rel in instance.get("relationships", []):
                G.add_edge(instance["id"], rel["target"], type=rel["type"])

        return G

    def get_classes(self) -> List[str]:
        """Return a list of all class names in the ontology."""
        return list(self.ontology_data["classes"].keys())

    def get_class_hierarchy(self) -> Dict[str, List[str]]:
        """Return a dictionary mapping each class to its subclasses."""
        hierarchy = {}
        for class_id in self.get_classes():
            hierarchy[class_id] = []

        for class_id, class_data in self.ontology_data["classes"].items():
            if "subClassOf" in class_data:
                parent = class_data["subClassOf"]
                if parent in hierarchy:
                    hierarchy[parent].append(class_id)

        return hierarchy

    def get_instances_of_class(self, class_name: str, include_subclasses: bool = True) -> List[str]:
        """
        Get all instances of a given class.

        Args:
            class_name: The name of the class
            include_subclasses: Whether to include instances of subclasses

        Returns:
            A list of instance IDs
        """
        if include_subclasses:
            # Get all subclasses recursively
            subclasses = set(self._get_all_subclasses(class_name))
            subclasses.add(class_name)

            # Get instances of every class in the set
            instances = []
            for class_id in subclasses:
                instances.extend([
                    n for n, attr in self.graph.nodes(data=True)
                    if attr.get("type") == "instance" and attr.get("class_type") == class_id
                ])
            return instances
        else:
            # Just get direct instances
            return [
                n for n, attr in self.graph.nodes(data=True)
                if attr.get("type") == "instance" and attr.get("class_type") == class_name
            ]

    def _get_all_subclasses(self, class_name: str) -> List[str]:
        """Recursively get all subclasses of a given class."""
        subclasses = []
        direct_subclasses = [
            src for src, dst, data in self.graph.edges(data=True)
            if dst == class_name and data.get("type") == "subClassOf"
        ]

        for subclass in direct_subclasses:
            subclasses.append(subclass)
            subclasses.extend(self._get_all_subclasses(subclass))

        return subclasses

    def get_relationships(self, entity_id: str, relationship_type: Optional[str] = None) -> List[Dict]:
        """
        Get all relationships for a given entity, optionally filtered by type.

        Args:
            entity_id: The ID of the entity
            relationship_type: Optional relationship type to filter by

        Returns:
            A list of dictionaries containing relationship information
        """
        relationships = []

        # Look at outgoing edges
        for _, target, data in self.graph.out_edges(entity_id, data=True):
            rel_type = data.get("type")
            if rel_type not in ("instanceOf", "subClassOf"):
                if relationship_type is None or rel_type == relationship_type:
                    relationships.append({
                        "type": rel_type,
                        "target": target,
                        "direction": "outgoing"
                    })

        # Look at incoming edges
        for source, _, data in self.graph.in_edges(entity_id, data=True):
            rel_type = data.get("type")
            if rel_type not in ("instanceOf", "subClassOf"):
                if relationship_type is None or rel_type == relationship_type:
                    relationships.append({
                        "type": rel_type,
                        "source": source,
                        "direction": "incoming"
                    })

        return relationships

    def find_paths(self, source_id: str, target_id: str, max_length: int = 3) -> List[List[Dict]]:
        """
        Find all paths between two entities up to a maximum length.

        Args:
            source_id: Starting entity ID
            target_id: Target entity ID
            max_length: Maximum path length

        Returns:
            A list of paths, where each path is a list of relationship dictionaries
        """
        paths = []

        # Use networkx to find simple paths
        simple_paths = nx.all_simple_paths(self.graph, source_id, target_id, cutoff=max_length)

        for path in simple_paths:
            path_with_edges = []
            for i in range(len(path) - 1):
                source = path[i]
                target = path[i + 1]
                # There may be multiple edges between nodes
                edges = self.graph.get_edge_data(source, target)
                if edges:
                    for key, data in edges.items():
                        path_with_edges.append({
                            "source": source,
                            "target": target,
                            "type": data.get("type", "unknown")
                        })
            paths.append(path_with_edges)

        return paths

    def get_entity_info(self, entity_id: str) -> Dict:
        """
        Get detailed information about an entity.

        Args:
            entity_id: The ID of the entity

        Returns:
            A dictionary with entity information
        """
        if entity_id not in self.graph:
            return {}

        node_data = self.graph.nodes[entity_id]
        entity_type = node_data.get("type")
|
215 |
+
|
216 |
+
if entity_type == "instance":
|
217 |
+
# Get class information
|
218 |
+
class_type = node_data.get("class_type")
|
219 |
+
class_info = self.ontology_data["classes"].get(class_type, {})
|
220 |
+
|
221 |
+
return {
|
222 |
+
"id": entity_id,
|
223 |
+
"type": entity_type,
|
224 |
+
"class": class_type,
|
225 |
+
"class_description": class_info.get("description", ""),
|
226 |
+
"properties": node_data.get("properties", {}),
|
227 |
+
"relationships": self.get_relationships(entity_id)
|
228 |
+
}
|
229 |
+
elif entity_type == "class":
|
230 |
+
return {
|
231 |
+
"id": entity_id,
|
232 |
+
"type": entity_type,
|
233 |
+
"description": node_data.get("description", ""),
|
234 |
+
"properties": node_data.get("properties", []),
|
235 |
+
"subclasses": self._get_all_subclasses(entity_id),
|
236 |
+
"instances": self.get_instances_of_class(entity_id)
|
237 |
+
}
|
238 |
+
|
239 |
+
return node_data
|
240 |
+
|
241 |
+
def get_text_representation(self) -> str:
|
242 |
+
"""
|
243 |
+
Generate a text representation of the ontology for embedding.
|
244 |
+
|
245 |
+
Returns:
|
246 |
+
A string containing the textual representation of the ontology
|
247 |
+
"""
|
248 |
+
text_chunks = []
|
249 |
+
|
250 |
+
# Class definitions
|
251 |
+
for class_id, class_data in self.ontology_data["classes"].items():
|
252 |
+
chunk = f"Class: {class_id}\n"
|
253 |
+
chunk += f"Description: {class_data.get('description', '')}\n"
|
254 |
+
|
255 |
+
if "subClassOf" in class_data:
|
256 |
+
chunk += f"{class_id} is a subclass of {class_data['subClassOf']}.\n"
|
257 |
+
|
258 |
+
if "properties" in class_data:
|
259 |
+
chunk += f"{class_id} has properties: {', '.join(class_data['properties'])}.\n"
|
260 |
+
|
261 |
+
text_chunks.append(chunk)
|
262 |
+
|
263 |
+
# Relationship definitions
|
264 |
+
for rel in self.ontology_data["relationships"]:
|
265 |
+
chunk = f"Relationship: {rel['name']}\n"
|
266 |
+
chunk += f"Domain: {rel['domain']}, Range: {rel['range']}\n"
|
267 |
+
chunk += f"Description: {rel.get('description', '')}\n"
|
268 |
+
chunk += f"Cardinality: {rel.get('cardinality', 'many-to-many')}\n"
|
269 |
+
|
270 |
+
if "inverse" in rel:
|
271 |
+
chunk += f"The inverse relationship is {rel['inverse']}.\n"
|
272 |
+
|
273 |
+
text_chunks.append(chunk)
|
274 |
+
|
275 |
+
# Rules
|
276 |
+
for rule in self.ontology_data.get("rules", []):
|
277 |
+
chunk = f"Rule: {rule.get('id', '')}\n"
|
278 |
+
chunk += f"Description: {rule.get('description', '')}\n"
|
279 |
+
text_chunks.append(chunk)
|
280 |
+
|
281 |
+
# Instance data
|
282 |
+
for instance in self.ontology_data["instances"]:
|
283 |
+
chunk = f"Instance: {instance['id']}\n"
|
284 |
+
chunk += f"Type: {instance['type']}\n"
|
285 |
+
|
286 |
+
# Properties
|
287 |
+
if "properties" in instance:
|
288 |
+
props = []
|
289 |
+
for key, value in instance["properties"].items():
|
290 |
+
if isinstance(value, list):
|
291 |
+
props.append(f"{key}: {', '.join(str(v) for v in value)}")
|
292 |
+
else:
|
293 |
+
props.append(f"{key}: {value}")
|
294 |
+
|
295 |
+
if props:
|
296 |
+
chunk += "Properties:\n- " + "\n- ".join(props) + "\n"
|
297 |
+
|
298 |
+
# Relationships
|
299 |
+
if "relationships" in instance:
|
300 |
+
rels = []
|
301 |
+
for rel in instance["relationships"]:
|
302 |
+
rels.append(f"{rel['type']} {rel['target']}")
|
303 |
+
|
304 |
+
if rels:
|
305 |
+
chunk += "Relationships:\n- " + "\n- ".join(rels) + "\n"
|
306 |
+
|
307 |
+
text_chunks.append(chunk)
|
308 |
+
|
309 |
+
return "\n\n".join(text_chunks)
|
310 |
+
|
311 |
+
def query_by_relationship(self, source_type: str, relationship: str, target_type: str) -> List[Dict]:
|
312 |
+
"""
|
313 |
+
Query for instances connected by a specific relationship.
|
314 |
+
|
315 |
+
Args:
|
316 |
+
source_type: Type of the source entity
|
317 |
+
relationship: Type of relationship
|
318 |
+
target_type: Type of the target entity
|
319 |
+
|
320 |
+
Returns:
|
321 |
+
A list of matching relationship dictionaries
|
322 |
+
"""
|
323 |
+
results = []
|
324 |
+
|
325 |
+
# Get all instances of the source type
|
326 |
+
source_instances = self.get_instances_of_class(source_type)
|
327 |
+
|
328 |
+
for source_id in source_instances:
|
329 |
+
# Get relationships of the specified type
|
330 |
+
relationships = self.get_relationships(source_id, relationship)
|
331 |
+
|
332 |
+
for rel in relationships:
|
333 |
+
if rel["direction"] == "outgoing" and "target" in rel:
|
334 |
+
target_id = rel["target"]
|
335 |
+
target_data = self.graph.nodes[target_id]
|
336 |
+
|
337 |
+
# Check if the target is of the right type
|
338 |
+
if (target_data.get("type") == "instance" and
|
339 |
+
target_data.get("class_type") == target_type):
|
340 |
+
results.append({
|
341 |
+
"source": source_id,
|
342 |
+
"source_properties": self.graph.nodes[source_id].get("properties", {}),
|
343 |
+
"relationship": relationship,
|
344 |
+
"target": target_id,
|
345 |
+
"target_properties": target_data.get("properties", {})
|
346 |
+
})
|
347 |
+
|
348 |
+
return results
|
349 |
+
|
350 |
+
def get_semantic_context(self, query: str) -> List[str]:
|
351 |
+
"""
|
352 |
+
Retrieve relevant semantic context from the ontology based on a query.
|
353 |
+
|
354 |
+
This method identifies entities and relationships mentioned in the query
|
355 |
+
and returns contextual information about them from the ontology.
|
356 |
+
|
357 |
+
Args:
|
358 |
+
query: The query string to analyze
|
359 |
+
|
360 |
+
Returns:
|
361 |
+
A list of text chunks providing relevant ontological context
|
362 |
+
"""
|
363 |
+
# This is a simple implementation - a more sophisticated one would use
|
364 |
+
# entity recognition and semantic parsing
|
365 |
+
|
366 |
+
query_lower = query.lower()
|
367 |
+
context_chunks = []
|
368 |
+
|
369 |
+
# Check for class mentions
|
370 |
+
for class_id in self.get_classes():
|
371 |
+
if class_id.lower() in query_lower:
|
372 |
+
# Add class information
|
373 |
+
class_data = self.ontology_data["classes"][class_id]
|
374 |
+
chunk = f"Class {class_id}: {class_data.get('description', '')}\n"
|
375 |
+
|
376 |
+
# Add subclass information
|
377 |
+
if "subClassOf" in class_data:
|
378 |
+
parent = class_data["subClassOf"]
|
379 |
+
chunk += f"{class_id} is a subclass of {parent}.\n"
|
380 |
+
|
381 |
+
# Add property information
|
382 |
+
if "properties" in class_data:
|
383 |
+
chunk += f"{class_id} has properties: {', '.join(class_data['properties'])}.\n"
|
384 |
+
|
385 |
+
context_chunks.append(chunk)
|
386 |
+
|
387 |
+
# Also add some instance examples
|
388 |
+
instances = self.get_instances_of_class(class_id, include_subclasses=False)[:3]
|
389 |
+
if instances:
|
390 |
+
instance_chunk = f"Examples of {class_id}:\n"
|
391 |
+
for inst_id in instances:
|
392 |
+
props = self.graph.nodes[inst_id].get("properties", {})
|
393 |
+
if "name" in props:
|
394 |
+
instance_chunk += f"- {inst_id} ({props['name']})\n"
|
395 |
+
else:
|
396 |
+
instance_chunk += f"- {inst_id}\n"
|
397 |
+
context_chunks.append(instance_chunk)
|
398 |
+
|
399 |
+
# Check for relationship mentions
|
400 |
+
for rel in self.ontology_data["relationships"]:
|
401 |
+
if rel["name"].lower() in query_lower:
|
402 |
+
chunk = f"Relationship {rel['name']}: {rel.get('description', '')}\n"
|
403 |
+
chunk += f"This relationship connects {rel['domain']} to {rel['range']}.\n"
|
404 |
+
|
405 |
+
# Add examples
|
406 |
+
examples = self.query_by_relationship(rel['domain'], rel['name'], rel['range'])[:3]
|
407 |
+
if examples:
|
408 |
+
chunk += "Examples:\n"
|
409 |
+
for ex in examples:
|
410 |
+
source_props = ex["source_properties"]
|
411 |
+
target_props = ex["target_properties"]
|
412 |
+
|
413 |
+
source_name = source_props.get("name", ex["source"])
|
414 |
+
target_name = target_props.get("name", ex["target"])
|
415 |
+
|
416 |
+
chunk += f"- {source_name} {rel['name']} {target_name}\n"
|
417 |
+
|
418 |
+
context_chunks.append(chunk)
|
419 |
+
|
420 |
+
# If we found nothing specific, add general ontology info
|
421 |
+
if not context_chunks:
|
422 |
+
# Add information about top-level classes
|
423 |
+
top_classes = [c for c, data in self.ontology_data["classes"].items()
|
424 |
+
if "subClassOf" not in data or data["subClassOf"] == "Entity"]
|
425 |
+
|
426 |
+
if top_classes:
|
427 |
+
chunk = "Main classes in the ontology:\n"
|
428 |
+
for cls in top_classes:
|
429 |
+
desc = self.ontology_data["classes"][cls].get("description", "")
|
430 |
+
chunk += f"- {cls}: {desc}\n"
|
431 |
+
context_chunks.append(chunk)
|
432 |
+
|
433 |
+
# Add information about key relationships
|
434 |
+
if self.ontology_data["relationships"]:
|
435 |
+
chunk = "Key relationships in the ontology:\n"
|
436 |
+
for rel in self.ontology_data["relationships"][:5]: # Top 5 relationships
|
437 |
+
chunk += f"- {rel['name']}: {rel.get('description', '')}\n"
|
438 |
+
context_chunks.append(chunk)
|
439 |
+
|
440 |
+
return context_chunks
|
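The recursive subclass walk in `_get_all_subclasses` above can be sketched independently of networkx as a plain traversal over `(subclass, superclass)` pairs. This is a minimal stand-alone version; the `edges` data below is a hypothetical toy hierarchy, not the ontology shipped with this Space:

```python
from typing import List

# Hypothetical subClassOf edges: (subclass, superclass)
edges = [
    ("Manager", "Employee"),
    ("Executive", "Manager"),
    ("Employee", "Person"),
]

def get_all_subclasses(class_name: str) -> List[str]:
    """Recursively collect every direct and indirect subclass,
    mirroring the depth-first order used by _get_all_subclasses."""
    direct = [src for src, dst in edges if dst == class_name]
    subclasses = []
    for sub in direct:
        subclasses.append(sub)
        subclasses.extend(get_all_subclasses(sub))
    return subclasses

print(get_all_subclasses("Person"))  # ['Employee', 'Manager', 'Executive']
```

Note the pre-order result: each direct subclass is appended before its own descendants, which is why `get_instances_of_class(..., include_subclasses=True)` sees the full transitive closure.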
src/semantic_retriever.py
ADDED
@@ -0,0 +1,233 @@
# src/semantic_retriever.py

from typing import List, Dict, Any, Tuple, Optional
import numpy as np
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.schema import Document
from src.ontology_manager import OntologyManager

class SemanticRetriever:
    """
    Enhanced retrieval system that combines vector search with ontology awareness.
    """

    def __init__(
        self,
        ontology_manager: OntologyManager,
        embeddings_model=None,
        text_chunks: Optional[List[str]] = None
    ):
        """
        Initialize the semantic retriever.

        Args:
            ontology_manager: The ontology manager instance
            embeddings_model: The embeddings model to use (defaults to OpenAIEmbeddings)
            text_chunks: Optional list of text chunks to add to the vector store
        """
        self.ontology_manager = ontology_manager
        self.embeddings = embeddings_model or OpenAIEmbeddings()

        # Create a vector store with the text representation of the ontology
        ontology_text = ontology_manager.get_text_representation()
        self.ontology_chunks = self._split_text(ontology_text)

        # Add additional text chunks if provided
        if text_chunks:
            self.text_chunks = text_chunks
            all_chunks = self.ontology_chunks + text_chunks
        else:
            self.text_chunks = []
            all_chunks = self.ontology_chunks

        # Convert to Document objects for FAISS
        documents = [Document(page_content=chunk,
                              metadata={"source": "ontology" if i < len(self.ontology_chunks) else "text"})
                     for i, chunk in enumerate(all_chunks)]

        # Create the vector store
        self.vector_store = FAISS.from_documents(documents, self.embeddings)

    def _split_text(self, text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
        """Split text into chunks for embedding."""
        chunks = []
        text_length = len(text)

        for i in range(0, text_length, chunk_size - overlap):
            chunk = text[i:i + chunk_size]
            if len(chunk) < 50:  # Skip very small chunks
                continue
            chunks.append(chunk)

        return chunks

    def retrieve(self, query: str, k: int = 4, include_ontology_context: bool = True) -> List[Document]:
        """
        Retrieve relevant documents using a hybrid approach.

        Args:
            query: The query string
            k: Number of documents to retrieve
            include_ontology_context: Whether to include additional ontology context

        Returns:
            A list of retrieved documents
        """
        # Get semantic context from the ontology
        if include_ontology_context:
            ontology_context = self.ontology_manager.get_semantic_context(query)
        else:
            ontology_context = []

        # Perform vector similarity search
        vector_results = self.vector_store.similarity_search(query, k=k)

        # Combine results
        combined_results = vector_results

        # Add ontology context as additional documents
        for i, context in enumerate(ontology_context):
            combined_results.append(Document(
                page_content=context,
                metadata={"source": "ontology_context", "context_id": i}
            ))

        return combined_results

    def retrieve_with_paths(self, query: str, k: int = 4) -> Dict[str, Any]:
        """
        Enhanced retrieval that includes semantic paths between entities.

        Args:
            query: The query string
            k: Number of documents to retrieve

        Returns:
            A dictionary containing retrieved documents and semantic paths
        """
        # Basic retrieval
        basic_results = self.retrieve(query, k)

        # Extract potential entities from the query (simplified approach)
        # A more sophisticated approach would use NER or entity linking
        entity_types = ["Product", "Department", "Employee", "Manager", "Customer", "Feedback"]
        query_words = query.lower().split()

        potential_entities = []
        for entity_type in entity_types:
            if entity_type.lower() in query_words:
                # Get instances of this type
                instances = self.ontology_manager.get_instances_of_class(entity_type)
                if instances:
                    # Just take the first few for demonstration
                    potential_entities.extend(instances[:2])

        # Find paths between potential entities
        paths = []
        if len(potential_entities) >= 2:
            for i in range(len(potential_entities)):
                for j in range(i + 1, len(potential_entities)):
                    source = potential_entities[i]
                    target = potential_entities[j]

                    # Find paths between these entities
                    entity_paths = self.ontology_manager.find_paths(source, target, max_length=3)

                    if entity_paths:
                        for path in entity_paths:
                            # Convert path to text
                            path_text = self._path_to_text(path)
                            paths.append({
                                "source": source,
                                "target": target,
                                "path": path,
                                "text": path_text
                            })

        # Convert paths to documents
        path_documents = []
        for i, path_info in enumerate(paths):
            path_documents.append(Document(
                page_content=path_info["text"],
                metadata={
                    "source": "semantic_path",
                    "path_id": i,
                    "source_entity": path_info["source"],
                    "target_entity": path_info["target"]
                }
            ))

        return {
            "documents": basic_results + path_documents,
            "paths": paths
        }

    def _path_to_text(self, path: List[Dict]) -> str:
        """Convert a path to a text description."""
        if not path:
            return ""

        text_parts = []
        for edge in path:
            source = edge["source"]
            target = edge["target"]
            relation = edge["type"]

            # Get entity information
            source_info = self.ontology_manager.get_entity_info(source)
            target_info = self.ontology_manager.get_entity_info(target)

            # Get names if available
            source_name = source
            if "properties" in source_info and "name" in source_info["properties"]:
                source_name = source_info["properties"]["name"]

            target_name = target
            if "properties" in target_info and "name" in target_info["properties"]:
                target_name = target_info["properties"]["name"]

            # Describe the relationship
            text_parts.append(f"{source_name} {relation} {target_name}")

        return " -> ".join(text_parts)

    def search_by_property(self, class_type: str, property_name: str, property_value: str) -> List[Document]:
        """
        Search for instances of a class with a specific property value.

        Args:
            class_type: The class to search in
            property_name: The property name to match
            property_value: The property value to match

        Returns:
            A list of matched entities as documents
        """
        instances = self.ontology_manager.get_instances_of_class(class_type)

        results = []
        for instance_id in instances:
            entity_info = self.ontology_manager.get_entity_info(instance_id)
            if "properties" in entity_info:
                properties = entity_info["properties"]
                if property_name in properties:
                    # Simple string matching (could be enhanced with fuzzy matching)
                    if str(properties[property_name]).lower() == property_value.lower():
                        # Convert to document
                        doc_content = f"Instance: {instance_id}\n"
                        doc_content += f"Type: {class_type}\n"
                        doc_content += "Properties:\n"

                        for prop_name, prop_value in properties.items():
                            doc_content += f"- {prop_name}: {prop_value}\n"

                        results.append(Document(
                            page_content=doc_content,
                            metadata={
                                "source": "property_search",
                                "instance_id": instance_id,
                                "class_type": class_type
                            }
                        ))

        return results
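`_split_text` above is a fixed-width sliding window: it steps through the text in strides of `chunk_size - overlap`, so consecutive chunks share `overlap` characters. A stand-alone sketch of the same arithmetic, using the class defaults (chunk_size=500, overlap=50, minimum chunk 50 characters):

```python
from typing import List

def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Slide a chunk_size window over the text with a stride of
    (chunk_size - overlap); drop trailing fragments under 50 chars."""
    chunks = []
    for i in range(0, len(text), chunk_size - overlap):
        chunk = text[i:i + chunk_size]
        if len(chunk) < 50:  # skip very small chunks
            continue
        chunks.append(chunk)
    return chunks

chunks = split_text("x" * 1000)
print(len(chunks), [len(c) for c in chunks])  # 3 [500, 500, 100]
```

With a 1000-character input the stride is 450, so windows start at 0, 450, and 900; the last window yields a 100-character chunk, which survives the 50-character minimum.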
src/visualization.py
ADDED
@@ -0,0 +1,1564 @@
1 |
+
# src/visualization.py
|
2 |
+
|
3 |
+
import streamlit as st
|
4 |
+
import json
|
5 |
+
import networkx as nx
|
6 |
+
import pandas as pd
|
7 |
+
from typing import Dict, List, Any, Optional, Set, Tuple
|
8 |
+
import plotly.graph_objects as go
|
9 |
+
import plotly.express as px
|
10 |
+
import matplotlib.pyplot as plt
|
11 |
+
import matplotlib.colors as mcolors
|
12 |
+
from collections import defaultdict
|
13 |
+
import math
|
14 |
+
|
15 |
+
def render_html_in_streamlit(html_content: str):
|
16 |
+
"""Display HTML content in Streamlit using an iframe."""
|
17 |
+
import base64
|
18 |
+
|
19 |
+
# Encode the HTML content
|
20 |
+
encoded_html = base64.b64encode(html_content.encode()).decode()
|
21 |
+
|
22 |
+
# Create an iframe with the data URL
|
23 |
+
iframe_html = f"""
|
24 |
+
<iframe
|
25 |
+
srcdoc="{encoded_html}"
|
26 |
+
width="100%"
|
27 |
+
height="600px"
|
28 |
+
frameborder="0"
|
29 |
+
allowfullscreen>
|
30 |
+
</iframe>
|
31 |
+
"""
|
32 |
+
|
33 |
+
# Display the iframe
|
34 |
+
st.markdown(iframe_html, unsafe_allow_html=True)
|
35 |
+
|
36 |
+
|
37 |
+
def display_ontology_stats(ontology_manager):
|
38 |
+
"""Display statistics and visualizations about the ontology."""
|
39 |
+
st.subheader("📊 Ontology Structure and Statistics")
|
40 |
+
|
41 |
+
# Get basic stats
|
42 |
+
classes = ontology_manager.get_classes()
|
43 |
+
class_hierarchy = ontology_manager.get_class_hierarchy()
|
44 |
+
|
45 |
+
# Count instances per class
|
46 |
+
class_counts = []
|
47 |
+
for class_name in classes:
|
48 |
+
instance_count = len(ontology_manager.get_instances_of_class(class_name, include_subclasses=False))
|
49 |
+
class_counts.append({
|
50 |
+
"Class": class_name,
|
51 |
+
"Instances": instance_count
|
52 |
+
})
|
53 |
+
|
54 |
+
# Display summary metrics
|
55 |
+
col1, col2, col3 = st.columns(3)
|
56 |
+
|
57 |
+
with col1:
|
58 |
+
st.metric("Total Classes", len(classes))
|
59 |
+
|
60 |
+
# Count total instances
|
61 |
+
total_instances = sum(item["Instances"] for item in class_counts)
|
62 |
+
with col2:
|
63 |
+
st.metric("Total Instances", total_instances)
|
64 |
+
|
65 |
+
# Count relationships
|
66 |
+
relationship_count = len(ontology_manager.ontology_data.get("relationships", []))
|
67 |
+
with col3:
|
68 |
+
st.metric("Relationship Types", relationship_count)
|
69 |
+
|
70 |
+
# Visualize class hierarchy
|
71 |
+
st.markdown("### Class Hierarchy")
|
72 |
+
|
73 |
+
# Create tabs for different views
|
74 |
+
tab1, tab2, tab3 = st.tabs(["Tree View", "Class Statistics", "Hierarchy Graph"])
|
75 |
+
|
76 |
+
with tab1:
|
77 |
+
# Create a collapsible tree view of class hierarchy
|
78 |
+
display_class_hierarchy_tree(ontology_manager, class_hierarchy)
|
79 |
+
|
80 |
+
with tab2:
|
81 |
+
# Display class stats and distribution
|
82 |
+
if class_counts:
|
83 |
+
# Filter to only show classes with instances
|
84 |
+
non_empty_classes = [item for item in class_counts if item["Instances"] > 0]
|
85 |
+
|
86 |
+
if non_empty_classes:
|
87 |
+
df = pd.DataFrame(non_empty_classes)
|
88 |
+
df = df.sort_values("Instances", ascending=False)
|
89 |
+
|
90 |
+
# Create horizontal bar chart
|
91 |
+
fig = px.bar(df,
|
92 |
+
x="Instances",
|
93 |
+
y="Class",
|
94 |
+
orientation='h',
|
95 |
+
title="Instances per Class",
|
96 |
+
color="Instances",
|
97 |
+
color_continuous_scale="viridis")
|
98 |
+
|
99 |
+
fig.update_layout(yaxis={'categoryorder':'total ascending'})
|
100 |
+
st.plotly_chart(fig, use_container_width=True)
|
101 |
+
else:
|
102 |
+
st.info("No classes with instances found.")
|
103 |
+
|
104 |
+
# Show distribution of classes by inheritance depth
|
105 |
+
display_class_depth_distribution(ontology_manager)
|
106 |
+
|
107 |
+
with tab3:
|
108 |
+
# Display class hierarchy as a graph
|
109 |
+
display_class_hierarchy_graph(ontology_manager)
|
110 |
+
|
111 |
+
# Relationship statistics
|
112 |
+
st.markdown("### Relationship Analysis")
|
113 |
+
|
114 |
+
# Get relationship usage statistics
|
115 |
+
relationship_usage = analyze_relationship_usage(ontology_manager)
|
116 |
+
|
117 |
+
# Display relationship usage in a table and chart
|
118 |
+
if relationship_usage:
|
119 |
+
tab1, tab2 = st.tabs(["Usage Statistics", "Domain/Range Distribution"])
|
120 |
+
|
121 |
+
with tab1:
|
122 |
+
# Create DataFrame for the table
|
123 |
+
df = pd.DataFrame(relationship_usage)
|
124 |
+
df = df.sort_values("Usage Count", ascending=False)
|
125 |
+
|
126 |
+
# Show table
|
127 |
+
st.dataframe(df)
|
128 |
+
|
129 |
+
# Create bar chart for relationship usage
|
130 |
+
fig = px.bar(df,
|
131 |
+
x="Relationship",
|
132 |
+
y="Usage Count",
|
133 |
+
title="Relationship Usage Frequency",
|
134 |
+
color="Usage Count",
|
135 |
+
color_continuous_scale="blues")
|
136 |
+
|
137 |
+
st.plotly_chart(fig, use_container_width=True)
|
138 |
+
|
139 |
+
with tab2:
|
140 |
+
# Display domain-range distribution
|
141 |
+
display_domain_range_distribution(ontology_manager)
|
142 |
+
|
143 |
+
|
144 |
+
def display_class_hierarchy_tree(ontology_manager, class_hierarchy):
    """Display class hierarchy as an interactive tree."""
    # Find root classes (those that aren't subclasses of anything else)
    all_subclasses = set()
    for subclasses in class_hierarchy.values():
        all_subclasses.update(subclasses)

    root_classes = [cls for cls in ontology_manager.get_classes() if cls not in all_subclasses]

    # Create a recursive function to display the hierarchy
    def display_subclasses(class_name, indent=0):
        # Get class info
        class_info = ontology_manager.ontology_data["classes"].get(class_name, {})
        description = class_info.get("description", "")
        instance_count = len(ontology_manager.get_instances_of_class(class_name, include_subclasses=False))

        # Display class with expander for subclasses
        if indent == 0:
            # Root level classes are always expanded
            with st.expander(f"📁 {class_name} ({instance_count} instances)", expanded=True):
                st.markdown(f"**Description:** {description}")

                # Show properties if any
                properties = class_info.get("properties", [])
                if properties:
                    st.markdown("**Properties:**")
                    st.markdown(", ".join(properties))

                # Display subclasses
                subclasses = class_hierarchy.get(class_name, [])
                if subclasses:
                    st.markdown("**Subclasses:**")
                    for subclass in sorted(subclasses):
                        display_subclasses(subclass, indent + 1)
                else:
                    st.markdown("*No subclasses*")
        else:
            # Nested classes use indentation and only show direct instances
            if instance_count > 0:
                class_label = f"📁 {class_name} ({instance_count} instances)"
            else:
                class_label = f"📁 {class_name}"

            with st.expander(class_label, expanded=False):
                st.markdown(f"**Description:** {description}")

                # Show properties if any
                properties = class_info.get("properties", [])
                if properties:
                    st.markdown("**Properties:**")
                    st.markdown(", ".join(properties))

                # Display subclasses
                subclasses = class_hierarchy.get(class_name, [])
                if subclasses:
                    st.markdown("**Subclasses:**")
                    for subclass in sorted(subclasses):
                        display_subclasses(subclass, indent + 1)
                else:
                    st.markdown("*No subclasses*")

    # Display each root class
    for root_class in sorted(root_classes):
        display_subclasses(root_class)


def get_class_depths(ontology_manager) -> Dict[str, int]:
    """Calculate the inheritance depth of each class."""
    depths = {}
    class_data = ontology_manager.ontology_data["classes"]

    def get_depth(class_name):
        # If we've already calculated the depth, return it
        if class_name in depths:
            return depths[class_name]

        # Get the class data
        cls = class_data.get(class_name, {})

        # If no parent, depth is 0
        if "subClassOf" not in cls:
            depths[class_name] = 0
            return 0

        # Otherwise, depth is 1 + parent's depth
        parent = cls["subClassOf"]
        parent_depth = get_depth(parent)
        depths[class_name] = parent_depth + 1
        return depths[class_name]

    # Calculate depths for all classes
    for class_name in class_data:
        get_depth(class_name)

    return depths


def display_class_depth_distribution(ontology_manager):
    """Display distribution of classes by inheritance depth."""
    depths = get_class_depths(ontology_manager)

    # Count classes at each depth
    depth_counts = defaultdict(int)
    for _, depth in depths.items():
        depth_counts[depth] += 1

    # Create dataframe
    df = pd.DataFrame([
        {"Depth": depth, "Count": count}
        for depth, count in depth_counts.items()
    ])

    if not df.empty:
        df = df.sort_values("Depth")

        # Create bar chart
        fig = px.bar(df,
                     x="Depth",
                     y="Count",
                     title="Class Distribution by Inheritance Depth",
                     labels={"Depth": "Inheritance Depth", "Count": "Number of Classes"},
                     color="Count",
                     text="Count")

        fig.update_traces(texttemplate='%{text}', textposition='outside')
        fig.update_layout(uniformtext_minsize=8, uniformtext_mode='hide')

        st.plotly_chart(fig, use_container_width=True)


def display_class_hierarchy_graph(ontology_manager):
    """Display class hierarchy as a directed graph."""
    # Create a directed graph
    G = nx.DiGraph()

    # Add nodes for each class
    for class_name, class_info in ontology_manager.ontology_data["classes"].items():
        # Count direct instances
        instance_count = len(ontology_manager.get_instances_of_class(class_name, include_subclasses=False))

        # Add node with attributes
        G.add_node(class_name,
                   type="class",
                   description=class_info.get("description", ""),
                   instance_count=instance_count)

        # Add edge for subclass relationship
        if "subClassOf" in class_info:
            parent = class_info["subClassOf"]
            G.add_edge(parent, class_name, relationship="subClassOf")

    # Create a Plotly graph visualization
    # Calculate node positions using a hierarchical layout
    pos = nx.nx_agraph.graphviz_layout(G, prog="dot")

    # Convert positions to lists for Plotly
    node_x = []
    node_y = []
    node_text = []
    node_size = []
    node_color = []

    for node in G.nodes():
        x, y = pos[node]
        node_x.append(x)
        node_y.append(y)

        # Get node info for hover text
        description = G.nodes[node].get("description", "")
        instance_count = G.nodes[node].get("instance_count", 0)

        # Prepare hover text
        hover_text = f"Class: {node}<br>Description: {description}<br>Instances: {instance_count}"
        node_text.append(hover_text)

        # Size nodes by instance count (with a minimum size)
        size = 10 + (instance_count * 2)
        size = min(40, max(15, size))  # Limit size range
        node_size.append(size)

        # Color nodes by depth
        depth = get_class_depths(ontology_manager).get(node, 0)
        # Use a color scale from light to dark blue
        node_color.append(depth)

    # Create edge traces
    edge_x = []
    edge_y = []

    for edge in G.edges():
        x0, y0 = pos[edge[0]]
        x1, y1 = pos[edge[1]]

        # Add a curved line with multiple points
        edge_x.append(x0)
        edge_x.append(x1)
        edge_x.append(None)  # Add None to create a break between edges

        edge_y.append(y0)
        edge_y.append(y1)
        edge_y.append(None)

    # Create node trace
    node_trace = go.Scatter(
        x=node_x, y=node_y,
        mode='markers+text',
        text=[node for node in G.nodes()],
        textposition="bottom center",
        hoverinfo='text',
        hovertext=node_text,
        marker=dict(
            showscale=True,
            colorscale='Blues',
            color=node_color,
            size=node_size,
            line=dict(width=2, color='DarkSlateGrey'),
            colorbar=dict(
                title="Depth",
                thickness=15,
                tickvals=[0, max(node_color)],
                ticktext=["Root", f"Depth {max(node_color)}"]
            )
        )
    )

    # Create edge trace
    edge_trace = go.Scatter(
        x=edge_x, y=edge_y,
        line=dict(width=1, color='#888'),
        hoverinfo='none',
        mode='lines'
    )

    # Create figure
    fig = go.Figure(data=[edge_trace, node_trace],
                    layout=go.Layout(
                        showlegend=False,
                        hovermode='closest',
                        margin=dict(b=20, l=5, r=5, t=40),
                        xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
                        yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
                        title="Class Hierarchy Graph",
                        title_x=0.5
                    ))

    # Display the figure
    st.plotly_chart(fig, use_container_width=True)


def analyze_relationship_usage(ontology_manager) -> List[Dict]:
    """Analyze how relationships are used in the ontology."""
    relationship_data = ontology_manager.ontology_data.get("relationships", [])
    instances = ontology_manager.ontology_data.get("instances", [])

    # Initialize counters
    usage_counts = defaultdict(int)

    # Count relationship usage in instances
    for instance in instances:
        for rel in instance.get("relationships", []):
            usage_counts[rel["type"]] += 1

    # Prepare results
    results = []
    for rel in relationship_data:
        rel_name = rel["name"]
        domain = rel["domain"]
        range_class = rel["range"]
        cardinality = rel.get("cardinality", "many-to-many")
        count = usage_counts.get(rel_name, 0)

        results.append({
            "Relationship": rel_name,
            "Domain": domain,
            "Range": range_class,
            "Cardinality": cardinality,
            "Usage Count": count
        })

    return results


def display_domain_range_distribution(ontology_manager):
    """Display domain and range distribution for relationships."""
    relationship_data = ontology_manager.ontology_data.get("relationships", [])

    # Count domains and ranges
    domain_counts = defaultdict(int)
    range_counts = defaultdict(int)

    for rel in relationship_data:
        domain_counts[rel["domain"]] += 1
        range_counts[rel["range"]] += 1

    # Create DataFrames
    domain_df = pd.DataFrame([
        {"Class": cls, "Count": count, "Type": "Domain"}
        for cls, count in domain_counts.items()
    ])

    range_df = pd.DataFrame([
        {"Class": cls, "Count": count, "Type": "Range"}
        for cls, count in range_counts.items()
    ])

    # Combine
    combined_df = pd.concat([domain_df, range_df])

    # Create plot
    if not combined_df.empty:
        fig = px.bar(combined_df,
                     x="Class",
                     y="Count",
                     color="Type",
                     barmode="group",
                     title="Classes as Domain vs Range in Relationships",
                     color_discrete_map={"Domain": "#1f77b4", "Range": "#ff7f0e"})

        fig.update_layout(xaxis={'categoryorder': 'total descending'})

        st.plotly_chart(fig, use_container_width=True)


def display_entity_details(entity_info: Dict[str, Any], ontology_manager):
    """Display detailed information about an entity."""
    if not entity_info:
        st.warning("Entity not found.")
        return

    st.subheader(f"📝 Entity: {entity_info['id']}")

    # Determine entity type and get class hierarchy
    entity_type = entity_info.get("type", "")
    class_type = entity_info.get("class", entity_info.get("class_type", ""))

    class_hierarchy = []
    if class_type:
        current_class = class_type
        while current_class:
            class_hierarchy.append(current_class)
            parent_class = ontology_manager.ontology_data["classes"].get(current_class, {}).get("subClassOf", "")
            if not parent_class or parent_class == current_class:  # Prevent infinite loops
                break
            current_class = parent_class

    # Display entity metadata
    col1, col2 = st.columns([1, 2])

    with col1:
        st.markdown("### Basic Information")

        # Basic info metrics
        st.metric("Entity Type", entity_type)

        if class_type:
            st.metric("Class", class_type)

        # Display class hierarchy
        if class_hierarchy and len(class_hierarchy) > 1:
            st.markdown("**Class Hierarchy:**")
            hierarchy_str = " → ".join(reversed(class_hierarchy))
            st.markdown(f"```\n{hierarchy_str}\n```")

    with col2:
        # Display class description if available
        if "class_description" in entity_info:
            st.markdown("### Description")
            st.markdown(entity_info.get("class_description", "No description available."))

    # Properties
    if "properties" in entity_info and entity_info["properties"]:
        st.markdown("### Properties")

        # Create a more structured property display
        properties = []
        for key, value in entity_info["properties"].items():
            # Handle different value types
            if isinstance(value, list):
                value_str = ", ".join(str(v) for v in value)
            else:
                value_str = str(value)

            properties.append({"Property": key, "Value": value_str})

        # Display as table with highlighting
        property_df = pd.DataFrame(properties)
        st.dataframe(
            property_df,
            column_config={
                "Property": st.column_config.TextColumn("Property", width="medium"),
                "Value": st.column_config.TextColumn("Value", width="large")
            },
            hide_index=True
        )

    # Relationships with visual enhancements
    if "relationships" in entity_info and entity_info["relationships"]:
        st.markdown("### Relationships")

        # Group relationships by direction
        outgoing = []
        incoming = []

        for rel in entity_info["relationships"]:
            if "direction" in rel and rel["direction"] == "outgoing":
                outgoing.append({
                    "Relationship": rel["type"],
                    "Direction": "→",
                    "Related Entity": rel["target"]
                })
            elif "direction" in rel and rel["direction"] == "incoming":
                incoming.append({
                    "Relationship": rel["type"],
                    "Direction": "←",
                    "Related Entity": rel["source"]
                })

        # Create tabs for outgoing and incoming
        if outgoing or incoming:
            tab1, tab2 = st.tabs(["Outgoing Relationships", "Incoming Relationships"])

            with tab1:
                if outgoing:
                    st.dataframe(
                        pd.DataFrame(outgoing),
                        column_config={
                            "Relationship": st.column_config.TextColumn("Relationship Type", width="medium"),
                            "Direction": st.column_config.TextColumn("Direction", width="small"),
                            "Related Entity": st.column_config.TextColumn("Target Entity", width="medium")
                        },
                        hide_index=True
                    )
                else:
                    st.info("No outgoing relationships.")

            with tab2:
                if incoming:
                    st.dataframe(
                        pd.DataFrame(incoming),
                        column_config={
                            "Relationship": st.column_config.TextColumn("Relationship Type", width="medium"),
                            "Direction": st.column_config.TextColumn("Direction", width="small"),
                            "Related Entity": st.column_config.TextColumn("Source Entity", width="medium")
                        },
                        hide_index=True
                    )
                else:
                    st.info("No incoming relationships.")

        # Visual relationship graph
        st.markdown("#### Relationship Graph")
        display_entity_relationship_graph(entity_info, ontology_manager)


def display_entity_relationship_graph(entity_info: Dict[str, Any], ontology_manager):
    """Display a graph of an entity's relationships."""
    entity_id = entity_info["id"]

    # Create graph
    G = nx.DiGraph()

    # Add central entity
    G.add_node(entity_id, type="central")

    # Add related entities and relationships
    for rel in entity_info.get("relationships", []):
        if "direction" in rel and rel["direction"] == "outgoing":
            target = rel["target"]
            rel_type = rel["type"]

            # Add target node if not exists
            if target not in G:
                target_info = ontology_manager.get_entity_info(target)
                node_type = target_info.get("type", "unknown")
                G.add_node(target, type=node_type)

            # Add edge
            G.add_edge(entity_id, target, type=rel_type)

        elif "direction" in rel and rel["direction"] == "incoming":
            source = rel["source"]
            rel_type = rel["type"]

            # Add source node if not exists
            if source not in G:
                source_info = ontology_manager.get_entity_info(source)
                node_type = source_info.get("type", "unknown")
                G.add_node(source, type=node_type)

            # Add edge
            G.add_edge(source, entity_id, type=rel_type)

    # Use a force-directed layout
    pos = nx.spring_layout(G, k=0.5, iterations=50)

    # Create Plotly figure
    fig = go.Figure()

    # Add edges with curved lines
    for source, target, data in G.edges(data=True):
        x0, y0 = pos[source]
        x1, y1 = pos[target]
        rel_type = data.get("type", "unknown")

        # Calculate edge midpoint for label
        mid_x = (x0 + x1) / 2
        mid_y = (y0 + y1) / 2

        # Draw edge
        fig.add_trace(go.Scatter(
            x=[x0, x1],
            y=[y0, y1],
            mode="lines",
            line=dict(width=1, color="#888"),
            hoverinfo="text",
            hovertext=f"Relationship: {rel_type}",
            showlegend=False
        ))

        # Add relationship label
        fig.add_trace(go.Scatter(
            x=[mid_x],
            y=[mid_y],
            mode="text",
            text=[rel_type],
            textposition="middle center",
            textfont=dict(size=10, color="#555"),
            hoverinfo="none",
            showlegend=False
        ))

    # Add nodes with different colors by type
    node_groups = defaultdict(list)

    for node, data in G.nodes(data=True):
        node_type = data.get("type", "unknown")
        node_info = ontology_manager.get_entity_info(node)

        # Get friendly name if available
        name = node
        if "properties" in node_info and "name" in node_info["properties"]:
            name = node_info["properties"]["name"]

        node_groups[node_type].append({
            "id": node,
            "name": name,
            "x": pos[node][0],
            "y": pos[node][1],
            "info": node_info
        })

    # Define colors for different node types
    colors = {
        "central": "#ff7f0e",  # Highlighted color for central entity
        "instance": "#1f77b4",
        "class": "#2ca02c",
        "unknown": "#d62728"
    }

    # Add each node group with appropriate styling
    for node_type, nodes in node_groups.items():
        # Default to unknown color if type not in map
        color = colors.get(node_type, colors["unknown"])

        x = [node["x"] for node in nodes]
        y = [node["y"] for node in nodes]
        text = [node["name"] for node in nodes]

        # Prepare hover text
        hover_text = []
        for node in nodes:
            info = node["info"]
            hover = f"ID: {node['id']}<br>Name: {node['name']}"

            if "class_type" in info:
                hover += f"<br>Type: {info['class_type']}"

            hover_text.append(hover)

        # Adjust size for central entity
        size = 20 if node_type == "central" else 15

        fig.add_trace(go.Scatter(
            x=x,
            y=y,
            mode="markers+text",
            marker=dict(
                size=size,
                color=color,
                line=dict(width=2, color="white")
            ),
            text=text,
            textposition="bottom center",
            hoverinfo="text",
            hovertext=hover_text,
            name=node_type.capitalize()
        ))

    # Update layout
    fig.update_layout(
        title=f"Relationships for {entity_id}",
        title_x=0.5,
        showlegend=True,
        hovermode="closest",
        margin=dict(b=20, l=5, r=5, t=40),
        xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
        yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
        height=500
    )

    st.plotly_chart(fig, use_container_width=True)


def display_graph_visualization(knowledge_graph, central_entity=None, max_distance=2):
    """Display an interactive visualization of the knowledge graph."""
    st.subheader("🕸️ Knowledge Graph Visualization")

    # Controls for the visualization
    with st.expander("Visualization Settings", expanded=True):
        col1, col2, col3 = st.columns(3)

        with col1:
            include_classes = st.checkbox("Include Classes", value=True)

        with col2:
            include_instances = st.checkbox("Include Instances", value=True)

        with col3:
            include_properties = st.checkbox("Include Properties", value=False)

        st.markdown("---")

        col1, col2 = st.columns(2)

        with col1:
            max_distance = st.slider("Max Relationship Distance", 1, 5, max_distance)

        with col2:
            layout_algorithm = st.selectbox(
                "Layout Algorithm",
                ["Force-Directed", "Hierarchical", "Radial", "Circular"],
                index=0
            )

    # Generate HTML visualization
    html = knowledge_graph.generate_html_visualization(
        include_classes=include_classes,
        include_instances=include_instances,
        central_entity=central_entity,
        max_distance=max_distance,
        include_properties=include_properties,
        layout_algorithm=layout_algorithm.lower()
    )

    # Render the HTML
    render_html_in_streamlit(html)

    # Entity filter
    with st.expander("Focus on Entity", expanded=central_entity is not None):
        # Get all entities
        entities = []
        for class_name in knowledge_graph.ontology_manager.get_classes():
            entities.extend(knowledge_graph.ontology_manager.get_instances_of_class(class_name))

        # Deduplicate
        entities = sorted(set(entities))

        # Select entity
        selected_entity = st.selectbox(
            "Select Entity to Focus On",
            ["None"] + entities,
            index=0 if central_entity is None else entities.index(central_entity) + 1
        )

        if selected_entity != "None":
            st.button("Focus Graph", on_click=lambda: st.experimental_rerun())

    # Display graph statistics
    stats = knowledge_graph.get_graph_statistics()
    if stats:
        st.markdown("### Graph Statistics")

        col1, col2, col3, col4 = st.columns(4)
        col1.metric("Nodes", stats.get("node_count", 0))
        col2.metric("Edges", stats.get("edge_count", 0))
        col3.metric("Classes", stats.get("class_count", 0))
        col4.metric("Instances", stats.get("instance_count", 0))

        # Display relationship counts
        if "relationship_counts" in stats:
            rel_counts = stats["relationship_counts"]
            rel_data = [{"Relationship": rel, "Count": count} for rel, count in rel_counts.items()
                        if rel not in ["subClassOf", "instanceOf"]]  # Filter out structural relationships

            if rel_data:
                df = pd.DataFrame(rel_data)
                fig = px.bar(df,
                             x="Relationship",
                             y="Count",
                             title="Relationship Distribution",
                             color="Count",
                             color_continuous_scale="viridis")

                st.plotly_chart(fig, use_container_width=True)


def visualize_path(path_info, ontology_manager):
    """Visualize a semantic path between entities with enhanced graphics and details."""
    if not path_info or "path" not in path_info:
        st.warning("No path information available.")
        return

    st.subheader("🔄 Semantic Path Visualization")

    path = path_info["path"]

    # Get entity information for each node in the path
    entities = {}
    all_nodes = set()

    # Add source and target
    if "source" in path_info:
        source_id = path_info["source"]
        all_nodes.add(source_id)
        entities[source_id] = ontology_manager.get_entity_info(source_id)

    if "target" in path_info:
        target_id = path_info["target"]
        all_nodes.add(target_id)
        entities[target_id] = ontology_manager.get_entity_info(target_id)

    # Add all entities in the path
    for edge in path:
        source_id = edge["source"]
        target_id = edge["target"]
        all_nodes.add(source_id)
        all_nodes.add(target_id)

        if source_id not in entities:
            entities[source_id] = ontology_manager.get_entity_info(source_id)

        if target_id not in entities:
            entities[target_id] = ontology_manager.get_entity_info(target_id)

    # Create tabs for different views
    tab1, tab2, tab3 = st.tabs(["Path Visualization", "Entity Details", "Path Summary"])

    with tab1:
        # Display path as a sequence diagram
        display_path_visualization(path, entities)

    with tab2:
        # Display details of entities in the path
        st.markdown("### Entities in Path")

        # Group entities by type
        entities_by_type = defaultdict(list)
        for entity_id in all_nodes:
            entity_info = entities.get(entity_id, {})
            entity_type = entity_info.get("class_type", entity_info.get("class", "Unknown"))
            entities_by_type[entity_type].append((entity_id, entity_info))

        # Create an expander for each entity type
        for entity_type, entity_list in entities_by_type.items():
            with st.expander(f"{entity_type} ({len(entity_list)})", expanded=True):
                for entity_id, entity_info in entity_list:
                    st.markdown(f"**{entity_id}**")

                    # Display properties if available
                    if "properties" in entity_info and entity_info["properties"]:
                        props_markdown = ", ".join([f"**{k}**: {v}" for k, v in entity_info["properties"].items()])
                        st.markdown(props_markdown)

                    st.markdown("---")

    with tab3:
        # Display textual summary of the path
        st.markdown("### Path Description")

        # If path_info has text, use it
        if "text" in path_info and path_info["text"]:
            st.markdown(f"**Path:** {path_info['text']}")
        else:
            # Otherwise, generate a description
            path_steps = []
            for edge in path:
                source_id = edge["source"]
                target_id = edge["target"]
                relation = edge["type"]

                # Get readable names if available
                source_name = source_id
                target_name = target_id

                if source_id in entities and "properties" in entities[source_id]:
                    props = entities[source_id]["properties"]
                    if "name" in props:
                        source_name = props["name"]

                if target_id in entities and "properties" in entities[target_id]:
                    props = entities[target_id]["properties"]
                    if "name" in props:
                        target_name = props["name"]

                path_steps.append(f"{source_name} **{relation}** {target_name}")

            st.markdown(" → ".join(path_steps))

        # Display relevant business rules
        relevant_rules = find_relevant_rules_for_path(path, ontology_manager)
        if relevant_rules:
            st.markdown("### Relevant Business Rules")
            for rule in relevant_rules:
                st.markdown(f"- **{rule['id']}**: {rule['description']}")


def display_path_visualization(path, entities):
    """Create an enhanced visual representation of the path."""
    if not path:
        st.info("Path is empty.")
        return

    # Create nodes and positions
    nodes = []
    x_positions = {}

    # Collect all unique nodes in the path
    unique_nodes = set()
    for edge in path:
        unique_nodes.add(edge["source"])
        unique_nodes.add(edge["target"])

    # Create ordered list of nodes
    path_nodes = []
    if path:
        # Start with the first source
        current_node = path[0]["source"]
        path_nodes.append(current_node)

        # Follow the path
        for edge in path:
            target = edge["target"]
            path_nodes.append(target)
            current_node = target
    else:
        # If no path, just use the unique nodes
        path_nodes = list(unique_nodes)

    # Assign positions along a line
    for i, node_id in enumerate(path_nodes):
        x_positions[node_id] = i

        # Get node info
        entity_info = entities.get(node_id, {})
        properties = entity_info.get("properties", {})
        entity_type = entity_info.get("class_type", entity_info.get("class", "Unknown"))

        # Get display name
        name = properties.get("name", node_id)

        nodes.append({
            "id": node_id,
            "name": name,
            "type": entity_type,
            "properties": properties
        })

    # Create Plotly figure for horizontal path
    fig = go.Figure()

    # Add nodes
    node_x = []
    node_y = []
    node_text = []
    node_hover = []
    node_colors = []

    # Color mapping for entity types
    color_map = {}
    for node in nodes:
        node_type = node["type"]
        if node_type not in color_map:
            # Assign colors from a categorical colorscale
            idx = len(color_map) % len(px.colors.qualitative.Plotly)
            color_map[node_type] = px.colors.qualitative.Plotly[idx]

    for node in nodes:
        node_x.append(x_positions[node["id"]])
        node_y.append(0)  # All nodes at y=0 for a horizontal path
        node_text.append(node["name"])

        # Create detailed hover text
        hover = f"{node['id']}<br>{node['type']}"
        for k, v in node["properties"].items():
            hover += f"<br>{k}: {v}"
        node_hover.append(hover)

        # Set node color by type
        node_colors.append(color_map.get(node["type"], "#7f7f7f"))

    # Add node trace
    fig.add_trace(go.Scatter(
        x=node_x,
        y=node_y,
        mode="markers+text",
        marker=dict(
            size=30,
            color=node_colors,
            line=dict(width=2, color="DarkSlateGrey")
        ),
        text=node_text,
        textposition="bottom center",
        hovertext=node_hover,
        hoverinfo="text",
        name="Entities"
    ))

    # Add edges with relationship labels
    for edge in path:
        source = edge["source"]
        target = edge["target"]
        edge_type = edge["type"]

        source_pos = x_positions[source]
        target_pos = x_positions[target]

        # Add edge line
        fig.add_trace(go.Scatter(
            x=[source_pos, target_pos],
            y=[0, 0],
            mode="lines",
            line=dict(width=2, color="#888"),
            hoverinfo="none",
            showlegend=False
        ))

        # Add relationship label above the line
        fig.add_trace(go.Scatter(
            x=[(source_pos + target_pos) / 2],
            y=[0.1],  # Slightly above the line
            mode="text",
            text=[edge_type],
            textposition="top center",
            hoverinfo="none",
            showlegend=False
        ))

    # Update layout
    fig.update_layout(
        title="Path Visualization",
        showlegend=False,
        hovermode="closest",
        margin=dict(b=40, l=20, r=20, t=40),
        xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
|
1097 |
+
yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
|
1098 |
+
height=300,
|
1099 |
+
plot_bgcolor="white"
|
1100 |
+
)
|
1101 |
+
|
1102 |
+
# Add a legend for entity types
|
1103 |
+
for entity_type, color in color_map.items():
|
1104 |
+
fig.add_trace(go.Scatter(
|
1105 |
+
x=[None],
|
1106 |
+
y=[None],
|
1107 |
+
mode="markers",
|
1108 |
+
marker=dict(size=10, color=color),
|
1109 |
+
name=entity_type,
|
1110 |
+
showlegend=True
|
1111 |
+
))
|
1112 |
+
|
1113 |
+
fig.update_layout(legend=dict(
|
1114 |
+
orientation="h",
|
1115 |
+
yanchor="bottom",
|
1116 |
+
y=-0.3,
|
1117 |
+
xanchor="center",
|
1118 |
+
x=0.5
|
1119 |
+
))
|
1120 |
+
|
1121 |
+
st.plotly_chart(fig, use_container_width=True)
|
1122 |
+
|
1123 |
+
# Add step-by-step description
|
1124 |
+
st.markdown("### Step-by-Step Path")
|
1125 |
+
for i, edge in enumerate(path):
|
1126 |
+
source = edge["source"]
|
1127 |
+
target = edge["target"]
|
1128 |
+
relation = edge["type"]
|
1129 |
+
|
1130 |
+
# Get display names
|
1131 |
+
source_info = entities.get(source, {})
|
1132 |
+
target_info = entities.get(target, {})
|
1133 |
+
|
1134 |
+
source_name = source
|
1135 |
+
if "properties" in source_info and "name" in source_info["properties"]:
|
1136 |
+
source_name = source_info["properties"]["name"]
|
1137 |
+
|
1138 |
+
target_name = target
|
1139 |
+
if "properties" in target_info and "name" in target_info["properties"]:
|
1140 |
+
target_name = target_info["properties"]["name"]
|
1141 |
+
|
1142 |
+
st.markdown(f"**Step {i+1}:** {source_name} ({source}) **{relation}** {target_name} ({target})")
|
1143 |
+
|
1144 |
+
|
1145 |
+
def find_relevant_rules_for_path(path, ontology_manager):
    """Find business rules relevant to the entities and relationships in a path."""
    rules = ontology_manager.ontology_data.get("rules", [])
    if not rules:
        return []

    # Extract entity and relationship types from the path
    entity_types = set()
    relationship_types = set()

    for edge in path:
        source = edge["source"]
        target = edge["target"]
        relation = edge["type"]

        # Get entity info
        source_info = ontology_manager.get_entity_info(source)
        target_info = ontology_manager.get_entity_info(target)

        # Add entity types
        if "class_type" in source_info:
            entity_types.add(source_info["class_type"])

        if "class_type" in target_info:
            entity_types.add(target_info["class_type"])

        # Add relationship type
        relationship_types.add(relation)

    # Find rules that mention these entities or relationships
    relevant_rules = []

    for rule in rules:
        rule_text = json.dumps(rule).lower()

        # Check if the rule mentions any of the entity types or relationships
        is_relevant = False

        for entity_type in entity_types:
            if entity_type.lower() in rule_text:
                is_relevant = True
                break

        if not is_relevant:
            for rel_type in relationship_types:
                if rel_type.lower() in rule_text:
                    is_relevant = True
                    break

        if is_relevant:
            relevant_rules.append(rule)

    return relevant_rules


def display_reasoning_trace(query: str, retrieved_docs: List[Dict], answer: str, ontology_manager):
    """Display an enhanced trace of how ontological reasoning was used to answer the query."""
    st.subheader("🧠 Ontology-Enhanced Reasoning")

    # Create a multi-tab interface for different aspects of reasoning
    tab1, tab2, tab3 = st.tabs(["Query Analysis", "Knowledge Retrieval", "Reasoning Path"])

    with tab1:
        # Extract entity and relationship mentions with confidence
        entity_mentions, relationship_mentions = analyze_query_ontology_concepts(query, ontology_manager)

        # Display detected entities with confidence scores
        if entity_mentions:
            st.markdown("### Entities Detected in Query")

            # Convert to DataFrame for visualization
            entity_df = pd.DataFrame([{
                "Entity Type": e["type"],
                "Confidence": e["confidence"],
                "Description": e["description"]
            } for e in entity_mentions])

            # Sort by confidence
            entity_df = entity_df.sort_values("Confidence", ascending=False)

            # Create a horizontal bar chart
            fig = px.bar(entity_df,
                         x="Confidence",
                         y="Entity Type",
                         orientation='h',
                         title="Entity Type Detection Confidence",
                         color="Confidence",
                         color_continuous_scale="Blues",
                         text="Confidence")

            fig.update_traces(texttemplate='%{text:.0%}', textposition='outside')
            fig.update_layout(xaxis_tickformat=".0%")

            st.plotly_chart(fig, use_container_width=True)

            # Display descriptions
            st.subheader("Entity Type Descriptions")
            st.dataframe(
                entity_df[["Entity Type", "Description"]],
                hide_index=True
            )

        # Display detected relationships
        if relationship_mentions:
            st.markdown("### Relationships Detected in Query")

            # Convert to DataFrame
            rel_df = pd.DataFrame([{
                "Relationship": r["name"],
                "From": r["domain"],
                "To": r["range"],
                "Confidence": r["confidence"],
                "Description": r["description"]
            } for r in relationship_mentions])

            # Sort by confidence
            rel_df = rel_df.sort_values("Confidence", ascending=False)

            # Create visualization
            fig = px.bar(rel_df,
                         x="Confidence",
                         y="Relationship",
                         orientation='h',
                         title="Relationship Detection Confidence",
                         color="Confidence",
                         color_continuous_scale="Reds",
                         text="Confidence")

            fig.update_traces(texttemplate='%{text:.0%}', textposition='outside')
            fig.update_layout(xaxis_tickformat=".0%")

            st.plotly_chart(fig, use_container_width=True)

            # Display relationship details
            st.subheader("Relationship Details")
            st.dataframe(
                rel_df[["Relationship", "From", "To", "Description"]],
                hide_index=True
            )

    with tab2:
        # Create an enhanced visualization of the retrieval process
        st.markdown("### Knowledge Retrieval Process")

        # Group retrieved documents by source
        docs_by_source = defaultdict(list)
        for doc in retrieved_docs:
            if hasattr(doc, 'metadata'):
                source = doc.metadata.get('source', 'unknown')
                docs_by_source[source].append(doc)
            else:
                docs_by_source['unknown'].append(doc)

        # Display retrieval visualization
        col1, col2 = st.columns([2, 1])

        with col1:
            # Create a Sankey diagram to show flow from query to sources to answer
            display_retrieval_flow(query, docs_by_source)

        with col2:
            # Display source distribution
            source_counts = {source: len(docs) for source, docs in docs_by_source.items()}

            # Create a pie chart
            fig = px.pie(
                values=list(source_counts.values()),
                names=list(source_counts.keys()),
                title="Retrieved Context Sources",
                color_discrete_sequence=px.colors.qualitative.Plotly
            )

            st.plotly_chart(fig, use_container_width=True)

        # Display retrieved document details in expandable sections
        for source, docs in docs_by_source.items():
            with st.expander(f"{source.capitalize()} ({len(docs)})", expanded=source == "ontology_context"):
                for i, doc in enumerate(docs):
                    # Add separator between documents
                    if i > 0:
                        st.markdown("---")

                    # Display document content
                    if hasattr(doc, 'page_content'):
                        st.markdown("**Content:**")

                        # Format depending on source
                        if source in ["ontology", "ontology_context"]:
                            st.markdown(doc.page_content)
                        else:
                            st.code(doc.page_content)

                    # Display metadata if present
                    if hasattr(doc, 'metadata') and doc.metadata:
                        st.markdown("**Metadata:**")
                        for key, value in doc.metadata.items():
                            if key != 'source':  # Already shown in the section title
                                st.markdown(f"- **{key}**: {value}")

    with tab3:
        # Show the reasoning flow from query to answer
        st.markdown("### Ontological Reasoning Process")

        # Display reasoning steps
        reasoning_steps = generate_reasoning_steps(query, entity_mentions, relationship_mentions, retrieved_docs, answer)

        for i, step in enumerate(reasoning_steps):
            with st.expander(f"Step {i+1}: {step['title']}", expanded=i == 0):
                st.markdown(step["description"])

        # Visualization of how ontological structure influenced the answer
        st.markdown("### How Ontology Enhanced the Answer")

        # Display ontology advantage explanation
        advantages = explain_ontology_advantages(entity_mentions, relationship_mentions)

        for adv in advantages:
            st.markdown(f"**{adv['title']}**")
            st.markdown(adv["description"])


def analyze_query_ontology_concepts(query: str, ontology_manager) -> Tuple[List[Dict], List[Dict]]:
    """
    Analyze the query to identify ontology concepts with confidence scores.
    This is a simplified implementation that would be replaced with NLP in production.
    """
    query_lower = query.lower().split()

    # Entity detection
    entity_mentions = []
    classes = ontology_manager.get_classes()

    for class_name in classes:
        # Simple token matching (would use NER in production)
        if class_name.lower() in query_lower:
            # Get class info
            class_info = ontology_manager.ontology_data["classes"].get(class_name, {})

            # Assign a confidence score (this would come from an ML model in production);
            # here we use a simple heuristic based on word length and specificity
            confidence = min(0.95, 0.5 + (len(class_name) / 20))

            entity_mentions.append({
                "type": class_name,
                "confidence": confidence,
                "description": class_info.get("description", "")
            })

    # Relationship detection
    relationship_mentions = []
    relationships = ontology_manager.ontology_data.get("relationships", [])

    for rel in relationships:
        rel_name = rel["name"]

        # Simple token matching
        if rel_name.lower() in query_lower:
            # Assign confidence
            confidence = min(0.9, 0.5 + (len(rel_name) / 20))

            relationship_mentions.append({
                "name": rel_name,
                "domain": rel["domain"],
                "range": rel["range"],
                "confidence": confidence,
                "description": rel.get("description", "")
            })

    return entity_mentions, relationship_mentions


def display_retrieval_flow(query: str, docs_by_source: Dict[str, List]):
    """Create a Sankey diagram showing the flow from query to sources to answer."""
    # Define node labels
    nodes = ["Query"]

    # Add source nodes
    for source in docs_by_source.keys():
        nodes.append(f"Source: {source.capitalize()}")

    nodes.append("Answer")

    # Define links
    source_indices = []
    target_indices = []
    values = []

    # Links from query to sources
    for i, (source, docs) in enumerate(docs_by_source.items()):
        source_indices.append(0)      # Query is index 0
        target_indices.append(i + 1)  # Source indices start at 1
        values.append(len(docs))      # Width based on number of docs

    # Links from sources to answer
    for i in range(len(docs_by_source)):
        source_indices.append(i + 1)           # Source index
        target_indices.append(len(nodes) - 1)  # Answer is the last node
        values.append(values[i])               # Same width as the query-to-source link

    # Create Sankey diagram
    fig = go.Figure(data=[go.Sankey(
        node=dict(
            pad=15,
            thickness=20,
            line=dict(color="black", width=0.5),
            label=nodes,
            color=["#1f77b4"] + [px.colors.qualitative.Plotly[i % len(px.colors.qualitative.Plotly)]
                                 for i in range(len(docs_by_source))] + ["#2ca02c"]
        ),
        link=dict(
            source=source_indices,
            target=target_indices,
            value=values
        )
    )])

    fig.update_layout(
        title="Information Flow in RAG Process",
        font=dict(size=12)
    )

    st.plotly_chart(fig, use_container_width=True)


def generate_reasoning_steps(query: str, entity_mentions: List[Dict], relationship_mentions: List[Dict],
                             retrieved_docs: List[Dict], answer: str) -> List[Dict]:
    """Generate reasoning steps to explain how the system arrived at the answer."""
    steps = []

    # Step 1: Query Understanding
    steps.append({
        "title": "Query Understanding",
        "description": f"""The system analyzes the query "{query}" and identifies key concepts from the ontology.
{len(entity_mentions)} entity types and {len(relationship_mentions)} relationship types are recognized, allowing
the system to understand the semantic context of the question."""
    })

    # Step 2: Knowledge Retrieval
    if retrieved_docs:
        doc_count = len(retrieved_docs)
        ontology_count = sum(1 for doc in retrieved_docs if hasattr(doc, 'metadata') and
                             doc.metadata.get('source', '') in ['ontology', 'ontology_context'])

        steps.append({
            "title": "Knowledge Retrieval",
            "description": f"""Based on the identified concepts, the system retrieves {doc_count} relevant pieces of information,
including {ontology_count} from the structured ontology. This hybrid approach combines traditional vector retrieval
with ontology-aware semantic retrieval, enabling access to both explicit and implicit knowledge."""
        })

    # Step 3: Relationship Traversal
    if relationship_mentions:
        rel_names = [r["name"] for r in relationship_mentions]
        steps.append({
            "title": "Relationship Traversal",
            "description": f"""The system identifies key relationships in the ontology: {', '.join(rel_names)}.
By traversing these relationships, the system can connect concepts that might not appear together in the same text,
allowing for multi-hop reasoning across the knowledge graph."""
        })

    # Step 4: Ontological Inference
    if entity_mentions:
        entity_types = [e["type"] for e in entity_mentions]
        steps.append({
            "title": "Ontological Inference",
            "description": f"""Using the hierarchical structure of entities like {', '.join(entity_types)},
the system makes inferences based on class inheritance and relationship constraints defined in the ontology.
This allows it to reason about properties and relationships that might not be explicitly stated."""
        })

    # Step 5: Answer Generation
    steps.append({
        "title": "Answer Synthesis",
        "description": """Finally, the system synthesizes the retrieved information and ontological knowledge to generate a comprehensive answer.
The structured nature of the ontology ensures that the answer accurately reflects the relationships between concepts
and respects the business rules defined in the knowledge model."""
    })

    return steps


def explain_ontology_advantages(entity_mentions: List[Dict], relationship_mentions: List[Dict]) -> List[Dict]:
    """Explain how the ontology enhanced the RAG process."""
    advantages = []

    if entity_mentions:
        advantages.append({
            "title": "Hierarchical Knowledge Representation",
            "description": """The ontology provides a hierarchical class structure that enables the system to understand
that concepts are related through is-a relationships. For instance, knowing that a Manager is an Employee
allows the system to apply Employee-related knowledge when answering questions about Managers, even if
the specific information was only stated for Employees in general."""
        })

    if relationship_mentions:
        advantages.append({
            "title": "Explicit Relationship Semantics",
            "description": """The ontology defines explicit relationships between concepts with clear semantics.
This allows the system to understand how entities are connected beyond simple co-occurrence in text.
For example, understanding that 'ownedBy' connects Products to Departments helps answer questions
about product ownership and departmental responsibilities."""
        })

    advantages.append({
        "title": "Constraint-Based Reasoning",
        "description": """Business rules in the ontology provide constraints that guide the reasoning process.
These rules ensure the system's answers are consistent with the organization's policies and practices.
For instance, rules about approval workflows or data classification requirements can inform answers
about process-related questions."""
    })

    advantages.append({
        "title": "Cross-Domain Knowledge Integration",
        "description": """The ontology connects concepts across different domains of the enterprise, enabling
integrated reasoning that traditional document-based retrieval might miss. This allows the system to
answer questions that span organizational boundaries, such as how marketing decisions affect product
development or how customer feedback influences business strategy."""
    })

    return advantages
static/css/styles.css
ADDED
@@ -0,0 +1,83 @@
/* Custom styling for ontology-RAG application */

/* Main container styles */
.main-container {
    padding: 20px;
    max-width: 1200px;
    margin: 0 auto;
}

/* Enhance visualization elements */
.vis-network {
    border: 1px solid #ddd;
    border-radius: 8px;
    box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
}

/* Custom tooltip styling */
.vis-tooltip {
    position: absolute;
    background-color: rgba(255, 255, 255, 0.95);
    border: 1px solid #ccc;
    border-radius: 5px;
    padding: 12px;
    font-family: Arial, sans-serif;
    font-size: 13px;
    color: #333;
    max-width: 350px;
    z-index: 9999;
    box-shadow: 0 4px 8px rgba(0, 0, 0, 0.15);
}

/* Enhance legend appearance */
.graph-legend {
    background-color: rgba(255, 255, 255, 0.9) !important;
    border: 1px solid #eee !important;
    border-radius: 8px !important;
    box-shadow: 0 2px 6px rgba(0, 0, 0, 0.1) !important;
}

/* Styling for entity detail cards */
.entity-detail-card {
    border: 1px solid #eee;
    border-radius: 5px;
    padding: 15px;
    margin-bottom: 15px;
    box-shadow: 0 2px 4px rgba(0, 0, 0, 0.05);
}

/* Highlight for central entities */
.central-entity {
    border-left: 4px solid #ff7f0e;
    padding-left: 12px;
}

/* Enhanced path visualization */
.path-step {
    padding: 8px;
    margin: 8px 0;
    border-left: 3px solid #1f77b4;
    background-color: #f8f9fa;
}

/* Customization for Streamlit components */
.stButton button {
    border-radius: 20px;
    padding: 5px 15px;
}

.stSelectbox label {
    font-weight: 500;
}

/* Tabs customization */
.streamlit-tabs .stTabs [role="tab"] {
    font-size: 15px;
    padding: 8px 16px;
}

/* Expander customization */
.streamlit-expanderContent {
    border-left: 1px solid #ddd;
    padding-left: 10px;
}