github-actions[bot] committed
Commit 765d69e · 1 Parent(s): 4bf6c32
Update Hugging Face README
README.md CHANGED
@@ -1,3 +1,13 @@
+---
+license: mit
+title: CRAWLGPT
+sdk: docker
+emoji: 💻
+colorFrom: pink
+colorTo: blue
+pinned: true
+short_description: A powerful web content crawler with LLM-powered RAG.
+---
 # CrawlGPT 🤖
 
 A powerful web content crawler with LLM-powered RAG (Retrieval Augmented Generation) capabilities. CrawlGPT extracts content from URLs, processes it through intelligent summarization, and enables natural language interactions using modern LLM technology.
@@ -124,41 +134,40 @@ _Example of CRAWLGPT in action!_
 crawlgpt/
 ├── src/
 │   └── crawlgpt/
-│       ├── core/
+│       ├── core/  # Core functionality
 │       │   ├── database.py  # SQL database handling
-│       │   ├── LLMBasedCrawler.py
+│       │   ├── LLMBasedCrawler.py  # Main crawler implementation
 │       │   ├── DatabaseHandler.py  # Vector database (FAISS)
 │       │   └── SummaryGenerator.py  # Text summarization
-│       ├── ui/
-│       │   ├── chat_app.py
-│       │   ├── chat_ui.py
-│       │   └── login.py
-│       └── utils/
+│       ├── ui/  # User Interface
+│       │   ├── chat_app.py  # Main Streamlit app
+│       │   ├── chat_ui.py  # Development UI
+│       │   └── login.py  # Authentication UI
+│       └── utils/  # Utilities
 │           ├── content_validator.py  # URL/content validation
-│           ├── data_manager.py
+│           ├── data_manager.py  # Import/export handling
 │           ├── helper_functions.py  # General helpers
-│           ├── monitoring.py
-│           └── progress.py
+│           ├── monitoring.py  # Metrics collection
+│           └── progress.py  # Progress tracking
 ├── tests/  # Test suite
 │   └── test_core/
 │       ├── test_database_handler.py  # Vector DB tests
-│       ├── test_integration.py
-│       ├── test_llm_based_crawler.py
-│       └── test_summary_generator.py
+│       ├── test_integration.py  # Integration tests
+│       ├── test_llm_based_crawler.py  # Crawler tests
+│       └── test_summary_generator.py  # Summarizer tests
 ├── .github/  # CI/CD
 │   └── workflows/
 │       └── Push_to_hf.yaml  # HuggingFace sync
 ├── Docs/
-│   └── MiniDoc.md
-├── .dockerignore
-├── .gitignore
-├── Dockerfile
-├── LICENSE
+│   └── MiniDoc.md  # Documentation
+├── .dockerignore  # Docker exclusions
+├── .gitignore  # Git exclusions
+├── Dockerfile  # Container config
+├── LICENSE  # MIT License
 ├── README.md  # Project documentation
 ├── README_hf.md  # HuggingFace README
 ├── pyproject.toml  # Project metadata
 ├── pytest.ini  # Test configuration
-├── crawlgpt.db  # Database
 └── setup_env.py  # Environment setup
 ```
 
@@ -206,4 +215,4 @@ Contributions are welcome! Please feel free to submit a Pull Request. For major
 ```
 git push origin feature/AmazingFeature
 ```
-5. Open a Pull Request.
+5. Open a Pull Request.
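
The main addition in this commit is the `---`-delimited metadata block at the top of README.md. As a rough illustration of how such a block can be read back, here is a minimal sketch that parses flat `key: value` pairs using only the standard library. The function name `parse_front_matter` and the sample text are hypothetical and not part of the repository; real front matter is YAML, so a full YAML parser would be needed for anything beyond flat pairs.

```python
def parse_front_matter(text: str) -> dict[str, str]:
    """Return the flat 'key: value' pairs from a leading '---' block.

    Assumption: values are kept as plain strings (e.g. 'true' stays a
    string), and nested YAML is not supported.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as absent


# Hypothetical sample mirroring some of the fields added in this commit.
README = """\
---
license: mit
title: CRAWLGPT
sdk: docker
pinned: true
short_description: A powerful web content crawler with LLM-powered RAG.
---
# CrawlGPT
"""

meta = parse_front_matter(README)
print(meta["sdk"], meta["title"])
```

A check like this could run before the sync step to confirm that the README still begins with a complete metadata block, though whether the project does anything of the kind is not stated in the commit.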