github-actions[bot] committed
Commit 765d69e · 1 Parent(s): 4bf6c32

Update Hugging Face README

Files changed (1)
  1. README.md +29 -20
README.md CHANGED
@@ -1,3 +1,13 @@
  # CrawlGPT 🤖

  A powerful web content crawler with LLM-powered RAG (Retrieval Augmented Generation) capabilities. CrawlGPT extracts content from URLs, processes it through intelligent summarization, and enables natural language interactions using modern LLM technology.
@@ -124,41 +134,40 @@ _Example of CRAWLGPT in action!_
  crawlgpt/
  ├── src/
  │   └── crawlgpt/
- │       ├── core/ # Core functionality
  │       │   ├── database.py # SQL database handling
- │       │   ├── LLMBasedCrawler.py # Main crawler implementation
  │       │   ├── DatabaseHandler.py # Vector database (FAISS)
  │       │   └── SummaryGenerator.py # Text summarization
- │       ├── ui/ # User Interface
- │       │   ├── chat_app.py # Main Streamlit app
- │       │   ├── chat_ui.py # Development UI
- │       │   └── login.py # Authentication UI
- │       └── utils/ # Utilities
  │           ├── content_validator.py # URL/content validation
- │           ├── data_manager.py # Import/export handling
  │           ├── helper_functions.py # General helpers
- │           ├── monitoring.py # Metrics collection
- │           └── progress.py # Progress tracking
  ├── tests/ # Test suite
  │   └── test_core/
  │       ├── test_database_handler.py # Vector DB tests
- │       ├── test_integration.py # Integration tests
- │       ├── test_llm_based_crawler.py # Crawler tests
- │       └── test_summary_generator.py # Summarizer tests
  ├── .github/ # CI/CD
  │   └── workflows/
  │       └── Push_to_hf.yaml # HuggingFace sync
  ├── Docs/
- │   └── MiniDoc.md # Documentation
- ├── .dockerignore # Docker exclusions
- ├── .gitignore # Git exclusions
- ├── Dockerfile # Container config
- ├── LICENSE # MIT License
  ├── README.md # Project documentation
  ├── README_hf.md # HuggingFace README
  ├── pyproject.toml # Project metadata
  ├── pytest.ini # Test configuration
- ├── crawlgpt.db # Database
  └── setup_env.py # Environment setup
  ```

@@ -206,4 +215,4 @@ Contributions are welcome! Please feel free to submit a Pull Request. For major
  ```
  git push origin feature/AmazingFeature
  ```
- 5. Open a Pull Request.
 
+ ---
+ license: mit
+ title: CRAWLGPT
+ sdk: docker
+ emoji: 💻
+ colorFrom: pink
+ colorTo: blue
+ pinned: true
+ short_description: A powerful web content crawler with LLM-powered RAG.
+ ---
  # CrawlGPT 🤖

  A powerful web content crawler with LLM-powered RAG (Retrieval Augmented Generation) capabilities. CrawlGPT extracts content from URLs, processes it through intelligent summarization, and enables natural language interactions using modern LLM technology.
 
  crawlgpt/
  ├── src/
  │   └── crawlgpt/
+ │       ├── core/ # Core functionality
  │       │   ├── database.py # SQL database handling
+ │       │   ├── LLMBasedCrawler.py # Main crawler implementation
  │       │   ├── DatabaseHandler.py # Vector database (FAISS)
  │       │   └── SummaryGenerator.py # Text summarization
+ │       ├── ui/ # User Interface
+ │       │   ├── chat_app.py # Main Streamlit app
+ │       │   ├── chat_ui.py # Development UI
+ │       │   └── login.py # Authentication UI
+ │       └── utils/ # Utilities
  │           ├── content_validator.py # URL/content validation
+ │           ├── data_manager.py # Import/export handling
  │           ├── helper_functions.py # General helpers
+ │           ├── monitoring.py # Metrics collection
+ │           └── progress.py # Progress tracking
  ├── tests/ # Test suite
  │   └── test_core/
  │       ├── test_database_handler.py # Vector DB tests
+ │       ├── test_integration.py # Integration tests
+ │       ├── test_llm_based_crawler.py # Crawler tests
+ │       └── test_summary_generator.py # Summarizer tests
  ├── .github/ # CI/CD
  │   └── workflows/
  │       └── Push_to_hf.yaml # HuggingFace sync
  ├── Docs/
+ │   └── MiniDoc.md # Documentation
+ ├── .dockerignore # Docker exclusions
+ ├── .gitignore # Git exclusions
+ ├── Dockerfile # Container config
+ ├── LICENSE # MIT License
  ├── README.md # Project documentation
  ├── README_hf.md # HuggingFace README
  ├── pyproject.toml # Project metadata
  ├── pytest.ini # Test configuration
  └── setup_env.py # Environment setup
  ```

 
  ```
  git push origin feature/AmazingFeature
  ```
+ 5. Open a Pull Request.