Commit 7a233a3, committed by XThomasBU
1 Parent(s): d95aad5

updated README

README.md CHANGED
@@ -1,36 +1,61 @@
-
-title: Dl4ds Tutor
-emoji: π
-colorFrom: green
-colorTo: red
-sdk: docker
-pinned: false
-hf_oauth: true
----
+# DL4DS Tutor π
 
-
-===========
+Check out the configuration reference at [Hugging Face Spaces Config Reference](https://huggingface.co/docs/hub/spaces-config-reference).
 
-
+You can find an implementation of the Tutor at [DL4DS Tutor on Hugging Face](https://dl4ds-dl4ds-tutor.hf.space/), which is hosted on Hugging Face [here](https://huggingface.co/spaces/dl4ds/dl4ds_tutor).
 
-
+## Running Locally
 
-
+1. **Clone the Repository**
+```bash
+git clone https://github.com/DL4DS/dl4ds_tutor
+```
 
-
+2. **Put your data under the `storage/data` directory**
+- Add URLs in the `urls.txt` file.
+- Add other PDF files in the `storage/data` directory.
 
-
+3. **Create the Vector Database**
+```bash
+cd code
+python -m modules.vectorstore.store_manager
+```
+- Note: You need to run the above command when you add new data to the `storage/data` directory, or if the `storage/data/urls.txt` file is updated.
+- Alternatively, you can set `["vectorstore"]["embedd_files"]` to `True` in the `code/modules/config/config.yaml` file, which will embed files from the storage directory every time you run the below chainlit command.
 
-
-```
-
-
-
-To run the chainlit app, run the following command:
-```chainlit run main.py```
+4. **Run the Chainlit App**
+```bash
+chainlit run main.py
+```
 
 See the [docs](https://github.com/DL4DS/dl4ds_tutor/tree/main/docs) for more information.
 
-##
-
-
+## File Structure
+
+```plaintext
+code/
+├── modules
+│   ├── chat # Contains the chatbot implementation
+│   ├── chat_processor # Contains the implementation to process and log the conversations
+│   ├── config # Contains the configuration files
+│   ├── dataloader # Contains the implementation to load the data from the storage directory
+│   ├── retriever # Contains the implementation to create the retriever
+│   └── vectorstore # Contains the implementation to create the vector database
+├── public
+│   ├── logo_dark.png # Dark theme logo
+│   ├── logo_light.png # Light theme logo
+│   └── test.css # Custom CSS file
+└── main.py
+
+
+docs/ # Contains the documentation to the codebase and methods used
+
+storage/
+├── data # Store files and URLs here
+├── logs # Logs directory, includes logs on vector DB creation, tutor logs, and chunks logged in JSON files
+└── models # Local LLMs are loaded from here
+
+vectorstores/ # Stores the created vector databases
+
+.env # This needs to be created, store the API keys here
+```
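
Note on step 3 of the new README: the vector database has to be rebuilt whenever files under `storage/data` (including `urls.txt`) change. One way to tell when a rebuild is due is to compare file modification times. The sketch below is an illustration only and is not part of this commit or the repository: `newest_mtime` and `needs_rebuild` are hypothetical helpers, and it assumes that rebuilding rewrites files under `vectorstores/`.

```python
from pathlib import Path


def newest_mtime(root: Path) -> float:
    """Return the latest modification time of any file under root (0.0 if none)."""
    if not root.is_dir():
        return 0.0
    return max(
        (p.stat().st_mtime for p in root.rglob("*") if p.is_file()),
        default=0.0,
    )


def needs_rebuild(data_dir: Path, vectorstore_dir: Path) -> bool:
    """Hypothetical check: True if data_dir holds a file newer than every file
    in vectorstore_dir (assumes a rebuild rewrites files in vectorstore_dir)."""
    data_time = newest_mtime(data_dir)
    if data_time == 0.0:
        return False  # no data files yet, so there is nothing to embed
    return data_time > newest_mtime(vectorstore_dir)


if __name__ == "__main__":
    # Directory names follow the File Structure section; run from the repo root.
    if needs_rebuild(Path("storage/data"), Path("vectorstores")):
        print("Data changed: rerun `python -m modules.vectorstore.store_manager` from code/")
    else:
        print("No newer data files found")
```

A check like this only detects added or edited files. Deleting a file from `storage/data` leaves the timestamps unchanged, so in that case the rebuild command from step 3 still has to be run by hand.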