# DL4DS Tutor
Check out the configuration reference at [Hugging Face Spaces Config Reference](https://huggingface.co/docs/hub/spaces-config-reference).
A live deployment of the Tutor runs at [DL4DS Tutor on Hugging Face](https://dl4ds-dl4ds-tutor.hf.space/); the Space itself is hosted on Hugging Face [here](https://huggingface.co/spaces/dl4ds/dl4ds_tutor).
## Running Locally
1. **Clone the Repository**
```bash
git clone https://github.com/DL4DS/dl4ds_tutor
```
2. **Put your data under the `storage/data` directory**
- Add URLs to the `storage/data/urls.txt` file.
- Add PDF files to the `storage/data` directory.
3. **Test Data Loading (Optional)**
```bash
cd code
python -m modules.dataloader.data_loader
```
4. **Create the Vector Database**
```bash
cd code  # skip if you are already in code/ from step 3
python -m modules.vectorstore.store_manager
```
- Note: Re-run the above command whenever you add new data to the `storage/data` directory or update the `storage/data/urls.txt` file.
- Alternatively, set `["vectorstore"]["embedd_files"]` to `True` in the `code/modules/config/config.yaml` file; the files in the storage directory are then embedded every time you run the chainlit command below.
5. **Run the Chainlit App**
```bash
# Run from the code/ directory, where main.py lives
chainlit run main.py
```
See the [docs](https://github.com/DL4DS/dl4ds_tutor/tree/main/docs) for more information.
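Because the vector database has to be rebuilt whenever `storage/data` or `urls.txt` changes, it can help to check what the data directory contains before running step 4. The sketch below is not part of the repository: `summarize_data_dir` is a hypothetical helper that lists the PDFs, reads the non-empty lines of `urls.txt`, and computes a fingerprint that changes when either one changes. It assumes PDFs sit directly in the data directory with a lowercase `.pdf` extension.

```python
import hashlib
from pathlib import Path


def summarize_data_dir(data_dir):
    """Return (pdf_names, urls, fingerprint) for a data directory.

    Hypothetical helper for illustration; not part of the repository.
    """
    data_dir = Path(data_dir)

    # PDFs directly inside the data directory, sorted for stable output.
    pdfs = sorted(p for p in data_dir.glob("*.pdf") if p.is_file())

    # Non-empty lines of urls.txt, if the file exists.
    urls = []
    urls_file = data_dir / "urls.txt"
    if urls_file.is_file():
        text = urls_file.read_text(encoding="utf-8")
        urls = [line.strip() for line in text.splitlines() if line.strip()]

    # Hash PDF names and contents plus the URL list. The digest changes
    # when a PDF is added, removed, or modified, or when urls.txt changes.
    digest = hashlib.sha256()
    for pdf in pdfs:
        digest.update(pdf.name.encode("utf-8") + b"\0")
        digest.update(pdf.read_bytes() + b"\0")
    for url in urls:
        digest.update(url.encode("utf-8") + b"\0")

    return [p.name for p in pdfs], urls, digest.hexdigest()
```

Calling `summarize_data_dir("storage/data")` from the repository root and comparing the fingerprint with the one recorded at the last build is one way to decide whether step 4 needs to be re-run. Where to store that previous fingerprint is left open here.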
## File Structure
```plaintext
code/
├── modules
│   ├── chat                # Contains the chatbot implementation
│   ├── chat_processor      # Contains the implementation to process and log the conversations
│   ├── config              # Contains the configuration files
│   ├── dataloader          # Contains the implementation to load the data from the storage directory
│   ├── retriever           # Contains the implementation to create the retriever
│   └── vectorstore         # Contains the implementation to create the vector database
├── public
│   ├── logo_dark.png       # Dark theme logo
│   ├── logo_light.png      # Light theme logo
│   └── test.css            # Custom CSS file
└── main.py
docs/                       # Contains the documentation to the codebase and methods used
storage/
├── data                    # Store files and URLs here
├── logs                    # Logs directory, includes logs on vector DB creation, tutor logs, and chunks logged in JSON files
└── models                  # Local LLMs are loaded from here
vectorstores/ # Stores the created vector databases
.env # This needs to be created, store the API keys here
```
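The `.env` file has to be created by hand, so a quick check that it exists and holds the expected keys can catch setup mistakes before the app starts. The following is a minimal sketch using only the standard library, not the repository's own loading code. `read_env_file` and `missing_keys` are hypothetical helpers, and `EXAMPLE_API_KEY` is a placeholder, since the key names the app actually needs are not listed here.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines from a .env-style file into a dict."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an '=' sign.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_keys(values, required):
    """Return the required key names that are absent or empty."""
    return [name for name in required if not values.get(name)]


# Example (placeholder key name, adjust to the keys your setup needs):
# problems = missing_keys(read_env_file(".env"), ["EXAMPLE_API_KEY"])
```

This parser handles only plain `KEY=VALUE` lines; a dedicated library such as `python-dotenv` covers more of the `.env` syntax.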