Commit 66381c1 committed by dh-mc
Parent: a6081bd
.env.example CHANGED
@@ -54,6 +54,7 @@ USING_TORCH_BFLOAT16=true
 # HUGGINGFACE_MODEL_NAME_OR_PATH="meta-llama/Llama-2-13b-chat-hf"
 # HUGGINGFACE_MODEL_NAME_OR_PATH="meta-llama/Llama-2-70b-chat-hf"
 HUGGINGFACE_MODEL_NAME_OR_PATH="microsoft/Orca-2-7b"
+# HUGGINGFACE_MODEL_NAME_OR_PATH="microsoft/Orca-2-13b"
 
 STABLELM_MODEL_NAME_OR_PATH="OpenAssistant/stablelm-7b-sft-v7-epoch-3"
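
At startup the app would read this variable to pick the active model. A minimal sketch of that pattern, assuming python-dotenv (the variable name comes from the diff above; the loading code itself is illustrative, not the app's actual code):

```python
import os

from dotenv import load_dotenv  # assumption: python-dotenv is installed

# Read key/value pairs from a local .env file (copied from .env.example).
load_dotenv()

# Per the diff above, microsoft/Orca-2-7b is the active model and the 13B
# variant stays available as a commented-out alternative.
model_name = os.environ.get("HUGGINGFACE_MODEL_NAME_OR_PATH", "microsoft/Orca-2-7b")
print(f"Active model: {model_name}")
```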
 
README.md CHANGED
@@ -9,13 +9,14 @@ app_file: app.py
 pinned: false
 license: apache-2.0
 ---
-# ChatPDF - Talk to Your PDF Files
 
-This project uses Open AI and open-source large language models (LLMs) to enable you to talk to your own PDF files.
+# Evaluation of Orca 2 against other LLMs for Retrieval Augmented Generation
+
+This project contains the source code, datasets and results for the titled paper.
 
 ## How it works
 
-We're using an AI methodology, namely Conversational Retrieval Augmentation (CRAG), which uses LLMs off the shelf (i.e., without any fine-tuning), then controls their behavior through clever prompting and conditioning on private “contextual” data, e.g., texts extracted from your PDF files.
+We're using an AI methodology, namely Retrieval Augmented Generation (RAG), which uses LLMs off the shelf (i.e., without any fine-tuning), then controls their behavior through clever prompting and conditioning on private “contextual” data, e.g., texts extracted from your PDF files.
 
 At a very high level, the workflow can be divided into three stages:
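
The new RAG paragraph above is the heart of the approach. A bare-bones, pure-Python illustration of that retrieve-then-prompt loop (toy word-overlap scoring stands in for the project's actual LangChain and vector-store pipeline, and the chunk texts are made up):

```python
# Toy corpus: stand-ins for text chunks extracted from source documents.
chunks = [
    "Orca 2 is a research LLM released by Microsoft in 7B and 13B sizes.",
    "FAISS is a library for efficient similarity search over dense vectors.",
    "Gradio builds quick web UIs for machine learning demos.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank chunks by naive word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(chunks, key=lambda c: len(q_words & set(c.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    """Condition the off-the-shelf LLM on retrieved context via the prompt."""
    context = "\n".join(retrieve(question))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"

# The compiled prompt would then be submitted to the selected LLM for inference.
print(build_prompt("What sizes does Orca 2 come in?"))
```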
 
@@ -25,7 +26,7 @@ At a very high level, the workflow can be divided into three stages:
 
 3. Prompt execution / inference: Once the prompts have been compiled, they are submitted to a pre-trained LLM for inference—including both proprietary model APIs and open-source or self-trained models.
 
-![Conversational Retrieval Augmentation (CRAG) - Workflow Overview](./assets/crag-workflow.png)
+![Retrieval Augmented Generation (RAG) - Workflow Overview](./assets/rag-workflow.png)
 
 Tech stack used includes LangChain, Gradio, Chroma and FAISS.
 - LangChain is an open-source framework that makes it easier to build scalable AI/LLM apps and chatbots.
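
The tech-stack line above names Chroma and FAISS for the embedding and indexing stage. A self-contained example of the core FAISS calls (random vectors stand in for real text embeddings; assumes the faiss-cpu and numpy packages):

```python
import faiss
import numpy as np

dim = 384                                    # typical sentence-embedding width
rng = np.random.default_rng(0)
chunk_vecs = rng.random((100, dim), dtype=np.float32)  # stand-in chunk embeddings

index = faiss.IndexFlatL2(dim)               # exact L2 nearest-neighbour index
index.add(chunk_vecs)                        # index all document chunks

query_vec = rng.random((1, dim), dtype=np.float32)     # stand-in query embedding
distances, ids = index.search(query_vec, 4)  # retrieve the 4 closest chunks
print(ids[0])                                # row indices of retrieved chunks
```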
@@ -101,11 +102,11 @@ The source code supports different LLM types - as shown at the top of `.env.example`:
 # LLM_MODEL_TYPE=gpt4all-j
 # LLM_MODEL_TYPE=gpt4all
 # LLM_MODEL_TYPE=llamacpp
-# LLM_MODEL_TYPE=huggingface
+LLM_MODEL_TYPE=huggingface
 # LLM_MODEL_TYPE=mosaicml
 # LLM_MODEL_TYPE=stablelm
 # LLM_MODEL_TYPE=openllm
-LLM_MODEL_TYPE=hftgi
+# LLM_MODEL_TYPE=hftgi
 ```
 
 - By default, the app runs `microsoft/orca-2-13b` model with HF Text Generation Interface, which runs on a research server and might be down from time to time.
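
This hunk flips the default backend from a remote HF TGI endpoint to a local transformers pipeline. Internally, that kind of switch is usually a simple dispatch on the variable; a hypothetical sketch (the loader function names are illustrative, not the app's actual API):

```python
import os

# Hypothetical loaders, one per backend named in .env.example.
def load_huggingface():
    """Load the model locally via transformers (the new default)."""

def load_hftgi():
    """Connect to a remote HF Text Generation Inference server."""

LOADERS = {
    "huggingface": load_huggingface,
    "hftgi": load_hftgi,
}

model_type = os.environ.get("LLM_MODEL_TYPE", "huggingface")
if model_type not in LOADERS:
    raise ValueError(f"Unsupported LLM_MODEL_TYPE: {model_type}")
llm = LOADERS[model_type]()
```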
@@ -123,5 +124,7 @@ HUGGINGFACE_MODEL_NAME_OR_PATH="microsoft/orca-2-13b"
 # HUGGINGFACE_MODEL_NAME_OR_PATH="meta-llama/Llama-2-7b-chat-hf"
 # HUGGINGFACE_MODEL_NAME_OR_PATH="meta-llama/Llama-2-13b-chat-hf"
 # HUGGINGFACE_MODEL_NAME_OR_PATH="meta-llama/Llama-2-70b-chat-hf"
+HUGGINGFACE_MODEL_NAME_OR_PATH="microsoft/Orca-2-7b"
+# HUGGINGFACE_MODEL_NAME_OR_PATH="microsoft/Orca-2-13b"
 ```
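
With LLM_MODEL_TYPE=huggingface, a model id like the one above is typically passed straight to transformers. A minimal sketch, assuming the transformers, torch and accelerate packages plus enough GPU memory for a 7B model (USING_TORCH_BFLOAT16=true from .env.example maps naturally onto torch_dtype):

```python
import os

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = os.environ.get("HUGGINGFACE_MODEL_NAME_OR_PATH", "microsoft/Orca-2-7b")

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # mirrors USING_TORCH_BFLOAT16=true
    device_map="auto",           # lets accelerate place weights on available devices
)

prompt = "What does RAG stand for?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```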
assets/crag-workflow.png DELETED
Binary file (255 kB)
 
assets/rag-workflow.png ADDED
results/perf_data_nvidia_4080.xlsx DELETED
Binary file (5.87 kB)
 
results/raw_data_nvidia_4080.xlsx DELETED
Binary file (25.8 kB)