* Impact on readability of raw HTR + rules-based Python script optimised to domain + different categories of fine-tuned small LLM machine adjustment

**Integration of small LLMs with RAG pipeline**

We are exploring:

Small RAG Systems

Components:

* A small retriever (e.g., BM25, Sentence-BERT).
* A relatively lightweight LLM like mT5-small.
* A smaller corpus of documents or a curated thesaurus, perhaps stored in a simple format like JSON or SQLite.
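
As a sketch of the retriever component, classic BM25 scoring over a small in-memory corpus fits in plain Python. The corpus and queries below are invented placeholders, not project data; a real system would load the documents from the JSON or SQLite store mentioned above:

```python
import math
from collections import Counter

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Rank documents against a query with the classic BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / N
    # Document frequency: how many documents contain each term.
    df = Counter()
    for toks in tokenized:
        for term in set(toks):
            df[term] += 1
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(toks) / avgdl))
        scores.append(score)
    # Return document indices, best match first.
    return sorted(range(N), key=lambda i: scores[i], reverse=True)

# Toy corpus standing in for the transcribed documents (placeholder data).
corpus = [
    "ledger entry for wool shipment to Bruges",
    "letter concerning payment of customs duties",
    "inventory of wool sacks held in the warehouse",
]
ranking = bm25_rank("wool shipment", corpus)
```

The top-ranked passages would then be passed as context to the generator model.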

Deployment and Usage:

* Memory: can run on GPUs with 8-16 GB VRAM, depending on the complexity of the documents and model size.
* Throughput: fast, but optimized for low-scale operations, such as handling small batches of queries.
* Cloud hosting: easily deployable on platforms like Hugging Face Spaces or a cloud service (AWS, GCP, Azure) using lightweight GPU instances.
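
The SQLite storage option mentioned under Components can be as small as a single table. A minimal sketch, with an invented table name and toy rows standing in for the corpus:

```python
import sqlite3

# In-memory database for illustration; a file path would persist the corpus.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO documents (text) VALUES (?)",
    [("ledger entry for wool shipment",),
     ("letter concerning customs duties",)],
)
conn.commit()

# Fetch candidate passages for the retriever to score.
rows = conn.execute("SELECT id, text FROM documents").fetchall()
print(len(rows))  # 2
```

Because the whole corpus fits in one file (or in memory), the retriever can simply read every row and score it, with no separate vector database to operate.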

We are looking at Hugging Face options:

Hugging Face Spaces:

* Suitable for prototypes: Spaces lets you deploy small to medium models for free or at low cost on CPU instances, and you can use GPU instances (such as T4 or A100) to host mT5 and experiment with RAG.
* Environment: Spaces uses Gradio or Streamlit interfaces, making it simple to build and share RAG applications.
* Scaling: the platform is ideal for prototyping and small-scale applications, but if you plan on scaling up (e.g., with large corpora or high-traffic queries), you may need more robust infrastructure like AWS or GCP.

Hugging Face Inference API:

Using the Hugging Face Inference API to host models like mT5-small is a straightforward way to make API calls to the model for generation tasks. If you want to integrate a retriever with this API-based system, you would need to build that part separately (e.g., using an external document store or retriever).
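
A minimal sketch of such a call using only the standard library, assuming the public Inference API shape (a JSON body with an `inputs` field, Bearer-token authorization). `HF_TOKEN` is a placeholder, and the actual network call is commented out so the sketch runs without credentials:

```python
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/google/mt5-small"

def build_request(text, token):
    """Build the POST request the Inference API expects:
    a JSON body with an "inputs" field and a Bearer token header."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

req = build_request("summarise: the ledger records a wool shipment", "HF_TOKEN")
# Sending is left commented out in this sketch:
# with urllib.request.urlopen(req) as resp:
#     result = json.loads(resp.read())
```

The retrieval step would run locally (or against a document store) and prepend its passages to the `inputs` text before each call.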

---

DATASETS