Addaci committed
Commit 2436c93
1 Parent(s): 5434208

Update README.md

Files changed (1):
  1. README.md +28 -0

README.md CHANGED
@@ -73,6 +73,34 @@ HTR for the same tokens with page to page congruence, and broadly line by line c
* Impact on readability of raw HTR + rules based Python script optimised to domain + different categories of fine-tuned small LLM machine adjustment

**Integration of small LLMs with RAG pipeline**

We are exploring:

**Small RAG Systems**

Components:
* A small retriever (e.g., BM25, Sentence-BERT).
* A relatively lightweight LLM like mT5-small.
* A smaller corpus of documents or a curated thesaurus, perhaps stored in a simple format like JSON or SQLite.
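The retrieval half of such a system can be sketched in pure Python with an Okapi BM25 scorer over a toy in-memory corpus (the documents and query below are purely illustrative; in the full pipeline the corpus would come from JSON or SQLite, and the top-ranked passages would be handed to a small generator such as mT5-small):

```python
import math
from collections import Counter

# Illustrative corpus; in practice this would be loaded from JSON or SQLite.
corpus = [
    "prerogative court of canterbury will and testament",
    "handwritten text recognition of early modern secretary hand",
    "retrieval augmented generation with small language models",
]

def tokenize(text):
    return text.lower().split()

docs = [tokenize(d) for d in corpus]
avgdl = sum(len(d) for d in docs) / len(docs)
df = Counter(term for d in docs for term in set(d))  # document frequencies
N = len(docs)

def bm25_score(query, doc, k1=1.5, b=0.75):
    """Okapi BM25 score of one tokenized document against a query string."""
    tf = Counter(doc)
    score = 0.0
    for term in tokenize(query):
        if term not in tf:
            continue
        idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
        norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(doc) / avgdl))
        score += idf * norm
    return score

def retrieve(query, top_k=1):
    """Return the top_k corpus passages ranked by BM25."""
    ranked = sorted(range(N), key=lambda i: bm25_score(query, docs[i]), reverse=True)
    return [corpus[i] for i in ranked[:top_k]]

# The retrieved passages would then be prepended to the prompt of a small
# generator (e.g. mT5-small) to produce the final answer.
print(retrieve("handwritten text recognition"))
```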

Deployment and Usage:
* Memory: can run on GPUs with 8-16 GB VRAM, depending on the complexity of the documents and the model size.
* Throughput: fast, but optimised for low-scale operations such as handling small batches of queries.
* Cloud Hosting: easily deployable on platforms like Hugging Face Spaces or a cloud service (AWS, GCP, Azure) using lightweight GPU instances.
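A rough sanity check on those memory figures (the parameter count for mT5-small is approximate, and activations, KV caches, and the retriever index add overhead on top of the bare weights):

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Rough VRAM needed just to hold the model weights."""
    return n_params * bytes_per_param / 1024**3

# mT5-small has roughly 300M parameters (approximate figure).
n_params = 300e6
print(f"fp32: {weight_memory_gb(n_params, 4):.2f} GB")  # roughly 1.1 GB
print(f"fp16: {weight_memory_gb(n_params, 2):.2f} GB")  # roughly 0.6 GB
```

So the weights themselves sit well within an 8-16 GB card; the headroom is what lets the retriever and batch inference share the same GPU.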

We are looking at Hugging Face options:

**Hugging Face Spaces:**

* Suitable for prototypes: Spaces allow you to deploy small to medium models for free or at low cost on CPU instances. You can also use GPU instances (such as T4 or A100) to host mT5 and experiment with RAG.
* Environment: Hugging Face Spaces uses Gradio or Streamlit interfaces, making it simple to build and share RAG applications.
* Scaling: the platform is ideal for prototyping and small-scale applications, but if you plan to scale up (e.g., with large corpora or high-traffic queries), you may need more robust infrastructure such as AWS or GCP.
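A Space is configured through YAML front matter at the top of its README.md; a minimal sketch for a Gradio-based demo (the title, `sdk_version`, and `app_file` values here are placeholders to be filled in for the actual Space):

```yaml
---
title: HTR RAG Demo
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
---
```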

**Hugging Face Inference API:**

* Using the Hugging Face Inference API to host models like mT5-small is a straightforward way to make API calls to the model for generation tasks. If you want to integrate a retriever with this API-based system, you would need to build that part separately (e.g., using an external document store or retriever).
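A stdlib-only sketch of such an API call (the prompt and token are placeholders; the request is built but not sent, and the retriever would be bolted on separately by prepending retrieved passages to the prompt):

```python
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/google/mt5-small"

def build_request(prompt, token):
    """Build (but do not send) a POST request for the hosted model."""
    payload = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder prompt and token; urllib.request.urlopen(req) would return the
# generated JSON once a real access token is supplied.
req = build_request("Summarise: ...", token="hf_xxx")
print(req.full_url)
```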

---

DATASETS