Addaci committed
Commit 2436c93
1 Parent(s): 5434208

Update README.md

Files changed (1):
  1. README.md +28 -0

README.md CHANGED
@@ -73,6 +73,34 @@ HTR for the same tokens with page to page congruence, and broadly line by line c
* Impact on readability of raw HTR + rules based Python script optimised to domain + different categories of fine-tuned small LLM machine adjustment

**Integration of small LLMs with RAG pipeline**

We are exploring:

**Small RAG Systems**

Components:
* A small retriever (e.g., BM25, Sentence-BERT).
* A relatively lightweight LLM like mT5-small.
* A smaller corpus of documents or a curated thesaurus, perhaps stored in a simple format like JSON or SQLite.
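The retrieval half of such a system can be sketched in pure Python with an Okapi BM25 scorer over a toy in-memory corpus (the documents and query below are purely illustrative; in the full pipeline the corpus would come from JSON or SQLite, and the top-ranked passages would be handed to a small generator such as mT5-small):

```python
import math
from collections import Counter

# Illustrative corpus; in practice this would be loaded from JSON or SQLite.
corpus = [
    "prerogative court of canterbury will and testament",
    "handwritten text recognition of early modern secretary hand",
    "retrieval augmented generation with small language models",
]

def tokenize(text):
    return text.lower().split()

docs = [tokenize(d) for d in corpus]
avgdl = sum(len(d) for d in docs) / len(docs)
df = Counter(term for d in docs for term in set(d))  # document frequencies
N = len(docs)

def bm25_score(query, doc, k1=1.5, b=0.75):
    """Okapi BM25 score of one tokenized document against a query string."""
    tf = Counter(doc)
    score = 0.0
    for term in tokenize(query):
        if term not in tf:
            continue
        idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
        norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(doc) / avgdl))
        score += idf * norm
    return score

def retrieve(query, top_k=1):
    """Return the top_k corpus passages ranked by BM25."""
    ranked = sorted(range(N), key=lambda i: bm25_score(query, docs[i]), reverse=True)
    return [corpus[i] for i in ranked[:top_k]]

# The retrieved passages would then be prepended to the prompt of a small
# generator (e.g. mT5-small) to produce the final answer.
print(retrieve("handwritten text recognition"))
```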

Deployment and Usage:
* Memory: can run on GPUs with 8-16 GB VRAM, depending on the complexity of the documents and the model size.
* Throughput: fast, but optimised for low-scale operations such as handling small batches of queries.
* Cloud Hosting: easily deployable on platforms like Hugging Face Spaces or a cloud service (AWS, GCP, Azure) using lightweight GPU instances.
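A rough sanity check on those memory figures (the parameter count for mT5-small is approximate, and activations, KV caches, and the retriever index add overhead on top of the bare weights):

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Rough VRAM needed just to hold the model weights."""
    return n_params * bytes_per_param / 1024**3

# mT5-small has roughly 300M parameters (approximate figure).
n_params = 300e6
print(f"fp32: {weight_memory_gb(n_params, 4):.2f} GB")  # roughly 1.1 GB
print(f"fp16: {weight_memory_gb(n_params, 2):.2f} GB")  # roughly 0.6 GB
```

So the weights themselves sit well within an 8-16 GB card; the headroom is what lets the retriever and batch inference share the same GPU.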

We are looking at Hugging Face options:

**Hugging Face Spaces:**

* Suitable for prototypes: Spaces allow you to deploy small to medium models for free or at low cost on CPU instances. You can also use GPU instances (such as T4 or A100) to host mT5 and experiment with RAG.
* Environment: Hugging Face Spaces uses Gradio or Streamlit interfaces, making it simple to build and share RAG applications.
* Scaling: the platform is ideal for prototyping and small-scale applications, but if you plan to scale up (e.g., with large corpora or high-traffic queries), you may need more robust infrastructure such as AWS or GCP.
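A Space is configured through YAML front matter at the top of its README.md; a minimal sketch for a Gradio-based demo (the title, `sdk_version`, and `app_file` values here are placeholders to be filled in for the actual Space):

```yaml
---
title: HTR RAG Demo
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
---
```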

**Hugging Face Inference API:**

* Using the Hugging Face Inference API to host models like mT5-small is a straightforward way to make API calls to the model for generation tasks. If you want to integrate a retriever with this API-based system, you would need to build that part separately (e.g., using an external document store or retriever).
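A stdlib-only sketch of such an API call (the prompt and token are placeholders; the request is built but not sent, and the retriever would be bolted on separately by prepending retrieved passages to the prompt):

```python
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/google/mt5-small"

def build_request(prompt, token):
    """Build (but do not send) a POST request for the hosted model."""
    payload = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder prompt and token; urllib.request.urlopen(req) would return the
# generated JSON once a real access token is supplied.
req = build_request("Summarise: ...", token="hf_xxx")
print(req.full_url)
```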

---

DATASETS