Pash1986 committed on
Commit
0df4481
1 Parent(s): bf7b5a0

Update README.md

  1. README.md +111 -2
README.md CHANGED
@@ -1,6 +1,6 @@
1
  ---
2
  title: Mongodb Gemini Rag
3
- emoji: 📚
4
  colorFrom: indigo
5
  colorTo: purple
6
  sdk: gradio
@@ -10,4 +10,113 @@ pinned: false
10
  license: apache-2.0
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
  title: Mongodb Gemini Rag
3
+ emoji: ♊️
4
  colorFrom: indigo
5
  colorTo: purple
6
  sdk: gradio
 
10
  license: apache-2.0
11
  ---
12
 
13
+ # Atlas Vector Search Chat with MongoDB and Google Gemini
14
+
15
+ Welcome to the Atlas Vector Search Chat! This application demonstrates how to use MongoDB [Atlas Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/) with [Google Gemini](https://ai.google.dev/) for semantic search and retrieval tasks.
16
+
17
+ ## Features
18
+ - **Interactive Chat**: Ask questions related to the embedded documents.
19
+ - **Vector Search**: Utilizes MongoDB Atlas Vector Search to find relevant documents based on similarity.
20
+ - **Google Gemini Integration**: Embeds text and generates responses.
21
+
22
+ ## Requirements
23
+ - Python 3.7 or later
24
+ - MongoDB Atlas account
25
+ - Atlas cluster with network access allowed from `0.0.0.0/0`, and its connection string
26
+ - Google Cloud account with access to Gemini
27
+
28
+ ## Installation
29
+ 1. **Clone the space**:
30
+ - Click [...] and clone the space to your repo. Make sure to set the variables listed below.
31
+
32
+ 2. **Set up environment variables**:
33
+ - `GOOGLE_API_KEY`: Your Google API key for Gemini.
34
+ - `MONGODB_ATLAS_URI`: Your MongoDB Atlas connection string.
35
+
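A minimal sketch of how an application could read these two variables and fail early when one is missing. The helper name `load_settings` is hypothetical and is not taken from `app.py`; only the variable names come from the list above.

```python
import os

# Variable names as listed in the README; everything else here is illustrative.
REQUIRED_VARS = ("GOOGLE_API_KEY", "MONGODB_ATLAS_URI")


def load_settings(environ=os.environ):
    """Return the required settings, raising if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` once at startup surfaces a misconfigured space immediately instead of on the first chat request.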
36
+ ## Running the Application
37
+ 1. **Start the application**:
38
+ ```bash
39
+ python app.py
40
+ ```
41
+ 2. **Access the interface**:
42
+ Open the space's `App` tab in your browser.
43
+
44
+ ## Vector Search Index Configuration
45
+ To create a [vector search index](https://www.mongodb.com/docs/atlas/atlas-vector-search/create-index/) on the `google-ai.embedded_docs` collection, use the following configuration:
46
+ ```json
47
+ {
48
+ "fields": [
49
+ {
50
+ "numDimensions": 768,
51
+ "path": "embedding",
52
+ "similarity": "cosine",
53
+ "type": "vector"
54
+ }
55
+ ]
56
+ }
57
+ ```
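Once the index exists, a query against it could look like the sketch below. It only builds the aggregation pipeline; the index name `vector_index` is an assumption (use whatever name you gave the index), and the `content` field is assumed from the trigger code, which embeds `doc.content`.

```python
EMBEDDING_DIMENSIONS = 768  # must match numDimensions in the index definition


def build_vector_search_pipeline(query_vector, index_name="vector_index",
                                 limit=5, num_candidates=100):
    """Build a $vectorSearch pipeline over the `embedding` field."""
    if len(query_vector) != EMBEDDING_DIMENSIONS:
        raise ValueError(
            f"expected {EMBEDDING_DIMENSIONS} dimensions, got {len(query_vector)}"
        )
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,        # hypothetical index name
                "path": "embedding",        # same path as in the index definition
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Return the text and the similarity score, hide the raw vector.
        {
            "$project": {
                "_id": 0,
                "content": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With a driver such as PyMongo, the resulting list would be passed to `collection.aggregate(...)` on `google-ai.embedded_docs`; the query vector has to come from the same embedding model used for the stored documents.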
58
+ ## MongoDB Trigger to Embed Results
59
+
60
+ This Atlas environment uses an Atlas Database Trigger on the `google-ai.embedded_docs` collection to capture every `insert` operation and embed the document's content, as described in this [article](https://www.mongodb.com/developer/products/atlas/semantic-search-mongodb-atlas-vector-search/).
61
+
62
+ ```javascript
63
+ // Get the API key from Realm's Values & Secrets
64
+ const apiKey = context.values.get('google-api-key');
65
+
66
+ // Set up the URL for the Google Generative Language API - embedding endpoint
67
+ const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:embedContent?key=${apiKey}`;
68
+ // batch example
69
+ // const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:batchEmbedContents?key=${apiKey}`;
70
+
71
+ // Get the full document from the change event
72
+ const doc = changeEvent.fullDocument;
73
+
74
+ try {
75
+ console.log(`Processing document with id: ${doc._id}`);
76
+
77
+ // Prepare the request body
78
+ const requestBody = JSON.stringify({
79
+ model: "models/embedding-001",
80
+ content: {
81
+ parts: [{ text: doc.content }]
82
+ }});
83
+
84
+
85
+ // Make the HTTP POST request
86
+ const response = await context.http.post({
87
+ url: url,
88
+ headers: { 'Content-Type': ['application/json'] },
89
+ body: requestBody
90
+ });
91
+
92
+ // Parse the JSON response
93
+ const responseData = EJSON.parse(response.body.text());
94
+ console.log(JSON.stringify(responseData))
95
+
96
+ if(response.statusCode === 200) {
97
+ console.log("Successfully received embedding response from the API.");
98
+
99
+ // Extract the embedding from the response
100
+ const embedding = responseData.embedding.values; // Adjust based on actual response structure
101
+
102
+ // Use the name of your MongoDB Atlas Cluster
103
+ const collection = context.services.get("mongodb-atlas").db("google-ai").collection("embedded_docs");
104
+
105
+ // Update the document in MongoDB with the embedding
106
+ const updateResult = await collection.updateOne(
107
+ { _id: doc._id },
108
+ { $set: { embedding: embedding }}
109
+ );
110
+
111
+ if(updateResult.modifiedCount === 1) {
112
+ console.log("Successfully updated the document.");
113
+ } else {
114
+ console.log("Failed to update the document.");
115
+ }
116
+ } else {
117
+ console.log(`Failed to receive embedding. Status code: ${response.statusCode} - ${JSON.stringify(response)}`);
118
+ }
119
+ } catch(err) {
120
+ console.error(`Error making request to API: ${err}`);
121
+ }
122
+ ```
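The request bodies involved can be sketched in Python (used here only to keep the added examples in one language; the trigger itself runs as JavaScript). Serializing the payload rather than assembling it by hand keeps it valid JSON when the document text contains quotes or newlines. The single-request shape mirrors the trigger above; the batch shape (`requests` in, `embeddings` out) is an assumption about the `batchEmbedContents` endpoint mentioned in the commented-out URL and should be checked against the API reference. The function names are hypothetical.

```python
import json

MODEL = "models/embedding-001"


def build_embed_body(text):
    """Body for embedContent, same structure as in the trigger."""
    return json.dumps({"model": MODEL, "content": {"parts": [{"text": text}]}})


def build_batch_body(texts):
    """Body for batchEmbedContents (assumed shape: one request per text)."""
    requests = [
        {"model": MODEL, "content": {"parts": [{"text": t}]}} for t in texts
    ]
    return json.dumps({"requests": requests})


def extract_batch_embeddings(response_data):
    """Pull the vectors out of an assumed {"embeddings": [{"values": [...]}]} reply."""
    return [item["values"] for item in response_data["embeddings"]]
```

The vectors returned by `extract_batch_embeddings` are in request order under this assumption, so they can be zipped with the document `_id`s before writing them back.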