---
title: Mongodb Gemini Rag
emoji: ♊️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.31.5
app_file: app.py
pinned: false
license: apache-2.0
---

# Atlas Vector Search Chat with MongoDB and Google Gemini

Welcome to the Atlas Vector Search Chat! This application demonstrates how to use MongoDB [Atlas Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/) with [Google Gemini](https://ai.google.dev/) for semantic search and retrieval tasks.

## Features

- **Interactive Chat**: Ask questions related to the embedded documents.
- **Vector Search**: Uses MongoDB Atlas Vector Search to find relevant documents based on similarity.
- **Google Gemini Integration**: Embeds text and generates responses.

## Requirements

- Python 3.7 or later
- MongoDB Atlas account
- An Atlas cluster with network access allowed from `0.0.0.0/0`, and its connection string
- Google Cloud account with access to Gemini

## Installation

1. **Clone the space**:
   - Click [...] and clone the space to your own repo. Make sure to enter the variables listed in the next step.

2. **Set up environment variables**:
   - `GOOGLE_API_KEY`: Your Google API key for Gemini.
   - `MONGODB_ATLAS_URI`: Your MongoDB Atlas connection string.

## Running the Application

1. **Start the application**:
   ```bash
   python app.py
   ```

2. **Access the interface**: Open the `App` tab in your browser.

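
If the application fails at startup, a missing variable is a likely cause. The sketch below shows one way to check the two environment variables from the installation steps before connecting to anything. The helper name `load_settings` is hypothetical and is not taken from `app.py`; treat it as an illustration, not as the Space's actual startup code.

```python
import os
from typing import Dict, Mapping

# The two variables named in the installation steps.
REQUIRED_VARS = ("GOOGLE_API_KEY", "MONGODB_ATLAS_URI")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or empty.

    `load_settings` is a hypothetical helper for illustration only.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` with no arguments reads from the process environment; passing a plain dictionary makes the check easy to exercise without touching real secrets.
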
## Vector Search Index Configuration

To create a [vector search index](https://www.mongodb.com/docs/atlas/atlas-vector-search/create-index/) on the `google-ai.embedded_docs` collection, use the following configuration:

```json
{
  "fields": [
    {
      "numDimensions": 768,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}
```

## MongoDB Trigger to Embed Results

This Atlas environment uses an Atlas Database Trigger on the `google-ai.embedded_docs` collection to capture every `insert` operation and embed the document's content, as described in this [article](https://www.mongodb.com/developer/products/atlas/semantic-search-mongodb-atlas-vector-search/).

```javascript
exports = async function(changeEvent) {
  // Get the API key from Realm's Values & Secrets
  const apiKey = context.values.get('google-api-key');

  // Set up the URL for the Google Generative Language API - embedding endpoint
  const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:embedContent?key=${apiKey}`;
  // batch example
  // const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:batchEmbedContents?key=${apiKey}`;

  // Get the full document from the change event
  const doc = changeEvent.fullDocument;

  try {
    console.log(`Processing document with id: ${doc._id}`);

    // Prepare the request body. JSON.stringify escapes quotes and newlines
    // in doc.content, so the payload is always valid JSON.
    const requestBody = JSON.stringify({
      model: "models/embedding-001",
      content: { parts: [{ text: doc.content }] }
    });

    // Make the HTTP POST request
    const response = await context.http.post({
      url: url,
      headers: { 'Content-Type': ['application/json'] },
      body: requestBody
    });

    // Parse the JSON response
    const responseData = EJSON.parse(response.body.text());
    console.log(JSON.stringify(responseData));

    if (response.statusCode === 200) {
      console.log("Successfully received embedding response from the API.");

      // Extract the embedding from the response
      const embedding = responseData.embedding.values; // Adjust based on actual response structure

      // Use the name of your MongoDB Atlas Cluster
      const collection = context.services.get("mongodb-atlas").db("google-ai").collection("embedded_docs");

      // Update the document in MongoDB with the embedding
      const updateResult = await collection.updateOne(
        { _id: doc._id },
        { $set: { embedding: embedding } }
      );

      if (updateResult.modifiedCount === 1) {
        console.log("Successfully updated the document.");
      } else {
        console.log("Failed to update the document.");
      }
    } else {
      console.log(`Failed to receive embedding. Status code: ${response.statusCode} - ${JSON.stringify(response)}`);
    }
  } catch (err) {
    console.error(`Error making request to API: ${err}`);
  }
};
```
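
Once documents carry an `embedding` field and the vector search index exists, the collection can be queried with a `$vectorSearch` aggregation stage. The following is a minimal sketch of how such a pipeline could be built in Python. It is not taken from `app.py`: the function name `build_vector_search_pipeline` and the index name `vector_index` are assumptions, so substitute the name you gave your index. The query vector is assumed to come from the same embedding model used by the trigger, so that it has 768 dimensions.

```python
from typing import Any, Dict, List

# Must match "numDimensions" in the vector search index definition.
EMBEDDING_DIMENSIONS = 768


def build_vector_search_pipeline(
    query_vector: List[float],
    index_name: str = "vector_index",  # assumed name; use your own index name
    limit: int = 5,
    num_candidates: int = 100,
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that returns the closest documents."""
    if len(query_vector) != EMBEDDING_DIMENSIONS:
        raise ValueError(
            f"expected {EMBEDDING_DIMENSIONS} dimensions, got {len(query_vector)}"
        )
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",  # field written by the trigger
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Return the text and the similarity score, not the raw vector.
            "$project": {
                "_id": 0,
                "content": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With PyMongo, the result could be passed to `collection.aggregate(pipeline)`, where `collection` refers to `embedded_docs` in the `google-ai` database. Building the pipeline in a separate function keeps the dimension check next to the index settings it depends on.
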