---
title: Mongodb Gemini Rag
emoji: ♊️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.31.5
app_file: app.py
pinned: false
license: apache-2.0
---

# Atlas Vector Search Chat with MongoDB and Google Gemini

Welcome to the Atlas Vector Search Chat! This application demonstrates how to use MongoDB [Atlas Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/) with [Google Gemini](https://ai.google.dev/) for semantic search and retrieval tasks.

## Features
- **Interactive Chat**: Ask questions related to the embedded documents.
- **Vector Search**: Utilizes MongoDB Atlas Vector Search to find relevant documents based on similarity.
- **Google Gemini Integration**: Embeds text and generates responses.

## Requirements
- Python 3.7 or later
- MongoDB Atlas account
- Atlas cluster that allows connections from `0.0.0.0/0`, plus its connection string
- Google Cloud account with access to Gemini

## Installation
1. **Clone the space**:
    - Click [...] and clone the space to your repo. Make sure to set the variables listed in the next step.

2. **Set up environment variables**:
    - `GOOGLE_API_KEY`: Your Google API key for Gemini.
    - `MONGODB_ATLAS_URI`: Your MongoDB Atlas connection string.
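
As a minimal sketch of how an app might read these two variables and fail fast when one is missing, consider the helper below. The function name `load_settings` is hypothetical and not taken from `app.py`; only the variable names come from this README.

```python
import os


def load_settings(env=os.environ):
    """Return the two required settings, raising if either is missing or empty.

    `load_settings` is an illustrative helper, not part of the actual app.
    """
    required = ("GOOGLE_API_KEY", "MONGODB_ATLAS_URI")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

Checking both variables at startup surfaces a configuration mistake immediately instead of on the first chat request.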

## Running the Application
1. **Start the application**:
    ```bash
    python app.py
    ```
2. **Access the interface**:
    Open the `App` tab in your browser.

## Vector Search Index Configuration
To create a [vector search index](https://www.mongodb.com/docs/atlas/atlas-vector-search/create-index/) on the `google-ai.embedded_docs` collection, use the following configuration:
```json
{
  "fields": [
    {
      "numDimensions": 768,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}
```
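
To illustrate how this index might be queried, here is a hedged sketch that builds a `$vectorSearch` aggregation pipeline in Python. The index name `vector_index`, the helper name, and the default `limit` and `num_candidates` values are assumptions for illustration; the `embedding` path, the 768 dimensions, and the `content` field come from this README.

```python
from typing import Any, Dict, List

EMBEDDING_DIMENSIONS = 768  # matches "numDimensions" in the index definition


def build_vector_search_pipeline(
    query_vector: List[float],
    index_name: str = "vector_index",  # assumption: replace with your index name
    limit: int = 5,
    num_candidates: int = 100,
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that returns the closest documents."""
    if len(query_vector) != EMBEDDING_DIMENSIONS:
        raise ValueError(
            f"expected {EMBEDDING_DIMENSIONS} dimensions, got {len(query_vector)}"
        )
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",  # field indexed by the configuration above
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Keep only the text and the similarity score of each match.
        {
            "$project": {
                "_id": 0,
                "content": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With PyMongo, the returned list could be passed to `collection.aggregate(...)` on the `google-ai.embedded_docs` collection. The query vector has to come from the same embedding model as the stored documents.
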
## MongoDB Trigger to Embed Results

This Atlas environment uses an Atlas Database Trigger on the `google-ai.embedded_docs` collection to capture every `insert` operation and embed the document's content, as described in this [article](https://www.mongodb.com/developer/products/atlas/semantic-search-mongodb-atlas-vector-search/).

```javascript
// Atlas Database Trigger function: runs for each `insert` on google-ai.embedded_docs
exports = async function(changeEvent) {
    // Get the API key from Realm's Values & Secrets
    const apiKey = context.values.get('google-api-key');

    // Set up the URL for the Google Generative Language API - embedding endpoint
    const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:embedContent?key=${apiKey}`;
    // batch example
    // const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:batchEmbedContents?key=${apiKey}`;

    // Get the full document from the change event
    const doc = changeEvent.fullDocument;

    try {
        console.log(`Processing document with id: ${doc._id}`);

        // Prepare the request body; JSON.stringify escapes quotes and newlines in doc.content
        const requestBody = JSON.stringify({
            model: "models/embedding-001",
            content: { parts: [{ text: doc.content }] }
        });

        // Make the HTTP POST request
        const response = await context.http.post({
            url: url,
            headers: { 'Content-Type': ['application/json'] },
            body: requestBody
        });

        // Parse the JSON response
        const responseData = EJSON.parse(response.body.text());
        console.log(JSON.stringify(responseData));

        if (response.statusCode === 200) {
            console.log("Successfully received embedding response from the API.");

            // Extract the embedding from the response
            const embedding = responseData.embedding.values; // Adjust based on actual response structure

            // Use the service name of your linked MongoDB Atlas cluster
            const collection = context.services.get("mongodb-atlas").db("google-ai").collection("embedded_docs");

            // Update the document in MongoDB with the embedding
            const updateResult = await collection.updateOne(
                { _id: doc._id },
                { $set: { embedding: embedding } }
            );

            if (updateResult.modifiedCount === 1) {
                console.log("Successfully updated the document.");
            } else {
                console.log("Failed to update the document.");
            }
        } else {
            console.log(`Failed to receive embedding. Status code: ${response.statusCode} -  ${JSON.stringify(response)}`);
        }
    } catch (err) {
        console.error(`Error making request to API: ${err}`);
    }
};
```
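
A user's question has to be embedded with the same model before it can be compared against the stored vectors. The sketch below shows one way that might look in Python, using only the standard library and the same endpoint and response shape as the trigger. The function names are hypothetical and `app.py` may well use a client library instead.

```python
import json
import urllib.request

EMBED_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/embedding-001:embedContent"
)


def build_embed_request(text):
    """Serialize the request body; json.dumps escapes quotes and newlines."""
    payload = {
        "model": "models/embedding-001",
        "content": {"parts": [{"text": text}]},
    }
    return json.dumps(payload).encode("utf-8")


def extract_embedding(response_json):
    """Pull the vector out of a response shaped like the one the trigger reads."""
    return [float(value) for value in response_json["embedding"]["values"]]


def embed_text(text, api_key):
    """POST the text to the embedding endpoint and return the vector."""
    request = urllib.request.Request(
        f"{EMBED_URL}?key={api_key}",
        data=build_embed_request(text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_embedding(json.load(response))
```

Building the body from a dictionary rather than a hand-written string keeps the JSON valid even when the text contains quotes or line breaks.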