{
"notebook_title": "Retrieval-augmented generation (RAG)",
"notebook_type": "rag",
"dataset_types": ["text"],
"compatible_library": "pandas",
"notebook_template": [
{
"cell_type": "markdown",
"source": "---\n# **Retrieval-Augmented Generation Notebook for {dataset_name} dataset**\n---"
},
{
"cell_type": "markdown",
"source": "## 1. Setup necessary libraries and load the dataset"
},
{
"cell_type": "code",
"source": "# Install and import necessary libraries.\n!pip install pandas sentence-transformers faiss-cpu transformers torch huggingface_hub"
},
{
"cell_type": "code",
"source": "from sentence_transformers import SentenceTransformer\nfrom transformers import AutoModelForCausalLM, AutoTokenizer, pipeline\nfrom huggingface_hub import InferenceClient\nimport pandas as pd\nimport faiss\nimport torch"
},
{
"cell_type": "code",
"source": "# Load the dataset as a DataFrame\n{first_code}"
},
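{
"cell_type": "code",
"source": "# Optional sanity check (illustrative): preview the first few rows to verify the dataset loaded as expected\ndf.head()"
},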
{
"cell_type": "code",
"source": "# Specify the column name that contains the text data to generate embeddings\ncolumn_to_generate_embeddings = '{longest_col}'"
},
{
"cell_type": "markdown",
"source": "## 2. Loading embedding model and creating FAISS index"
},
{
"cell_type": "code",
"source": "# Remove duplicate entries based on the specified column\ndf = df.drop_duplicates(subset=column_to_generate_embeddings)"
},
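{
"cell_type": "code",
"source": "# Optional (illustrative): also drop rows where the text column is missing,\n# since null values cannot be embedded\ndf = df.dropna(subset=[column_to_generate_embeddings])"
},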
{
"cell_type": "code",
"source": "# Convert the column data to a list of text entries\ntext_list = df[column_to_generate_embeddings].tolist()"
},
{
"cell_type": "code",
"source": "# Specify the embedding model you want to use\nmodel = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"
},
{
"cell_type": "code",
"source": "vectors = model.encode(text_list)\nvector_dimension = vectors.shape[1]\n\n# Initialize the FAISS index with the appropriate dimension (384 for this model)\nindex = faiss.IndexFlatL2(vector_dimension)\n\n# Encode the text list into embeddings and add them to the FAISS index\nindex.add(vectors)"
},
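{
"cell_type": "code",
"source": "# Optional: persist the index to disk so it can be reloaded later without\n# re-encoding the dataset (the filename below is just an example)\nfaiss.write_index(index, 'index.faiss')\n# index = faiss.read_index('index.faiss')"
},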
{
"cell_type": "markdown",
"source": "## 3. Perform a text search"
},
{
"cell_type": "code",
"source": "# Specify the text you want to search for in the list\nquery = \"How to cook sushi?\"\n\n# Generate the embedding for the search query\nquery_embedding = model.encode([query])"
},
{
"cell_type": "code",
"source": "# Perform the search to find the 'k' nearest neighbors (adjust 'k' as needed)\nD, I = index.search(query_embedding, k=10)\n\n# Print the similar documents found\nprint(f\"Similar documents: {[text_list[i] for i in I[0]]}\")"
},
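{
"cell_type": "code",
"source": "# Optional: inspect the L2 distances alongside the results; smaller values\n# indicate closer matches\nfor distance, idx in zip(D[0], I[0]):\n    print(f\"{distance:.4f} -> {text_list[idx][:100]}\")"
},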
{
"cell_type": "markdown",
"source": "## 4. Load pipeline and perform inference locally"
},
{
"cell_type": "code",
"source": "# Adjust model name as needed\ncheckpoint = 'HuggingFaceTB/SmolLM-1.7B-Instruct'\n\ndevice = \"cuda\" if torch.cuda.is_available() else \"cpu\" # for GPU usage or \"cpu\" for CPU usage\n\ntokenizer = AutoTokenizer.from_pretrained(checkpoint)\nmodel = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)\n\ngenerator = pipeline(\"text-generation\", model=model, tokenizer=tokenizer, device=0 if device == \"cuda\" else -1)"
},
{
"cell_type": "code",
"source": "# Create a prompt with two parts: 'system' for instructions based on a 'context' from the retrieved documents, and 'user' for the query\nselected_elements = [text_list[i] for i in I[0].tolist()]\ncontext = ','.join(selected_elements)\nmessages = [\n {\n \"role\": \"system\",\n \"content\": f\"You are an intelligent assistant tasked with providing accurate and concise answers based on the following context. Use the information retrieved to construct your response. Context: {context}\",\n },\n {\"role\": \"user\", \"content\": query},\n]"
},
{
"cell_type": "code",
"source": "# Send the prompt to the pipeline and show the answer\noutput = generator(messages)\nprint(\"Generated result:\")\nprint(output[0]['generated_text'][-1]['content']) # Print the assistant's response content"
},
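{
"cell_type": "code",
"source": "# Optional: pass generation parameters to control the answer; the values\n# below are illustrative, adjust as needed\noutput = generator(messages, max_new_tokens=256, do_sample=False)\nprint(output[0]['generated_text'][-1]['content'])"
},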
{
"cell_type": "markdown",
"source": "## 5. Alternatively call the inference client"
},
{
"cell_type": "code",
"source": "# Adjust model name as needed\ncheckpoint = \"meta-llama/Meta-Llama-3-8B-Instruct\"\n\n# Change here your Hugging Face API token\ntoken = \"hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\" \n\ninference_client = InferenceClient(checkpoint, token=token)\noutput = inference_client.chat_completion(messages=messages, stream=False)\nprint(\"Generated result:\")\nprint(output.choices[0].message.content)"
}
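,
{
"cell_type": "code",
"source": "# Optional: stream the response token by token instead of waiting for the\n# full completion\nfor chunk in inference_client.chat_completion(messages=messages, stream=True):\n    print(chunk.choices[0].delta.content or '', end='')"
}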
]
} |