Model Card: APEX-Embedding-7B [NON-COMMERCIAL USE ONLY]
Fifth Dimension
Read our paper on arXiv here: https://arxiv.org/abs/2410.18105
Model Overview
APEX-Embedding-7B is a 7-billion parameter model optimized for Factual Document Retrieval in Retrieval-Augmented Generation (RAG) systems. During training, the model was enhanced using Structured Entity Relationship Maps and Model-Aware Contrastive Sampling to focus on factual accuracy. The final model is highly effective for generating precise text embeddings, especially for industries that rely on large-scale document retrieval tasks such as legal, compliance, and real estate.
The model achieved 90.86% rank@1 accuracy in our document retrieval evaluation against comparable models, ensuring reliable and accurate retrieval of relevant documents from large datasets.
License & Disclaimer
License: Creative Commons Attribution Non Commercial 4.0 License
This release is for non-commercial research purposes only and has been published in support of an academic paper. The model has been fine-tuned for real-world Factual RAG tasks and has not been designed or evaluated for general-purpose embedding tasks, such as those found in MTEB or other benchmark suites. The model is provided 'as is' without warranties of any kind, and the authors accept no responsibility for any consequences resulting from its use. Users are responsible for ensuring compliance with the license terms and local laws, and for verifying the model's technical suitability for their specific use case.
How to Use
The following steps show how to generate embeddings and compare the similarity between queries and documents.
Environment Setup
First, install the necessary libraries:
pip install torch transformers peft accelerate numpy
pip install -U bitsandbytes
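Before loading the model, you can optionally confirm that a CUDA device is visible; the 4-bit configuration below relies on bitsandbytes, which expects a GPU. (This check is a small illustrative sketch, not part of the original workflow.)

import torch

# Optional sanity check: bitsandbytes 4-bit inference expects a CUDA GPU.
if torch.cuda.is_available():
    print(f"CUDA available: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA device found; 4-bit loading will fail without a GPU.")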
Loading the Model
Here is the code to load the model:
(A pre-quantised version of the base model is available in the base directory.)
from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig
import torch

model_path = "5DAI/APEX-Embedding-7B-v0.1"
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# 4-bit NF4 quantisation with double quantisation keeps the 7B model within a single GPU's memory
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModel.from_pretrained(
    model_path,
    quantization_config=quantization_config,
    device_map="auto"  # quantised weights are placed on the GPU at load time
)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model.eval()

Note that 4-bit bitsandbytes models do not support model.to(device); device placement is handled by device_map when the model is loaded.
Generating Embeddings for Queries and Documents
When generating embeddings, use the following instruction prompt for both queries and documents:
def get_embedding(text: str, model, tokenizer):
    prompt = f"Instruction: Please perform a RAG search based on the following. Text: {text}"
    inputs = tokenizer(prompt, return_tensors="pt", max_length=8192,
                       padding=True, truncation=True).to(device)
    with torch.no_grad():
        outputs = model(**inputs)
    # Last-token pooling: the hidden state of the final token summarises the sequence.
    # (If you batch several texts at once, use left padding so the last position is a real token.)
    embedding = outputs.last_hidden_state[:, -1, :]
    # L2-normalise so that a plain dot product equals cosine similarity
    embedding = torch.nn.functional.normalize(embedding, p=2, dim=1)
    return embedding.cpu().numpy()
The context window extends to 32K tokens, but best results are achieved when text is chunked by page (under 8,192 tokens per chunk).
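If your source text has no natural page boundaries, a simple token-count splitter can serve the same purpose. The helper below is a minimal sketch, not part of the model's API; the chunk size and decode-based splitting are assumptions.

def chunk_by_tokens(text: str, tokenizer, max_tokens: int = 8000) -> list[str]:
    # Default stays slightly below the 8,192 limit to leave room for the instruction prefix
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    return [tokenizer.decode(ids[i:i + max_tokens]) for i in range(0, len(ids), max_tokens)]

# Example usage with a placeholder document string
long_document_text = "..."  # your multi-page document text
chunks = chunk_by_tokens(long_document_text, tokenizer)
chunk_embeddings = [get_embedding(chunk, model, tokenizer)[0] for chunk in chunks]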
Cosine Similarity for Queries and Documents
To compare a query and a document, use the cosine similarity function:
import numpy as np
def cosine_similarity(vecA: np.ndarray, vecB: np.ndarray) -> float:
    normA = np.linalg.norm(vecA)
    normB = np.linalg.norm(vecB)
    # Guard against zero vectors to avoid division by zero
    return float(np.dot(vecA, vecB) / (normA * normB)) if normA > 0 and normB > 0 else 0.0
Example: Query vs. Document Embedding
Here’s how to generate embeddings for a query and a document, and compare their similarity:
query = "What are the legal requirements for property zoning in urban areas?"
document = "This document contains details about urban property zoning laws, including legal frameworks and compliance standards."
query_embedding = get_embedding(query, model, tokenizer)[0]
document_embedding = get_embedding(document, model, tokenizer)[0]
similarity = cosine_similarity(query_embedding, document_embedding)
print(f"Cosine similarity between query and document: {similarity}")
Citation
Please cite APEX-Embedding-7B as follows:
@misc{APEX-embedding-7b,
  title={APEX-Embedding-7B: Improving Embedding Accuracy for Document Retrieval Using Entity Relationship Maps and Model-Aware Contrastive Sampling},
  author={Thea Aviss},
  year={2024},
  url={https://arxiv.org/abs/2410.18105}
}
Model tree for 5DAI/APEX-Embedding-7B-v0.1
Base model
Salesforce/SFR-Embedding-2_R