Model Card: CodeRankEmbed (GGUF Quantized)

Model Overview

This model is a GGUF-quantized version of CodeRankEmbed.

The quantization reduces the model's size and computational requirements, facilitating efficient deployment without significantly compromising performance.

Model Details

  • Model Name: CodeRankEmbed-GGUF
  • Original Model: CodeRankEmbed
  • Quantization Format: GGUF
  • Architecture: nomic-bert
  • Parameters: 137 million
  • Embedding Dimension: 768
  • Languages Supported: Python, Java, JavaScript, PHP, Go, Ruby
  • Context Length: up to 8,192 tokens
  • Available Quantizations: 4-bit, 5-bit, 8-bit, 16-bit
  • License: MIT

Quantization Details

GGUF is a binary format, created by Georgi Gerganov for the llama.cpp project, that is optimized for efficient loading and inference of large language models. Quantization reduces the precision of the model's weights, which decreases memory usage and speeds up computation with minimal impact on accuracy.
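To illustrate the idea, here is a toy sketch of symmetric 8-bit weight quantization in NumPy. It is a conceptual example only; llama.cpp's actual GGUF schemes (Q4_K, Q8_0, and so on) are block-wise and more sophisticated.

```python
import numpy as np

# Toy symmetric 8-bit quantization: store one int8 per weight plus a scale.
weights = np.random.randn(256).astype(np.float32)

scale = np.abs(weights).max() / 127.0          # map the largest weight onto 127
q = np.round(weights / scale).astype(np.int8)  # 1 byte per weight instead of 4

# At inference time, weights are reconstructed from the int8 values.
dequantized = q.astype(np.float32) * scale
print("max abs error:", np.abs(weights - dequantized).max())
```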

Performance

CodeRankEmbed is a 137M-parameter bi-encoder supporting an 8,192-token context length for code retrieval. It significantly outperforms a range of open-source and proprietary code embedding models across code retrieval tasks.

Usage

This quantized model is suited to deployment in resource-constrained environments where memory and computational efficiency are critical. It can be used for code retrieval, semantic search, and other applications requiring high-quality code embeddings, as sketched below.
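Here is a minimal sketch of computing embeddings for code retrieval with the llama-cpp-python bindings. The GGUF filename is a placeholder for whichever quantization you download, and the query prefix follows the instruction documented for the original CodeRankEmbed model.

```python
# A minimal sketch, assuming the llama-cpp-python package and a local
# CodeRankEmbed GGUF file (the filename below is hypothetical).
import numpy as np
from llama_cpp import Llama

model = Llama(
    model_path="CodeRankEmbed-Q8_0.gguf",  # placeholder: use your downloaded file
    embedding=True,   # run the model in embedding mode
    n_ctx=8192,       # matches the model's supported context length
)

# Per the original model card, search queries carry an instruction prefix;
# code documents are embedded as-is.
query = ("Represent this query for searching relevant code: "
         "check if a number is prime")
snippets = [
    "def is_prime(n):\n    return n > 1 and all(n % i for i in range(2, int(n**0.5) + 1))",
    "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)",
]

def embed(text: str) -> np.ndarray:
    vec = np.asarray(model.embed(text), dtype=np.float32)
    return vec / np.linalg.norm(vec)  # normalize for cosine similarity

q = embed(query)
scores = [float(q @ embed(s)) for s in snippets]
print(scores)  # the prime-check snippet should score highest
```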

Limitations

While quantization reduces resource requirements, it can slightly degrade embedding quality, particularly at lower bit widths such as 4-bit. Users should evaluate the model on their specific use case to confirm it meets their performance criteria.

Acknowledgements

This quantized model is based on Nomic's CodeRankEmbed. For more details on the original model, please refer to the official model card.


For an overview of the CodeRankEmbed model, you may find the following article informative: https://simonwillison.net/2025/Mar/27/nomic-embed-code
