Model Card: CodeRankEmbed (GGUF Quantized)
Model Overview
This model is a GGUF-quantized version of CodeRankEmbed.
The quantization reduces the model's size and computational requirements, facilitating efficient deployment without significantly compromising performance.
Model Details
- Model Name: CodeRankEmbed-GGUF
- Original Model: CodeRankEmbed
- Quantization Format: GGUF
- Parameters: 137 million
- Embedding Dimension: 768
- Languages Supported: Python, Java, JavaScript, PHP, Go, Ruby
- Context Length: 8,192 tokens
- License: MIT
Quantization Details
GGUF is a binary file format from the GGML/llama.cpp ecosystem, optimized for efficient loading and inference of large language models. Quantization reduces the precision of the model's weights, which decreases memory usage and speeds up computation with minimal impact on accuracy.
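As a rough illustration of what quantization buys (a back-of-the-envelope sketch only, using the 137M-parameter count from the Performance section and ignoring file-format overhead):

```python
# Estimate weight storage for a 137M-parameter model at the bit widths
# offered in this repository. Actual GGUF files differ somewhat due to
# metadata, quantization block overhead, and layers kept at higher precision.
PARAMS = 137_000_000

for bits in (32, 16, 8, 5, 4):
    megabytes = PARAMS * bits / 8 / 1e6
    print(f"{bits:>2}-bit: ~{megabytes:,.0f} MB")
```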
Performance
CodeRankEmbed is a 137M-parameter bi-encoder with an 8,192-token context length, built for code retrieval. It significantly outperforms a range of open-source and proprietary code embedding models on code retrieval benchmarks.
Usage
This quantized model is suited to deployment in resource-constrained environments where memory and computational efficiency are critical. It can be used for code retrieval, semantic search over codebases, and other applications requiring high-quality code embeddings, as in the sketch below.
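A minimal retrieval sketch using llama-cpp-python, one common way to run GGUF models. The file name is a placeholder for whichever quantization you download, and the query prefix follows the original CodeRankEmbed card (verify it upstream):

```python
import numpy as np
from llama_cpp import Llama

llm = Llama(
    model_path="CodeRankEmbed.Q4_K_M.gguf",  # placeholder: use the file you downloaded
    embedding=True,   # run llama.cpp in embedding mode
    n_ctx=8192,       # matches the model's supported context length
    verbose=False,
)

# Per the original CodeRankEmbed card, search queries (but not code
# documents) should carry this prefix.
QUERY_PREFIX = "Represent this query for searching relevant code: "

def embed(text: str) -> np.ndarray:
    """Return an L2-normalized embedding for text."""
    vec = np.asarray(llm.embed(text), dtype=np.float32)
    return vec / np.linalg.norm(vec)

query = embed(QUERY_PREFIX + "reverse a singly linked list")
snippets = [
    "def reverse(head):\n    prev = None\n    while head:\n        head.next, prev, head = prev, head, head.next\n    return prev",
    "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)",
]

# Cosine similarity is a plain dot product on normalized vectors.
scores = [float(embed(s) @ query) for s in snippets]
print(scores)  # the linked-list snippet should score highest
```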
Limitations
While quantization reduces resource requirements, it may introduce slight degradation in model performance. Users should evaluate the model in their specific use cases to ensure it meets the desired performance criteria.
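One simple sanity check is to compare embeddings of the same text from the full-precision and quantized models. The sketch below assumes the original model is available as nomic-ai/CodeRankEmbed via sentence-transformers and uses a hypothetical local GGUF file name:

```python
import numpy as np
from llama_cpp import Llama
from sentence_transformers import SentenceTransformer

text = "def add(a, b):\n    return a + b"

# Full-precision reference (repo id assumed; check the upstream card).
full = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True)
ref = full.encode(text, normalize_embeddings=True)

# Quantized GGUF model (file name is a placeholder).
llm = Llama(model_path="CodeRankEmbed.Q4_K_M.gguf", embedding=True, verbose=False)
quant = np.asarray(llm.embed(text), dtype=np.float32)
quant /= np.linalg.norm(quant)

# Values close to 1.0 indicate little quantization drift.
print("cosine similarity:", float(ref @ quant))
```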
Acknowledgements
This quantized model is based on Nomic's CodeRankEmbed. For more details on the original model, please refer to the official model card.
For an overview of the CodeRankEmbed model, the following article may be informative: https://simonwillison.net/2025/Mar/27/nomic-embed-code
Available Quantizations
- 4-bit
- 5-bit
- 8-bit
- 16-bit
Model Tree (limcheekin/CodeRankEmbed-GGUF)
- Base model: Snowflake/snowflake-arctic-embed-m-long