---
base_model: unsloth/qwen2.5-coder-7b-instruct-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- sft
- fast-apply
- instant-apply
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/FastApply-7B-v1.0-GGUF

This is a quantized version of [Kortix/FastApply-7B-v1.0](https://huggingface.co/Kortix/FastApply-7B-v1.0), created using llama.cpp.

# Original Model Card

# FastApply-7B-v1.0

[Github: kortix-ai/fast-apply](https://github.com/kortix-ai/fast-apply)

[Dataset: Kortix/FastApply-dataset-v1.0](https://huggingface.co/datasets/Kortix/FastApply-dataset-v1.0)

[Try it now on Google Colab](https://colab.research.google.com/drive/1aBqM8Lqso0Xfgtr75G4LFQivXcChU_36?usp=sharing)

## Model Details

### Basic Information

- **Developed by:** Kortix
- **License:** apache-2.0
- **Finetuned from model:** [unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit)

### Model Description

FastApply-7B-v1.0 is a 7B model designed for instant code application, producing full file edits to power [SoftGen AI](https://softgen.ai/).
It is part of the Fast Apply pipeline for data generation and fine-tuning Qwen2.5 Coder models.

When deployed on fast providers such as Fireworks, the model reaches a throughput of approximately 150 tokens/second while maintaining high edit accuracy.

## Intended Use

FastApply-7B-v1.0 is intended for use in AI-powered code editors and tools that require fast, accurate code modifications. It is particularly well-suited for:

- Instant code application tasks
- Full file edits
- Integration with AI-powered code editors like Aider and PearAI
- Local tools to reduce the cost of frontier model output

## Inference Template

FastApply-7B-v1.0 is based on the Qwen2.5 Coder architecture and is fine-tuned for code editing tasks. It uses a specific prompt structure for inference:

```
<|im_start|>system
You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated.<|im_end|>
<|im_start|>user
Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code.<|im_end|>
<|im_start|>assistant
```
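
The same prompt can also be expressed as a list of chat messages instead of a raw string. The sketch below is illustrative only: `SYSTEM_PROMPT`, `USER_TEMPLATE`, and `build_messages` are hypothetical names, not part of the Fast Apply repository, and the texts are copied from the template above.

```python
# Hypothetical helper (not from the Fast Apply repository): builds chat
# messages whose contents match the system and user turns of the template.

SYSTEM_PROMPT = (
    "You are a coding assistant that helps merge code updates, "
    "ensuring every modification is fully integrated."
)

USER_TEMPLATE = """Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code."""


def build_messages(original_code: str, update_snippet: str) -> list[dict[str, str]]:
    """Return the system and user messages for one merge request."""
    # str.format only parses the template, so braces inside the
    # substituted code are inserted verbatim.
    user_content = USER_TEMPLATE.format(
        original_code=original_code,
        update_snippet=update_snippet,
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]
```

Assuming the tokenizer ships the standard Qwen2.5 ChatML chat template, passing these messages to `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` should render a prompt equivalent to the one shown above. It is worth printing the rendered string once to confirm.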

The model's output is structured as:

```
<updated-code>[Full-complete updated file]</updated-code>
```
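
A response in this format can be parsed defensively, since generation may stop before the closing tag is emitted (for example when the token limit is reached). The following is a minimal sketch; `extract_updated_code` is a hypothetical helper name, and returning `None` for malformed output is a design choice of this example, not documented model behavior.

```python
import re
from typing import Optional

# Non-greedy match across newlines between the first tag pair.
_UPDATED_CODE_RE = re.compile(r"<updated-code>(.*?)</updated-code>", re.DOTALL)


def extract_updated_code(response: str) -> Optional[str]:
    """Return the text inside the first <updated-code> pair, or None.

    None signals that the tags are missing or the closing tag was never
    generated, so the caller can retry or keep the original file.
    """
    match = _UPDATED_CODE_RE.search(response)
    return match.group(1) if match else None
```

Returning `None` rather than raising lets an editor integration fall back to the unmodified file when the output is incomplete.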

## Additional Information

For more details on the Fast Apply pipeline, data generation process, and deployment instructions, please refer to the [GitHub repository](https://github.com/kortix-ai/fast-apply).

## How to Use

You can load the model with the Hugging Face Transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Kortix/FastApply-7B-v1.0")
tokenizer = AutoTokenizer.from_pretrained("Kortix/FastApply-7B-v1.0")

# Example inputs (placeholders): replace with your own file contents and update snippet
original_code = "def add(a, b):\n    return a + b\n"
update_snippet = "def add(a: int, b: int) -> int:\n    return a + b\n"

# Prepare your input following the prompt structure mentioned above
input_text = """<|im_start|>system
You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated.<|im_end|>
<|im_start|>user
Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code.<|im_end|>
<|im_start|>assistant
"""

input_text = input_text.format(
    original_code=original_code,
    update_snippet=update_snippet,
).strip()

# Generate the response (max_length counts prompt tokens plus generated tokens)
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids, max_length=8192)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output[0][len(input_ids[0]):])
print(response)

# Extract the updated code from the response
updated_code = response.split("<updated-code>")[1].split("</updated-code>")[0]
```

## Evaluation

![image/png](https://cdn-uploads.huggingface.co/production/uploads/650d7ecb23e8028a8970a203/_E6WVzuVABKB58QMx6c1c.png)