---
base_model: unsloth/qwen2.5-coder-1.5b-instruct-bnb-4bit
language:
- en
license: apache-2.0
tags:
- transformers
- unsloth
- qwen2
- trl
- sft
- fast-apply
- instant-apply
---
> **Remember to use `temperature = 0` for optimal results during inference.**

# FastApply-1.5B-v1.0

[Github: kortix-ai/fast-apply](https://github.com/kortix-ai/fast-apply)
[Dataset: Kortix/FastApply-dataset-v1.0](https://huggingface.co/datasets/Kortix/FastApply-dataset-v1.0)
[Try it now on 👉 Google Colab](https://colab.research.google.com/drive/1BNCab4oK-xBqwFQD4kCcjKc7BPKivkm1?usp=sharing)
## Model Details

### Basic Information

- **Developed by:** Kortix
- **License:** apache-2.0
- **Finetuned from model:** [unsloth/Qwen2.5-Coder-1.5B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Qwen2.5-Coder-1.5B-Instruct-bnb-4bit)

### Model Description

FastApply-1.5B-v1.0 is a 1.5B-parameter model designed for instant code application, producing full-file edits to power [SoftGen AI](https://softgen.ai/). It is part of the Fast Apply pipeline for data generation and for fine-tuning Qwen2.5 Coder models.

Deployed on fast providers such as Fireworks, the model sustains approximately 340 tokens/second while maintaining high edit accuracy.

## Intended Use

FastApply-1.5B-v1.0 is intended for AI-powered code editors and tools that require fast, accurate code modifications. It is particularly well suited for:

- Instant code application tasks
- Full-file edits
- Integration with AI-powered code editors such as Aider and PearAI
- Local tooling that reduces the cost of applying frontier-model output

## Inference template

FastApply-1.5B-v1.0 is based on the Qwen2.5 Coder architecture and is fine-tuned for code-editing tasks. It uses a specific prompt structure for inference:

```
<|im_start|>system
You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated.<|im_end|>
<|im_start|>user
Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code.<|im_end|>
<|im_start|>assistant
```
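The template above can be filled programmatically. A minimal sketch, assuming plain string formatting; the `build_prompt` helper and `SYSTEM_PROMPT`/`USER_TEMPLATE` names are illustrative, not part of the model's API:

```python
# Illustrative helper (not part of the model's API): fill the FastApply
# prompt template for one edit request.
SYSTEM_PROMPT = (
    "You are a coding assistant that helps merge code updates, "
    "ensuring every modification is fully integrated."
)

USER_TEMPLATE = """Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code."""

def build_prompt(original_code: str, update_snippet: str) -> str:
    """Assemble the full chat-formatted prompt, ending at the open assistant turn."""
    user_message = USER_TEMPLATE.format(
        original_code=original_code, update_snippet=update_snippet
    )
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_prompt("def f():\n    return 1\n", "def f():\n    return 2\n")
```

The string it produces is exactly the template shown above, so it can be tokenized and passed to any backend that accepts raw prompts.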

The model's output is structured as:

```
<updated-code>[complete updated file]</updated-code>
```
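Downstream code then needs to pull the file back out of that wrapper. A minimal sketch of a tolerant extractor; the `extract_updated_code` helper is hypothetical, not part of the model's API:

```python
import re

def extract_updated_code(response: str) -> str:
    """Illustrative helper (not part of the model's API): return the text
    between the <updated-code> tags, ignoring anything outside them."""
    match = re.search(r"<updated-code>(.*?)</updated-code>", response, re.DOTALL)
    if match is None:
        raise ValueError("no <updated-code> block found in model output")
    return match.group(1)

# Trailing tokens such as <|im_end|> outside the tags are ignored.
sample = "<updated-code>def f():\n    return 2\n</updated-code><|im_end|>"
updated = extract_updated_code(sample)
```

Raising on a missing block (rather than returning an empty string) makes truncated generations fail loudly instead of silently emptying the target file.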

## Additional Information

For more details on the Fast Apply pipeline, data generation process, and deployment instructions, please refer to the [GitHub repository](https://github.com/kortix-ai/fast-apply).

## How to Use

To use the model, you can load it using the Hugging Face Transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Kortix/FastApply-1.5B-v1.0", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Kortix/FastApply-1.5B-v1.0")

# Prepare your input following the prompt structure mentioned above
input_text = """<|im_start|>system
You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated.<|im_end|>
<|im_start|>user
Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code.<|im_end|>
<|im_start|>assistant
"""

# original_code and update_snippet are strings you supply: the current file
# contents and the edit snippet to merge into them.
input_text = input_text.format(
    original_code=original_code,
    update_snippet=update_snippet,
).strip()

# Generate the response with greedy decoding (equivalent to temperature = 0)
input_ids = tokenizer.encode(input_text, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_length=8192, do_sample=False)

response = tokenizer.decode(output[0][len(input_ids[0]):])
print(response)

# Extract the updated code from the response
updated_code = response.split("<updated-code>")[1].split("</updated-code>")[0]
```