# Kwaipilot/KwaiCoder-DS-V2-Lite-Base
## 1. Model Details

### Introduction
KwaiCoder-DS-V2-Lite-Base is built on DeepSeek-V2-Lite-Base, a Mixture-of-Experts model with 16B total parameters and 2.4B activated parameters. It supports both English and Chinese and underwent continued pretraining on 800B tokens of high-quality code, math, and Chinese-English text data. The training mix is 70% code, 20% math, and 10% general text (including a large amount of code-related text). The resulting base model achieved SOTA results on multiple benchmarks.
### Performance
| Model | Size | HumanEval | HumanEval+ | MBPP | MBPP+ | BigCodeBench (Full) | BigCodeBench (Hard) | MATH | GSM8K |
|---|---|---|---|---|---|---|---|---|---|
| Qwen2.5-Coder | 1.5B | 43.9 | 36.6 | 69.2 | 58.6 | 34.6 | 9.5 | 30.9 | 65.8 |
| CodeGemma | 2B | 31.1 | 16.5 | 51.1 | 43.1 | 23.9 | 7.4 | - | - |
| CodeLlama | 7B | 33.5 | 26.2 | 55.3 | 46.8 | 28.7 | 5.4 | 12.1 | 31.2 |
| Qwen2.5-Coder | 7B | 46.3 | 37.8 | 66.2 | 53.1 | 38.4 | 12.2 | 46.6 | 83.9 |
| OpenCoder | 8B | 66.5 | 63.4 | 79.9 | 70.4 | 40.5 | 9.5 | - | - |
| Yi-Coder | 9B | 53.7 | 46.3 | 48.4 | 40.7 | 42.9 | 14.2 | - | - |
| StarCoder2 | 15B | 46.3 | 37.8 | 66.2 | 53.1 | 38.4 | 12.2 | 10.3 | 23.4 |
| DeepSeek-Coder-V2-Lite | 16B | 40.9 | 34.1 | 71.9 | 59.4 | 30.6 | 8.1 | 39.0 | 67.1 |
| **KwaiCoder-DS-V2-Lite** | 16B | 75.0 | 68.9 | 81.2 | 67.7 | 49.4 | 18.2 | 40.48 | 81.5 |
| CodeLlama | 34B | 51.8 | 43.9 | 69.3 | 56.3 | 45.3 | 16.2 | 21.2 | 58.2 |
KwaiCoder-DS-V2-Lite-Base achieved Pass@1 scores of 75.0% and 68.9% on HumanEval and HumanEval+, respectively. Compared with DeepSeek-V2-Lite-Base at the same parameter scale, these are relative improvements of 83.37% and 102.05%. It also surpassed the previous best base model, OpenCoder-8B, reaching state-of-the-art (SOTA) levels.
On the MBPP and MBPP+ test sets, KwaiCoder-DS-V2-Lite-Base outperformed DeepSeek-V2-Lite-Base at the same parameter scale. Moreover, with only 2.4B activated parameters, it improved on the 7B Qwen2.5-Coder by an average of nearly 5 percentage points.
On the BigCodeBench-Complete Full set, KwaiCoder-DS-V2-Lite-Base achieved a 6% improvement over DeepSeek-Coder-33B, reaching SOTA levels. On the Hard subset, it also outperformed both the 34B CodeLlama and the 7B Qwen2.5-Coder.
In terms of mathematical capabilities, with only 2.4B activated parameters, KwaiCoder-DS-V2-Lite-Base surpassed DeepSeek-V2-Lite-Base at the same parameter scale on the MATH and GSM8K test sets (relative improvements of 3.79% and 21.46%, respectively) and outperformed the much larger CodeLlama-34B (relative improvements of 90.95% and 40.03%, respectively). Although it does not yet exceed Qwen2.5-Coder-7B, it surpasses Qwen2.5-Coder-3B, which has more activated parameters, reaching SOTA levels for its activated-parameter scale.
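The gains quoted above are relative improvements over the baseline scores in the table, i.e. (new − old) / old, which is distinct from the percentage-point (absolute) differences mentioned for MBPP. A minimal sketch reproducing the quoted figures from the table values:

```python
def relative_improvement(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100

# HumanEval / HumanEval+ vs. the DeepSeek baseline row in the table above
print(f"{relative_improvement(75.0, 40.9):.2f}%")   # 83.37%
print(f"{relative_improvement(68.9, 34.1):.2f}%")   # 102.05%

# MATH / GSM8K
print(f"{relative_improvement(40.48, 39.0):.2f}%")  # 3.79%
print(f"{relative_improvement(81.5, 67.1):.2f}%")   # 21.46%
```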
## 2. Usage

### Code Completion
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Kwaipilot/KwaiCoder-DS-V2-Lite-Base"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True
)

# Complete code from a natural-language comment prompt.
text = "#write a quick sort algorithm"
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=80)

# Print only the newly generated continuation, not the prompt.
print(tokenizer.decode(outputs[0], skip_special_tokens=True)[len(text):])
```
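The snippet above uses greedy decoding by default. For more varied completions, sampling can be enabled through standard `transformers` generation arguments; the values below are illustrative, not official recommendations for this model:

```python
# Sampling-based generation (illustrative values, not an official
# recommendation for this model).
outputs = model.generate(
    **inputs,
    max_new_tokens=80,
    do_sample=True,
    temperature=0.2,
    top_p=0.95,
)
```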
### Code Insertion
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Kwaipilot/KwaiCoder-DS-V2-Lite-Base"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True
)

# Fill-in-the-middle (FIM) prompt: the model generates the code that belongs
# at <|fim▁hole|>, conditioned on the surrounding prefix and suffix.
text = """<|fim▁begin|>def find_longest_substring(s):
    seen = {}
    max_length = 0
    start = 0
<|fim▁hole|>
        if char in seen and seen[char] >= start:
            start = seen[char] + 1
        seen[char] = end
        max_length = max(max_length, end - start + 1)
    return max_length<|fim▁end|>"""

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=80)

# Print only the generated infill, not the prompt.
print(tokenizer.decode(outputs[0], skip_special_tokens=True)[len(text):])
```
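For the example above, the expected infill is the missing loop header, something like `for end, char in enumerate(s):`. The FIM prompt can also be built programmatically. A minimal sketch, assuming the three markers shown in the example (`<|fim▁begin|>`, `<|fim▁hole|>`, `<|fim▁end|>`) are the model's FIM tokens; `build_fim_prompt` is a hypothetical helper, not part of the model's API:

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Hypothetical helper: wrap a prefix/suffix pair in the FIM markers
    used in the example above; the model fills in the missing middle."""
    return f"<|fim▁begin|>{prefix}<|fim▁hole|>{suffix}<|fim▁end|>"

prefix = "def add(a, b):\n"
suffix = "\n    return result"
prompt = build_fim_prompt(prefix, suffix)
# Tokenize `prompt` and call model.generate exactly as in the snippet above.
```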
## 3. License
This code repository is licensed under the MIT License. The use of KwaiCoder-DS-V2-Lite-Base models is subject to the Model License.
## 4. BibTeX

```bibtex
@misc{kwaicoder,
  title  = {KwaiCoder: Comprehensive improvement of code and mathematical abilities},
  author = {Kwaipilot team},
  year   = {2024},
}
```