Model Card for CodeFuse-13B

Model Description

CodeFuse-13B is a 13-billion-parameter code generation model trained with the GPT-NeoX framework and capable of handling code sequences of up to 4096 characters. It was pretrained on 1000B tokens of code, Chinese, and English data covering more than 40 programming languages. To further improve the quality of the generated code, the model was then fine-tuned on the CodeFuse-Evol-instruction-66k dataset, which helps it produce more accurate, efficient, and compliant code. On the HumanEval evaluation set it reaches a Pass@1 of 37.1% (beam search decoding, beam size 3).

Code Community

Homepage: 🏡 https://github.com/codefuse-ai (Please give us your support with a Star🌟 + Fork🚀 + Watch👀)

  • If you wish to fine-tune the model yourself, you can visit ✨MFTCoder✨✨

  • If you wish to deploy the model yourself, you can visit ✨FasterTransformer4CodeFuse✨✨

  • If you wish to see a demo of the model, you can visit ✨CodeFuse Demo✨✨

Requirements

  • Python 3.8 or above.
  • PyTorch 1.12 or above, with a recommendation for 2.0 or above.
  • Transformers 4.24.0 or above.
  • CUDA 11.4 or above is recommended (relevant for GPU users and flash-attention users).
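
The version floors above can be checked before loading the model. The following is a minimal sketch that uses only the standard library; the names `MINIMUMS`, `parse_version`, and `check_requirements` are illustrative and not part of CodeFuse, and the CUDA recommendation is not checked here.

```python
import sys
from importlib import metadata

# Minimum versions copied from the requirements list above.
MINIMUMS = {"torch": (1, 12), "transformers": (4, 24)}


def parse_version(text):
    """Turn '2.0.1+cu118' into (2, 0, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements():
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8 or above is required")
    for name, minimum in MINIMUMS.items():
        try:
            raw = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if parse_version(raw) < minimum:
            wanted = ".".join(str(part) for part in minimum)
            problems.append(f"{name} {raw} is older than {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_requirements() or ["All listed requirements are satisfied"]:
        print(line)
```

An empty result only means the listed Python packages meet the stated minimums; it says nothing about the GPU driver or CUDA setup.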

Quickstart

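import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "CodeFuse-13B" is assumed to be a local directory containing the downloaded model files.
tokenizer = AutoTokenizer.from_pretrained("CodeFuse-13B")
# Load the weights in half precision; device_map="auto" places them on the available devices.
model = AutoModelForCausalLM.from_pretrained("CodeFuse-13B", torch_dtype=torch.float16, device_map="auto").eval()

# The prompt names the target language in a comment, then opens the function to complete.
input_ids = tokenizer.encode("# language: Python\ndef quick_sort(array):\n", return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=200)

print(tokenizer.decode(output_ids[0]))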
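
The HumanEval figure in the Model Description is reported with beam search and a beam size of 3, whereas the Quickstart call uses the default decoding settings. The sketch below shows one way to request beam search through the `num_beams` argument of `generate`. It is not the authors' evaluation script: the helper name `generate_with_beam_search` is hypothetical, and any other decoding settings behind the reported score are not stated in this card.

```python
def generate_with_beam_search(model, tokenizer, prompt, beam_size=3, max_new_tokens=200):
    """Complete `prompt` using beam search and return only the newly generated text."""
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        input_ids,
        num_beams=beam_size,   # number of beams kept at each decoding step
        do_sample=False,       # deterministic beam search, no sampling
        max_new_tokens=max_new_tokens,
    )
    # Drop the prompt tokens so that only the completion is decoded.
    new_tokens = output_ids[0][input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With the `model` and `tokenizer` objects from the Quickstart, a call such as `generate_with_beam_search(model, tokenizer, "# language: Python\ndef quick_sort(array):\n")` returns the completion as a string. Beam search keeps several candidate sequences in memory, so it is typically slower and more memory-hungry than the default call.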

MD5

Model files can be corrupted during transfer. Please verify the MD5 value of each file before use.

Model File                          MD5 Value
pytorch_model-00001-of-00006.bin    b79e4ccc93c40fa6113aaf6a434473d5
pytorch_model-00002-of-00006.bin    5a82f19e3f62c693e41fe627084c722b
pytorch_model-00003-of-00006.bin    d4b53c391a353d0fc0a1be1c913d5f04
pytorch_model-00004-of-00006.bin    f9e3dcdea13ff02f4e3aad4f9db7a33f
pytorch_model-00005-of-00006.bin    698a8f2f05723a572193733bce12eb93
pytorch_model-00006-of-00006.bin    312439d0b810f1bb81034fe094ff84c7
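
The check against the values above can be scripted. The following is a minimal sketch using Python's standard `hashlib` module; the function names and the assumption that the files sit in a local directory named `CodeFuse-13B` are illustrative only.

```python
import hashlib
from pathlib import Path

# Expected values copied from the table above.
EXPECTED_MD5 = {
    "pytorch_model-00001-of-00006.bin": "b79e4ccc93c40fa6113aaf6a434473d5",
    "pytorch_model-00002-of-00006.bin": "5a82f19e3f62c693e41fe627084c722b",
    "pytorch_model-00003-of-00006.bin": "d4b53c391a353d0fc0a1be1c913d5f04",
    "pytorch_model-00004-of-00006.bin": "f9e3dcdea13ff02f4e3aad4f9db7a33f",
    "pytorch_model-00005-of-00006.bin": "698a8f2f05723a572193733bce12eb93",
    "pytorch_model-00006-of-00006.bin": "312439d0b810f1bb81034fe094ff84c7",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash the file in 1 MiB chunks so large shards never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_dir(model_dir, expected=EXPECTED_MD5):
    """Return {filename: status}, where status is 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, want in expected.items():
        path = Path(model_dir) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == want:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumed location of the downloaded files; adjust to your own path.
    for name, status in verify_model_dir("CodeFuse-13B").items():
        print(f"{status:8} {name}")
```

Any shard reported as `mismatch` or `missing` should be downloaded again before the model is loaded.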

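Introduction

CodeFuse-13B is a 13B-parameter code generation model trained with the GPT-NeoX framework that can handle code sequences of 4096 characters. The model was pretrained on a dataset of 1000B tokens of code, Chinese, and English data covering more than 40 programming languages. To further improve the quality of the generated code, the model was also fine-tuned on the CodeFuse-Evol-instruction-66k dataset, so that it generates more accurate and efficient code that better meets requirements. On the HumanEval evaluation set, Pass@1 reaches 37.1% (beam search decoding, beam size 3).

Code Community

Homepage: 🏡 https://github.com/codefuse-ai (you are welcome to support our project with a Star🌟 + Fork🚀 + Watch👀)

Requirements

  • Python 3.8 or above
  • PyTorch 1.12 or above; 2.0 or above is recommended
  • Transformers 4.24.0 or above
  • CUDA 11.4 or above is recommended (GPU users and flash-attention users should consider this option).

Quick Start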

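import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "CodeFuse-13B" is assumed to be a local directory containing the downloaded model files.
tokenizer = AutoTokenizer.from_pretrained("CodeFuse-13B")
# Load the weights in half precision; device_map="auto" places them on the available devices.
model = AutoModelForCausalLM.from_pretrained("CodeFuse-13B", torch_dtype=torch.float16, device_map="auto").eval()

# The prompt names the target language in a comment, then opens the function to complete.
input_ids = tokenizer.encode("# language: Python\ndef quick_sort(array):\n", return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=200)

print(tokenizer.decode(output_ids[0]))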

MD5

We have found that model files can be corrupted during transfer. Please check the MD5 value of each file before use.

Model File                          MD5 Value
pytorch_model-00001-of-00006.bin    b79e4ccc93c40fa6113aaf6a434473d5
pytorch_model-00002-of-00006.bin    5a82f19e3f62c693e41fe627084c722b
pytorch_model-00003-of-00006.bin    d4b53c391a353d0fc0a1be1c913d5f04
pytorch_model-00004-of-00006.bin    f9e3dcdea13ff02f4e3aad4f9db7a33f
pytorch_model-00005-of-00006.bin    698a8f2f05723a572193733bce12eb93
pytorch_model-00006-of-00006.bin    312439d0b810f1bb81034fe094ff84c7