Model Card for Fine-tuned DeepSeek-R1-Distill-Llama-8B on the SWE-bench Dataset
Model Overview
This model has been fine-tuned on the princeton-nlp/SWE-bench_Lite
dataset to automate the generation of bug fixes in software engineering tasks. It leverages issue descriptions, code diffs, and historical bug context to generate precise patches. The primary use case is to assist developers by quickly generating code fixes based on detailed bug descriptions.
Key Features:
- Patch Generation: Produces code patches based on issue descriptions and optional context.
- Contextual Fixes: Uses code diffs, bug reports, and PR history for more accurate bug fixes.
- Assertion Support: Ensures generated patches adhere to specified conditions.
Intended Use
The model is designed for developers and software teams to automatically generate code patches for software issues. It can handle a variety of inputs such as issue descriptions and additional context and is ideal for teams dealing with frequent bug reports.
Inputs:
- Issue Description (`<issue>`): Main input, taken from `fix_issue_description`.
- Issue Story (`<fix_issue_story>`): Optional additional context, from `fix_story`.
- Assertions (`<assertions>`): Conditions that the patch must meet, from `fix_assertion_1`, `fix_assertion_2`, etc.
- Bug and PR Context (`<bug>`): Historical bug and PR context from fields like `bug_pr`, `bug_story`, etc., used during fine-tuning but not required for inference.
- File Paths and Code Differences (`<file>`, `<bug_code_diff>`, `<fix_code_diff>`): Paths and diffs used to generate a valid code patch.
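For illustration, these tagged fields might be assembled into a single prompt as follows. This is a minimal sketch: the exact tag order, closing-tag style, and whitespace used during fine-tuning are assumptions, and `build_prompt` is a hypothetical helper, not part of the model's API.

```python
def build_prompt(issue_description, issue_story=None, assertions=None):
    """Assemble a tagged prompt from the input fields described above.

    The tag layout is an assumption based on this card's input list,
    not the verified fine-tuning template.
    """
    parts = [f"<issue>\n{issue_description}\n</issue>"]
    if issue_story:
        parts.append(f"<fix_issue_story>\n{issue_story}\n</fix_issue_story>")
    if assertions:
        joined = "\n".join(assertions)
        parts.append(f"<assertions>\n{joined}\n</assertions>")
    return "\n\n".join(parts)


prompt = build_prompt(
    "Function X throws an error when Y happens.",
    assertions=["The fix must not change the public API."],
)
```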
Outputs:
- Generated Code Patch: Based on input issue description and additional context.
- Assertions: Conditions to ensure that the patch meets the desired criteria.
Dataset
princeton-nlp/SWE-bench_Lite
This model is fine-tuned on the princeton-nlp/SWE-bench_Lite dataset. As prepared for fine-tuning, the dataset includes:
- `fix_issue_description` (Issue Description): Describes the bug in detail.
- `fix_story` (Issue Story): Provides additional narrative or context around the bug.
- `fix_assertion_1`, `fix_assertion_2`, etc. (Assertions): Related to the fix, ensuring certain conditions are met.
- `bug_pr`, `bug_story` (Bug and PR Context): Provide historical context on bugs, used in fine-tuning.
- `<file>`, `<bug_code_diff>`, `<fix_code_diff>`: File paths and code diffs used to train the model to generate fixes.
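For reference, the public base dataset can be inspected with the `datasets` library. A minimal sketch, assuming the `fix_*`/`bug_*` fields above were derived from the raw release during preprocessing (the raw dataset exposes fields such as `problem_statement` and `patch`):

```python
from datasets import load_dataset

# Load the public SWE-bench_Lite evaluation split
ds = load_dataset("princeton-nlp/SWE-bench_Lite", split="test")

example = ds[0]
print(example["problem_statement"][:300])  # raw issue description
print(example["patch"][:300])              # gold fix diff
```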
Limitations
- The model relies on well-defined issue descriptions to produce accurate patches.
- Contextual data from past bug reports is leveraged during fine-tuning; the resulting behavior may not generalize to all types of bug fixes.
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer (the Auto* classes resolve the correct
# Llama-3-based tokenizer for this DeepSeek-R1 distill)
model = AutoModelForCausalLM.from_pretrained(
    "anant58/swe-model", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("anant58/swe-model")

# Example input (issue description)
issue_description = "Function X throws an error when Y happens."
inputs = tokenizer(issue_description, return_tensors="pt").to(model.device)

# Generate a patch
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
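If the model emits its fix as a unified diff (an assumption based on the `<fix_code_diff>` training field), the generation can be validated and applied with standard git tooling:

```python
import subprocess
import tempfile

patch_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Write the generated diff to a temp file and dry-run it first;
# `git apply --check` fails fast if the patch does not apply cleanly.
with tempfile.NamedTemporaryFile("w", suffix=".patch", delete=False) as f:
    f.write(patch_text)
    patch_path = f.name

subprocess.run(["git", "apply", "--check", patch_path], check=True)
subprocess.run(["git", "apply", patch_path], check=True)
```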
Ethical Considerations
- Generated patches should always be validated and reviewed by developers before being deployed to production.
- This model is designed to assist but should not replace thorough code reviews.
Base Model
This model is fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Llama-8B.