---
license: apache-2.0
datasets:
- lambdasec/cve-single-line-fixes
- lambdasec/gh-top-1000-projects-vulns
language:
- code
tags:
- code
programming_language:
- Java
- JavaScript
- Python
inference: false
model-index:
- name: SantaFixer
  results:
  - task:
      type: text-generation
    dataset:
      type: openai/human-eval-infilling
      name: HumanEval
    metrics:
    - name: single-line infilling pass@1
      type: pass@1
      value: 0.28
      verified: false
    - name: single-line infilling pass@10
      type: pass@10
      value: 0.28
      verified: false
  - task:
      type: text-generation
    dataset:
      type: lambdasec/gh-top-1000-projects-vulns
      name: GH Top 1000 Projects Vulnerabilities
    metrics:
    - name: pass@10 (Java)
      type: pass@10
      value: 0.1
      verified: false
    - name: pass@10 (Python)
      type: pass@10
      value: 0.2
      verified: false
    - name: pass@10 (JavaScript)
      type: pass@10
      value: 0.3
      verified: false
---

# Model Card for SantaFixer

This is an LLM for code that focuses on generating bug fixes through infilling.

## Model Details

### Model Description

- **Developed by:** [codelion](https://huggingface.co/codelion)
- **Model type:** GPT-2
- **Finetuned from model:** [bigcode/santacoder](https://huggingface.co/bigcode/santacoder)

## Uses

### Direct Use

[More Information Needed]

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

## Training Details

- **GPU:** Tesla P100
- **Time:** ~5 hrs

### Training Data

The model was fine-tuned on the [CVE single line fixes dataset](https://huggingface.co/datasets/lambdasec/cve-single-line-fixes).

### Training Procedure

Supervised fine-tuning (SFT)

#### Training Hyperparameters

- **optim:** adafactor
- **gradient_accumulation_steps:** 4
- **gradient_checkpointing:** true
- **fp16:** false

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

### Results

[More Information Needed]

#### Summary

[More Information Needed]
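
## Example: Single-Line Infilling (Sketch)

The card describes the model as producing bug fixes through infilling but gives no usage code. The following is a minimal sketch of how that might look, not the author's documented method. It makes two assumptions: the repository id `lambdasec/santafixer` is a guess, since the card does not state it, and the fill-in-the-middle sentinels are taken to be the ones used by the base `bigcode/santacoder` tokenizer, unchanged by fine-tuning. The helper names are hypothetical.

```python
# ASSUMPTION: repository id is not stated in the card; replace with the real one.
MODEL_ID = "lambdasec/santafixer"

# ASSUMPTION: sentinels inherited from the bigcode/santacoder tokenizer.
FIM_PREFIX = "<fim-prefix>"
FIM_SUFFIX = "<fim-suffix>"
FIM_MIDDLE = "<fim-middle>"
END_OF_TEXT = "<|endoftext|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Place the code before and after the buggy line around the sentinels."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def extract_infill(decoded: str) -> str:
    """Keep only the text after the middle sentinel, cut at end-of-text."""
    middle = decoded.split(FIM_MIDDLE, 1)[-1]
    return middle.split(END_OF_TEXT, 1)[0]


def generate_fix(prefix: str, suffix: str, max_new_tokens: int = 48) -> str:
    """Ask the model for a replacement line; downloads weights on first call."""
    # Imported lazily so the prompt helpers work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # The santacoder base model ships custom modeling code, hence trust_remote_code.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    input_ids = tokenizer.encode(build_fim_prompt(prefix, suffix), return_tensors="pt")
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Keep special tokens so the middle sentinel can be located in the output.
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=False)
    return extract_infill(decoded)
```

To try it, pass the source text preceding the vulnerable line as `prefix` and the text following it as `suffix`, then call `generate_fix(prefix, suffix)`. Because the reported metrics are pass@10, sampling several candidates and checking each against tests is likely more representative than relying on a single greedy completion.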