license: mit
library_name: transformers
tags:
- code
CodeBERTa-ft-coco-1e-05lr
Model for the paper "A Transformer-Based Approach for Smart Invocation of Automatic Code Completion".
Description
This model is fine-tuned on a code-completion dataset collected from the open-source Code4Me plugin. The goal is a small, lightweight transformer that filters out unnecessary and unhelpful code completions. To this end, we leverage in-IDE telemetry data and integrate it with the textual code data in the transformer's attention module.
- Developed by: AISE Lab @ SERG, Delft University of Technology
- Model type: RoBERTa
- Language: Code
- Finetuned from model: CodeBERTa-small-v1
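A minimal sketch of how a fine-tuned filter of this kind might be queried with the transformers library. It rests on assumptions this card does not confirm: that the checkpoint loads as a standard sequence-classification model, that the Hub repository id passed in as repo_id is the correct one (no id is given here, so it stays a parameter), and that raw code context is a suitable input. The paper's exact input formatting and the meaning of each output class are not stated on this card, so the function returns unlabeled class probabilities only.

```python
def completion_filter_probs(code_context: str, repo_id: str) -> list[float]:
    """Return class probabilities for one code context.

    Assumption: the checkpoint at `repo_id` (hypothetical, supplied by the
    caller) is a standard sequence-classification model. Which class index
    means "show the completion" is not stated on this card.
    """
    # Imported lazily so the function can be defined without the heavy
    # dependencies being installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSequenceClassification.from_pretrained(repo_id)
    model.eval()  # disable dropout for inference

    # Truncate to the tokenizer's maximum length.
    inputs = tokenizer(code_context, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits

    # Softmax over the class dimension; take the single example in the batch.
    return torch.softmax(logits, dim=-1)[0].tolist()
```

The JonBERTa variants add telemetry features inside the model, so they likely need the custom model classes from the replication repository rather than this generic loader.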
Models are named as follows:
- CodeBERTa → CodeBERTa-ft-coco-[1,2,5]e-05lr
  - e.g. CodeBERTa-ft-coco-2e-05lr, which was trained with a learning rate of 2e-05.
- JonBERTa-head → JonBERTa-head-ft-(dense-proj-reinit)
  - e.g. JonBERTa-head-ft-(dense-proj-), where all have a 2e-05 learning rate, but may differ in the head layer in which the telemetry features are introduced (either head or proj).
- JonBERTa-attn → JonBERTa-attn-ft-(0,1,2,3,4,5L)
  - e.g. JonBERTa-attn-ft-(0,1,2L), where all have a 2e-05 learning rate, but may differ in the attention layer in which the telemetry features are introduced (either 0, 1, 2, 3, 4, or 5L).
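The CodeBERTa part of the naming scheme above can be expanded and parsed mechanically. The sketch below is illustrative only; the helper names learning_rate_from_name and codeberta_names are hypothetical and do not come from the replication package, and the JonBERTa patterns are left out because their suffixes do not encode a learning rate.

```python
import re

# CodeBERTa-ft-coco-<N>e-05lr, with N in {1, 2, 5}, per the naming scheme.
_CODEBERTA_NAME = re.compile(r"^CodeBERTa-ft-coco-(?P<lr>[125]e-05)lr$")


def learning_rate_from_name(model_name: str) -> float:
    """Return the learning rate encoded in a CodeBERTa-ft-coco model name."""
    match = _CODEBERTA_NAME.match(model_name)
    if match is None:
        raise ValueError(f"not a CodeBERTa-ft-coco name: {model_name!r}")
    return float(match.group("lr"))


def codeberta_names() -> list[str]:
    """Expand CodeBERTa-ft-coco-[1,2,5]e-05lr into its three concrete names."""
    return [f"CodeBERTa-ft-coco-{n}e-05lr" for n in (1, 2, 5)]
```

For this card's model, learning_rate_from_name("CodeBERTa-ft-coco-1e-05lr") yields 1e-05.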
Other hyperparameters may be found in the paper or the replication package (see below).
Sources
- Replication Repository: Ar4l/curating-code-completions
- Paper: "A Transformer-Based Approach for Smart Invocation of Automatic Code Completion"
- Contact: https://huggingface.co/Ar4l
To cite, please use
@misc{de_moor_smart_invocation_2024,
title = {A {Transformer}-{Based} {Approach} for {Smart} {Invocation} of {Automatic} {Code} {Completion}},
url = {http://arxiv.org/abs/2405.14753},
doi = {10.1145/3664646.3664760},
author = {de Moor, Aral and van Deursen, Arie and Izadi, Maliheh},
month = may,
year = {2024},
}
Training Details
This model was trained with the following hyperparameters; everything else uses the TrainingArguments defaults. The dataset was prepared identically across all models, as detailed in the paper.
num_train_epochs : int = 6
learning_rate : float = search([2e-5, 1e-5, 5e-5])
batch_size : int = 16
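As a rough sketch, the hyperparameters above might map onto transformers.TrainingArguments as follows. Two points are assumptions rather than facts from this card: batch_size is taken to mean the per-device training batch size, and the output directory is a hypothetical placeholder. The helper names are also illustrative; the replication package is the authoritative source for the actual training setup.

```python
# Learning rates searched over, as listed on this card.
LEARNING_RATES = [2e-5, 1e-5, 5e-5]


def training_kwargs(learning_rate: float, output_dir: str) -> dict:
    """Keyword arguments for one training run; all else stays at defaults."""
    if learning_rate not in LEARNING_RATES:
        raise ValueError(f"learning rate {learning_rate} was not in the search")
    return {
        "output_dir": output_dir,  # hypothetical path chosen by the caller
        "num_train_epochs": 6,
        "learning_rate": learning_rate,
        # Assumption: the card's batch_size is the per-device train batch size.
        "per_device_train_batch_size": 16,
    }


def build_training_arguments(learning_rate: float, output_dir: str):
    """Construct TrainingArguments from the values above."""
    # Imported lazily so training_kwargs works without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(**training_kwargs(learning_rate, output_dir))
```

For this card's model, the matching call would use a learning rate of 1e-5, for example build_training_arguments(1e-5, "out/CodeBERTa-ft-coco-1e-05lr").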