	---
	license: mit
	library_name: transformers
	tags:
	- code
	---

	## CodeBERTa-ft-coco-2e-05lr

	Model for the paper ["A Transformer-Based Approach for Smart Invocation of Automatic Code Completion"](https://arxiv.org/abs/2405.14753).

	#### Description
	This model is fine-tuned on a code-completion dataset collected from the open-source [Code4Me](https://github.com/code4me-me/code4me) plugin. The goal is a small, lightweight transformer model that filters out unnecessary and unhelpful code completions. To this end, we leverage in-IDE telemetry data and integrate it with the textual code data in the transformer's attention module.

	- Developed by: [AISE Lab](https://www.linkedin.com/company/aise-tudelft/) @ [SERG](https://se.ewi.tudelft.nl/), Delft University of Technology
	- Model type: [RoBERTa](https://huggingface.co/FacebookAI/roberta-base)
	- Language: Code
	- Fine-tuned from model: [`CodeBERTa-small-v1`](https://huggingface.co/huggingface/CodeBERTa-small-v1)
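
Since the model is a RoBERTa-style text classifier, scoring a snippet could look roughly like the sketch below. It assumes the checkpoint loads through the generic `AutoModelForSequenceClassification` class and that the repository ships a tokenizer; neither is confirmed here. The helper name `class_probabilities` is hypothetical, and the way the paper formats the code context, as well as the meaning of each label index, should be taken from the replication package rather than from this example.

```python
from typing import List

MODEL_ID = "AISE-TUDelft/CodeBERTa-ft-coco-2e-05lr"


def class_probabilities(code_context: str, model_id: str = MODEL_ID) -> List[float]:
    """Return the classifier's softmax probabilities for one code snippet."""
    # Imported lazily so the heavy dependencies are only needed when scoring.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Assumption: the repository contains a tokenizer and a classification head.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Assumption: plain truncation of the raw text; the paper's exact
    # context formatting may differ.
    inputs = tokenizer(code_context, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0].tolist()
```

Calling `class_probabilities("def add(a, b):\n    return a +")` downloads the checkpoint on first use and returns one probability per label.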

	Models are named as follows:

	- `CodeBERTa` → `CodeBERTa-ft-coco-[1,2,5]e-05lr`
	  - e.g. `CodeBERTa-ft-coco-2e-05lr`, which was trained with a learning rate of `2e-05`.
	- `JonBERTa-head` → `JonBERTa-head-ft-(dense-proj-reinit)`
	  - e.g. `JonBERTa-head-ft-(dense-proj-)`, where all use a `2e-05` learning rate but may differ in the head layer in which the telemetry features are introduced (either `head` or `proj`).
	- `JonBERTa-attn` → `JonBERTa-attn-ft-(0,1,2,3,4,5L)`
	  - e.g. `JonBERTa-attn-ft-(0,1,2L)`, where all use a `2e-05` learning rate but may differ in the attention layer in which the telemetry features are introduced (either `0`, `1`, `2`, `3`, `4`, or `5L`).

	Other hyperparameters may be found in the paper or the replication package (see below).
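
As a small illustration of the `CodeBERTa` naming pattern, the sketch below builds the three checkpoint names from their learning rates. The helper `codeberta_checkpoint` is hypothetical. Only the `2e-05` checkpoint is confirmed by this card; that the other two are hosted under the same `AISE-TUDelft` organisation is an assumption.

```python
ORG = "AISE-TUDelft"  # assumption: sibling checkpoints share this organisation
LEARNING_RATES = (1e-5, 2e-5, 5e-5)


def codeberta_checkpoint(learning_rate: float) -> str:
    """Build the CodeBERTa checkpoint name for a given learning rate."""
    # format(2e-5, ".0e") yields "2e-05", matching the suffix used in the names.
    return f"CodeBERTa-ft-coco-{learning_rate:.0e}lr"


checkpoints = [f"{ORG}/{codeberta_checkpoint(lr)}" for lr in LEARNING_RATES]
print(checkpoints)
```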

	#### Sources

	- Replication Repository: [`Ar4l/curating-code-completions`](https://github.com/Ar4l/curating-code-completions/tree/main)
	- Paper: ["A Transformer-Based Approach for Smart Invocation of Automatic Code Completion"](https://arxiv.org/abs/2405.14753)
	- Contact: https://huggingface.co/Ar4l

	To cite this work, please use:

	```bibtex
	@misc{de_moor_smart_invocation_2024,
	title = {A {Transformer}-{Based} {Approach} for {Smart} {Invocation} of {Automatic} {Code} {Completion}},
	url = {http://arxiv.org/abs/2405.14753},
	doi = {10.1145/3664646.3664760},
	author = {de Moor, Aral and van Deursen, Arie and Izadi, Maliheh},
	month = may,
	year = {2024},
	}
	```

	#### Training Details
	This model was trained with the hyperparameters below; all other settings use the `TrainingArguments` defaults. The dataset was prepared identically across all models, as detailed in the paper.

	```python
	num_train_epochs: int = 6
	# Learning rates searched across runs; this checkpoint uses 2e-5.
	learning_rate_grid: list[float] = [2e-5, 1e-5, 5e-5]
	learning_rate: float = 2e-5
	batch_size: int = 16
	```
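
To make the learning-rate search concrete, the sketch below expands the settings above into one set of keyword arguments per run. It is an illustration, not the replication package's code: the helper `training_kwargs` and the `runs/...` output paths are hypothetical, and mapping `batch_size` to `per_device_train_batch_size` is an assumption.

```python
from typing import Any, Dict, List

LEARNING_RATES: List[float] = [2e-5, 1e-5, 5e-5]


def training_kwargs(learning_rate: float, output_dir: str) -> Dict[str, Any]:
    """Keyword arguments for one fine-tuning run; other settings keep their defaults."""
    return {
        "output_dir": output_dir,  # hypothetical path, not from the model card
        "num_train_epochs": 6,
        "learning_rate": learning_rate,
        # Assumption: the card's `batch_size` is the per-device training batch size.
        "per_device_train_batch_size": 16,
    }


runs = [training_kwargs(lr, f"runs/lr-{lr:.0e}") for lr in LEARNING_RATES]
for run in runs:
    print(run)
```

Each dictionary can be passed on as `TrainingArguments(**run)`; this checkpoint corresponds to the run with `learning_rate=2e-5`.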