File size: 1,007 Bytes
b459d0c |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
This model is funetune version of Codebert in robert. On CodeSearchNet.
###
Quick start:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("addy88/programming-lang-identifier")
model = AutoModelForSequenceClassification.from_pretrained("addy88/programming-lang-identifier")
input_ids = tokenizer.encode(CODE_TO_IDENTIFY)
logits = model(input_ids)[0]
language_idx = logits.argmax() # index for the resulting label
###
CodeSearchNet citation
Details
@article{husain_codesearchnet_2019,
title = {{CodeSearchNet} {Challenge}: {Evaluating} the {State} of {Semantic} {Code} {Search}},
shorttitle = {{CodeSearchNet} {Challenge}},
url = {http://arxiv.org/abs/1909.09436},
urldate = {2020-03-12},
journal = {arXiv:1909.09436 [cs, stat]},
author = {Husain, Hamel and Wu, Ho-Hsiang and Gazit, Tiferet and Allamanis, Miltiadis and Brockschmidt, Marc},
month = sep,
year = {2019},
note = {arXiv: 1909.09436},
} |