flax-community/gpt-neo-125M-code-search-py


ncoop57 committed on Jul 22, 2021

Commit c24a503 · 1 Parent(s): a5538ef

Create README.md

Files changed (1)

README.md +55 -0

README.md ADDED

	@@ -0,0 +1,55 @@

+# GPT-Code-Clippy-125M-Code-Search-Py
+## Model Description
+GPT-CC-125M-Code-Search is a [GPT-Neo-125M model](https://huggingface.co/EleutherAI/gpt-neo-125M) finetuned using causal language modeling on only the Python portion of the [CodeSearchNet Challenge dataset](https://huggingface.co/datasets/code_search_net). This model is specialized to autocomplete methods in the Python language.
+## Training data
+[CodeSearchNet Challenge dataset](https://huggingface.co/datasets/code_search_net).
+## Training procedure
+The training script used to train this model can be found [here](https://github.com/ncoop57/gpt-code-clippy/blob/camera-ready/training/run_clm_apps.py).
+## Intended Use and Limitations
+The model is finetuned on methods from the Python language and is intended to autocomplete Python methods given a prompt (method signature and docstring).
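As a minimal sketch of the prompt format described above (a method signature followed by a docstring), the helper below assembles such a prompt. The name `build_prompt` and its `indent` parameter are hypothetical and are not part of this commit or of the `transformers` library.

```python
def build_prompt(signature: str, docstring: str, indent: str = "    ") -> str:
    """Combine a method signature and a docstring into a completion prompt.

    Hypothetical helper for illustration only.
    """
    signature = signature.rstrip()
    # Make sure the signature ends with the colon that opens the method body.
    if not signature.endswith(":"):
        signature += ":"
    # Put the docstring on the first line of the body and end with a newline,
    # so the model continues with the implementation on the next line.
    return f'{signature}\n{indent}"""{docstring.strip()}"""\n'


prompt = build_prompt("def greet(name)", "Greet the user by name.")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt.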
+### How to use
+You can use this model for text generation by loading it with `AutoModelForCausalLM` and calling `generate`. Because sampling is enabled (`do_sample=True`), this example generates a different sequence each time it's run:
+```py
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Use a GPU when one is available, otherwise fall back to CPU.
+device = "cuda" if torch.cuda.is_available() else "cpu"
+model = AutoModelForCausalLM.from_pretrained("flax-community/gpt-neo-125M-code-clippy-code-search-py").to(device)
+tokenizer = AutoTokenizer.from_pretrained("flax-community/gpt-neo-125M-code-clippy-code-search-py")
+prompt = """def greet(name):
+  '''A function to greet user. Given a user name it should say hello'''
+"""
+input_ids = tokenizer(prompt, return_tensors='pt').input_ids.to(device)
+# Number of prompt tokens, used to print only the newly generated text.
+start = input_ids.size(1)
+out = model.generate(input_ids, do_sample=True, max_length=50, num_beams=2,
+                     early_stopping=True, eos_token_id=tokenizer.eos_token_id)
+print(tokenizer.decode(out[0][start:]))
+```
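The decoded continuation may run past the end of the method being completed. One possible post-processing step, sketched here under the assumption that the method body is indented and that any non-indented line marks the start of unrelated code, is to cut the completion at the first such line. The helper `truncate_to_method_body` is hypothetical and is not part of this commit.

```python
def truncate_to_method_body(completion: str) -> str:
    """Keep lines up to the first non-blank line that is not indented.

    Hypothetical helper for illustration only.
    """
    kept = []
    for line in completion.splitlines():
        # A non-blank line starting at column 0 means the body has ended.
        if line.strip() and not line[0].isspace():
            break
        kept.append(line)
    # Drop trailing blank lines and whitespace left after the body.
    return "\n".join(kept).rstrip()


sample = "  print('Hello, ' + name)\n\ndef other():\n  pass"
print(truncate_to_method_body(sample))
```

This heuristic is deliberately simple: it would also cut a body that contains a non-indented line, such as a continuation of a multi-line string.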
+### Limitations and Biases
+The model is intended to be used for research purposes and comes with no guarantees of quality of generated code.
+GPT-CC is finetuned from GPT-Neo and might have inherited biases and limitations from it. See [GPT-Neo model card](https://huggingface.co/EleutherAI/gpt-neo-125M#limitations-and-biases) for details.
+## Eval results
+Coming soon...