	---
	pipeline_tag: text-generation
	inference: true
	widget:
	- text: module display_hello_word
	  example_title: Hello world
	  group: Verilog
	license: bigcode-openrail-m
	datasets:
	- shailja/Verilog_GitHub
	library_name: transformers
	tags:
	- code
	model-index:
	- name: VeriGen
	  results:
	  - task:
	      type: text-generation
	    dataset:
	      type: openai_humaneval
	      name: VeriEval (Prompted)
	    metrics:
	    - name: pass@1
	      type: pass@1
	      value:
	      verified: false
	extra_gated_prompt: >-
	  ## Model License Agreement

	  Please read the BigCode [OpenRAIL-M
	  license](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement)
	  agreement before accepting it.

	extra_gated_fields:
	  I accept the above license agreement, and will use the Model complying with the set of use restrictions and sharing requirements: checkbox
	---


	# VeriGen


	## Table of Contents

	1. [Model Summary](#model-summary)
	2. [Use](#use)
	3. [Limitations](#limitations)
	4. [Training](#training)
	5. [License](#license)
	6. [Citation](#citation)

	## Model Summary

	VeriGen is a 6B-parameter model: a fine-tuned version of [CodeGen-multi-6B](https://github.com/salesforce/codegen) trained on a [Verilog code dataset](https://huggingface.co/datasets/shailja/Verilog_GitHub).

	- Repository: [shailja-thakur/VGen](https://github.com/shailja-thakur/VGen)
	- Baseline LLM: [SalesForce/CodeGen](https://github.com/salesforce/CodeGen)
	- Paper: [Benchmarking Large Language Models for Automated Verilog RTL Code Generation](https://arxiv.org/abs/2212.11140)
	- Point of Contact: [contact@shailja](mailto:[email protected])
	- Languages: Verilog (Hardware Description Language)


	## Use

	### Intended use

	The model was trained on Verilog from GitHub and textbooks. As such, it is _not_ an instruction model, and commands like "Write a module that implements a 2-to-1 Mux." do not work well on their own. However, adding a partial module header such as "module mux" after the text in the prompt turns it into a capable Verilog teaching assistant.
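
As a minimal sketch of the prompting pattern described above, the helper below turns a description into Verilog line comments and ends the prompt with a partial module header. The name `build_prompt` and the exact comment layout are illustrative assumptions, not part of the VeriGen repository.

```python
def build_prompt(description: str, module_header: str) -> str:
    """Combine a natural-language description with a partial module header.

    Hypothetical helper: the description is emitted as Verilog line comments,
    and the prompt ends with the unfinished header so the model continues it.
    """
    comment_lines = [
        f"// {line.strip()}" for line in description.splitlines() if line.strip()
    ]
    return "\n".join(comment_lines + [module_header.strip()])


prompt = build_prompt(
    "Write a module that implements a 2-to-1 Mux.",
    "module mux",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a bare instruction.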

	Feel free to share your generations in the Community tab!

	### Generation
	```python
	# pip install -q transformers torch
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM

	# Prompt: a comment describing the module to generate
	prompt = "//module half adder "
	# Use the GPU when one is available
	device = "cuda" if torch.cuda.is_available() else "cpu"

	# Load model and tokenizer
	model_name = "shailja/fine-tuned-codegen-6B-Verilog"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

	# Sample (do_sample=True is required for temperature and top_p to take effect)
	input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
	sample = model.generate(input_ids, max_length=128, do_sample=True, temperature=0.5, top_p=0.9)

	# Cut the output before the first "endmodule", then append it to close the module
	print(tokenizer.decode(sample[0], truncate_before_pattern=[r"endmodule"]) + "endmodule")
	```
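
The snippet above relies on `truncate_before_pattern` and then appends `endmodule` unconditionally. If you prefer to post-process the decoded string yourself, a plain-Python sketch of the same idea might look like the following; `truncate_at_endmodule` is a hypothetical helper, and unlike the snippet above it leaves the text unchanged when no `endmodule` was generated.

```python
import re


def truncate_at_endmodule(text: str) -> str:
    """Keep everything up to and including the first `endmodule` keyword."""
    match = re.search(r"\bendmodule\b", text)
    if match is None:
        # No terminator was generated (for example, max_length was reached),
        # so return the text unchanged instead of inventing an ending.
        return text
    return text[: match.end()]


generated = (
    "module half_adder(input a, b, output s, c);\n"
    "  assign s = a ^ b;\n"
    "  assign c = a & b;\n"
    "endmodule\n"
    "module extra();"
)
print(truncate_at_endmodule(generated))
```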


	### Attribution & Other Requirements

	The pretraining dataset of the model was not filtered for permissive licenses only, and the model can generate source code verbatim from the dataset. The code's license might require attribution and/or other specific requirements that must be respected.

	## Limitations

	The model has been trained on Verilog source code from open sources. The predominant natural language in the source code is English, although other languages are also present. As such, the model is capable of generating Verilog snippets given some context, but the generated code is not guaranteed to work as intended. It can be inefficient and may contain bugs or exploits. See [the paper](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) for an in-depth discussion of the model limitations.
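
Because generated code is not guaranteed to work, it can help to at least compile it before use. The sketch below assumes Icarus Verilog (`iverilog`) is installed and uses its `null` target to check syntax without producing output; `check_syntax` is a hypothetical helper, and a successful compile says nothing about functional correctness.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Optional


def check_syntax(verilog_code: str, compiler: str = "iverilog") -> Optional[bool]:
    """Return True/False for compile success, or None if the compiler is missing."""
    if shutil.which(compiler) is None:
        return None
    with tempfile.TemporaryDirectory() as tmp_dir:
        source = Path(tmp_dir) / "generated.v"
        source.write_text(verilog_code)
        # `-t null` asks Icarus Verilog to parse and elaborate without emitting code.
        result = subprocess.run(
            [compiler, "-t", "null", str(source)],
            capture_output=True,
            text=True,
        )
    return result.returncode == 0


status = check_syntax("module m(input a, output b);\n  assign b = a;\nendmodule\n")
print({True: "compiles", False: "does not compile", None: "iverilog not found"}[status])
```

Passing a testbench through a simulator is still needed to confirm that the module behaves as intended.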

	## Training

	### Model

	- Architecture: GPT-2 model with multi-query attention
	- Pretraining steps: 150k
	- Pretraining tokens: ~72B
	- Precision: fp16

	### Hardware

	- GPUs: 4 Tesla A100
	- Training time: 10 days
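
For a rough sense of the memory needed just to hold the weights, the arithmetic below assumes the nominal 6B parameter count and 2 bytes per fp16 value. It is a back-of-the-envelope sketch that ignores activations, optimizer state, and framework overhead; `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Estimate memory for model weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Assumption: 6e9 parameters, taken from the "6B" in the model name.
fp16_gb = weight_memory_gb(6e9)                         # 2 bytes per value
fp32_gb = weight_memory_gb(6e9, bytes_per_parameter=4)  # 4 bytes per value
print(f"fp16: ~{fp16_gb:.0f} GB, fp32: ~{fp32_gb:.0f} GB")
```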


	## License

	The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can find the full agreement [here](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement).

	## Citation

	```
	@misc{https://doi.org/10.48550/arxiv.2212.11140,
	doi = {10.48550/ARXIV.2212.11140},
	url = {https://arxiv.org/abs/2212.11140},
	author = {Thakur, Shailja and Ahmad, Baleegh and Fan, Zhenxing and Pearce, Hammond and Tan, Benjamin and Karri, Ramesh and Dolan-Gavitt, Brendan and Garg, Siddharth},
	title = {Benchmarking Large Language Models for Automated Verilog RTL Code Generation},
	publisher = {arXiv},
	year = {2022},
	copyright = {arXiv.org perpetual, non-exclusive license}
	}
	```