---
library_name: peft
base_model: TheBloke/Llama-2-7b-Chat-GPTQ
pipeline_tag: text-generation
inference: false
license: openrail
language:
- en
datasets:
- flytech/python-codes-25k
tags:
- text2code
- LoRA
- GPTQ
- Llama-2-7B-Chat
- text2python
- instruction2code
---
# Llama-2-7b-Chat-GPTQ fine-tuned on PYTHON-CODES-25K
Generate Python code that accomplishes the task instructed.
## LoRA Adapter Head
### Description
Parameter-Efficient Fine-Tuning (PEFT) of a 4-bit quantized Llama-2-7b-Chat model, TheBloke/Llama-2-7b-Chat-GPTQ, on the flytech/python-codes-25k dataset.
- **Language(s) (NLP):** English
- **License:** openrail
- **Quantization:** GPTQ 4-bit
- **PEFT:** LoRA
- **Finetuned from model:** [TheBloke/Llama-2-7b-Chat-GPTQ](https://huggingface.co/TheBloke/Llama-2-7B-Chat-GPTQ)
- **Dataset:** [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k)
## Intended uses & limitations
This model explores the efficacy of quantization combined with PEFT. It was implemented as a personal project.
### How to use
```
The quantized model was fine-tuned with PEFT, so this repository contains only the trained LoRA adapter.
Merging a LoRA adapter into a GPTQ-quantized model is not yet supported.
So instead of loading a single fine-tuned model, we need to load the base model and apply the fine-tuned adapter on top.
```
```python
# The natural-language task that the model should turn into Python code
instruction = "Help me set up my daily to-do list!"
```
```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

adapter_id = "SwastikM/Llama-2-7B-Chat-text2code"
config = PeftConfig.from_pretrained(adapter_id)  # adapter configuration

# Load the GPTQ base model onto the GPU, then attach the LoRA adapter on top
model = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7b-Chat-GPTQ", device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Tokenize the instruction and generate with greedy decoding
inputs = tokenizer(instruction, return_tensors="pt").input_ids.to("cuda")
outputs = model.generate(input_ids=inputs, max_new_tokens=500, do_sample=False, num_beams=1)
code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(code)
```
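Because `generate` on a causal language model returns the prompt tokens followed by the newly generated ones, the decoded string above starts with the instruction itself. Below is a minimal sketch of separating the two by token count, using plain Python lists in place of tensors. The helper name `split_prompt_and_completion` and the example token ids are hypothetical and only serve as illustration.

```python
def split_prompt_and_completion(output_ids, prompt_length):
    """Split a generated sequence into (prompt part, newly generated part)."""
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return output_ids[:prompt_length], output_ids[prompt_length:]


# Hypothetical token ids: the first three stand in for the tokenized prompt.
prompt_ids = [1, 15043, 3186]
output_ids = prompt_ids + [822, 1667, 2]

_, completion_ids = split_prompt_and_completion(output_ids, len(prompt_ids))
print(completion_ids)
```

With the tensors from the snippet above, the equivalent slice would be `outputs[0][inputs.shape[1]:]`, which can then be passed to `tokenizer.decode`.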
## Training Details
### Training Data
[flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k)
### Training Procedure
A custom training loop built with HuggingFace Accelerate.
#### Training Hyperparameters
- **Optimizer:** AdamW
- **lr:** 2e-5
- **decay:** linear
- **batch_size:** 4
- **gradient_accumulation_steps:** 8
- **global_step:** 625
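
The values above imply an effective batch size and a learning-rate schedule that can be worked out directly. The sketch below assumes that `global_step` counts optimizer updates (one per 8 accumulated micro-batches), that training ran on a single GPU, and that the linear decay runs from 2e-5 down to zero with no warmup. None of these assumptions is stated in this card, and the function name `linear_decay_lr` is illustrative.

```python
LR = 2e-5
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 8
GLOBAL_STEPS = 625

# Samples contributing to each optimizer update (single-GPU assumption)
effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS

# Total samples processed if global_step counts optimizer updates (assumption)
samples_seen = effective_batch_size * GLOBAL_STEPS


def linear_decay_lr(step, base_lr=LR, total_steps=GLOBAL_STEPS):
    """Learning rate at `step` under linear decay to zero, no warmup (assumed)."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining


print(effective_batch_size)  # 32
print(samples_seen)          # 20000
```

If `global_step` instead counted micro-batches, the total would be 625 × 4 = 2,500 samples, so the interpretation matters when comparing runs.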
#### Hardware
- **GPU:** P100
## Additional Information
- ***Github:*** [Repository]()
- ***Intro to quantization:*** [Blog](https://huggingface.co/blog/merve/quantization)
- ***Emergent Feature:*** [Academic](https://timdettmers.com/2022/08/17/llm-int8-and-emergent-features)
- ***GPTQ Paper:*** [GPTQ](https://arxiv.org/pdf/2210.17323)
- ***BITSANDBYTES and further:*** [LLM.int8()](https://arxiv.org/pdf/2208.07339)
## Acknowledgment
Thanks to [@Merve Noyan](https://huggingface.co/blog/merve/quantization) for the precise intro to quantization.
Thanks to the [@HuggingFace Team](https://colab.research.google.com/drive/1_TIrmuKOFhuRRiTWN94iLKUFu6ZX4ceb?usp=sharing#scrollTo=vT0XjNc2jYKy) for the coding guide on GPTQ.
## Model Card Authors
Swastik Maiti