|
--- |
|
library_name: peft |
|
base_model: mistralai/Mistral-7B-v0.1 |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
widget: |
|
- text: "How many helicopters can a human eat in one sitting?" |
|
tags: |
|
- Δ |
|
- LoRA |
|
--- |
|
|
|
<!-- |
|
# Model Card for Model ID |
|
--> |
|
|
|
## Model Details |
|
|
|
<!--![image/png](https://cdn-uploads.huggingface.co/production/uploads/648b0f4fd8fe693f51de98d2/aerBANxBtCya732NdBiw0.png)--> |
|
$$ |
|
W_{mistral} + LoRA_{zephyr} = W_{zephyr} \\ |
|
W_{zephyr} - LoRA_{zephyr} = W_{mistral} |
|
$$ |
|
|
|
<!-- |
|
$$ W_{mistral} + LoRA_{zephyr} = W_{zephyr} $$ |
|
``` |
|
typeof/zephyr-7b-beta-lora + mistralai/Mistral-7B-v0.1 |
|
= HuggingFaceH4/zephyr-7b-beta |
|
```` |
|
|
|
### Model Description |
|
|
|
- **Developed by:** [More Information Needed] |
|
- **Funded by [optional]:** [More Information Needed] |
|
- **Shared by [optional]:** [More Information Needed] |
|
- **Model type:** [More Information Needed] |
|
- **Language(s) (NLP):** [More Information Needed] |
|
- **License:** [More Information Needed] |
|
- **Finetuned from model [optional]:** [More Information Needed] |
|
|
|
|
|
### Model Sources [optional] |
|
|
|
- **Repository:** [More Information Needed] |
|
- **Paper [optional]:** [More Information Needed] |
|
- **Demo [optional]:** [More Information Needed] |
|
|
|
## Uses |
|
|
|
### Direct Use |
|
|
|
[More Information Needed] |
|
|
|
### Downstream Use [optional] |
|
|
|
[More Information Needed] |
|
|
|
### Out-of-Scope Use |
|
|
|
[More Information Needed] |
|
|
|
## Bias, Risks, and Limitations |
|
|
|
[More Information Needed] |
|
|
|
### Recommendations |
|
|
|
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. |
|
--> |
|
|
|
### Model Sources |
|
[HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) |
|
|
|
## How to Get Started with the Model |
|
|
|
Use the code below to get started with the model. |
|
|
|
```python |
|
# pip install transformers peft |
|
|
|
import torch |
|
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer |
|
|
|
model_id = "mistralai/Mistral-7B-v0.1" |
|
peft_model_id = "typeof/zephyr-7b-beta-lora" |
|
|
|
model = AutoModelForCausalLM.from_pretrained(model_id) |
|
model.load_adapter(peft_model_id) |
|
|
|
tokenizer_id = "HuggingFaceH4/zephyr-7b-beta" # for chat template etc... |
|
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id) |
|
|
|
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer) |
|
|
|
messages = [ |
|
{ |
|
"role": "system", |
|
"content": "You are a friendly chatbot who always responds in the style of a pirate", |
|
}, |
|
{"role": "user", "content": "How many helicopters can a human eat in one sitting?"}, |
|
] |
|
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) |
|
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95) |
|
print(outputs[0]["generated_text"]) |
|
``` |
|
``` |
|
<|system|> |
|
You are a friendly chatbot who always responds in the style of a pirate</s> |
|
<|user|> |
|
How many helicopters can a human eat in one sitting?</s> |
|
<|assistant|> |
|
Well, me matey, that’s a good question indeed! I’ve never seen |
|
a human eat a helicopter, and I don’t think many others have |
|
either. However, I’ve heard rumors that some people have |
|
eaten entire airplanes, so I suppose it’s not entirely unheard |
|
of. |
|
|
|
As for the number of helicopters one could eat, that depends |
|
on the size and weight of the helicopter. A small, lightweight |
|
helicopter would be easier to eat than a large, heavy one. |
|
In fact, I’ve heard that some people have eaten entire helicopters |
|
as part of a dare or a challenge. |
|
|
|
So, my advice to you, me hearty, is to steer clear of helicopters |
|
and stick to more traditional fare. Yarr!</s> |
|
``` |
|
<!-- |
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
|
|
[More Information Needed] |
|
|
|
### Training Procedure |
|
|
|
|
|
#### Preprocessing [optional] |
|
|
|
[More Information Needed] |
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
#### Speeds, Sizes, Times [optional] |
|
|
|
|
|
[More Information Needed] |
|
|
|
## Evaluation |
|
|
|
|
|
### Testing Data, Factors & Metrics |
|
|
|
#### Testing Data |
|
|
|
|
|
[More Information Needed] |
|
|
|
#### Factors |
|
|
|
|
|
[More Information Needed] |
|
|
|
#### Metrics |
|
|
|
|
|
[More Information Needed] |
|
|
|
### Results |
|
|
|
[More Information Needed] |
|
|
|
#### Summary |
|
|
|
## Model Examination [optional] |
|
|
|
[More Information Needed] |
|
|
|
## Technical Specifications [optional] |
|
|
|
### Model Architecture and Objective |
|
|
|
[More Information Needed] |
|
|
|
### Compute Infrastructure |
|
|
|
[More Information Needed] |
|
|
|
#### Hardware |
|
|
|
[More Information Needed] |
|
|
|
#### Software |
|
|
|
[More Information Needed] |
|
|
|
## Citation [optional] |
|
|
|
**BibTeX:** |
|
|
|
[More Information Needed] |
|
|
|
**APA:** |
|
|
|
[More Information Needed] |
|
|
|
## Glossary [optional] |
|
|
|
[More Information Needed] |
|
|
|
## More Information |
|
|
|
[More Information Needed] |
|
|
|
## Model Card Authors [optional] |
|
|
|
[More Information Needed] |
|
|
|
## Model Card Contact |
|
|
|
[More Information Needed] |
|
|
|
## Training procedure |
|
|
|
The following `bitsandbytes` quantization config was used during training: |
|
- quant_method: bitsandbytes |
|
- load_in_4bit: True |
|
- bnb_4bit_quant_type: nf4 |
|
- bnb_4bit_use_double_quant: True |
|
|
|
### Framework versions |
|
|
|
- PEFT 0.6.3.dev0 |
|
|
|
--> |
|
#### Summary |
|
|
|
[Zephyr-7B-β](https://arxiv.org/abs/2305.18290) is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) |
|
[Zephyr-7B technical report](https://arxiv.org/abs/2310.16944) |
|
|
|
[LoRA](https://arxiv.org/abs/2305.14314) |
|
[QLoRA](https://arxiv.org/abs/2106.09685) |