|
--- |
|
language: |
|
- en |
|
- es |
|
- ru |
|
- de |
|
- pl |
|
- th |
|
- vi |
|
- sv |
|
- bn |
|
- da |
|
- he |
|
- it |
|
- fa |
|
- sk |
|
- id |
|
- nb |
|
- el |
|
- nl |
|
- hu |
|
- eu |
|
- zh |
|
- eo |
|
- ja |
|
- ca |
|
- cs |
|
- bg |
|
- fi |
|
- pt |
|
- tr |
|
- ro |
|
- ar |
|
- uk |
|
- gl |
|
- fr |
|
- ko |
|
task_categories: |
|
- conversational |
|
license: llama2 |
|
datasets: |
|
- Photolens/oasst1-langchain-llama-2-formatted |
|
--- |
|
|
|
## Model Overview |
|
Model license: Llama-2<br>

This model is based on [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf), QLoRA-finetuned on the [Photolens/oasst1-langchain-llama-2-formatted](https://huggingface.co/datasets/Photolens/oasst1-langchain-llama-2-formatted) dataset.<br>
|
|
|
## Prompt Template: Llama-2 |
|
``` |
|
<s>[INST] Prompter Message [/INST] Assistant Message </s> |
|
``` |
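A minimal sketch of building a prompt in this format (the helper name `build_prompt` is illustrative, not part of the model's API):

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the Llama-2 [INST] template.

    The assistant's reply and the closing </s> are produced by the
    model during generation, so they are not included here.
    """
    return f"<s>[INST] {user_message} [/INST]"

prompt = build_prompt("What is LangChain?")
# → "<s>[INST] What is LangChain? [/INST]"
```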
|
|
|
## Intended Use |
|
The dataset used to finetune the base model is optimized for LangChain applications, so this model is intended for use as a LangChain LLM.
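One possible way to wire the model into LangChain is through a `transformers` text-generation pipeline wrapped in LangChain's `HuggingFacePipeline`. This is a hedged sketch, not the card's documented setup; it assumes the `transformers` and `langchain` packages are installed, and a GPU is strongly recommended for a 7B model.

```python
def load_langchain_llm(model_id: str = "Photolens/llama-2-7b-langchain-chat"):
    """Load the model as a LangChain-compatible LLM (illustrative sketch)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
    from langchain.llms import HuggingFacePipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    # Text-generation pipeline feeding LangChain's HuggingFacePipeline wrapper.
    pipe = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=256,
    )
    return HuggingFacePipeline(pipeline=pipe)
```

The returned object can then be used anywhere LangChain expects an LLM, e.g. inside a chain.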
|
|
|
## Training Details |
|
This model took `1:14:16` to train with QLoRA on a single `A100 40GB` GPU.<br>
|
- *epochs*: `1` |
|
- *train batch size*: `8` |
|
- *eval batch size*: `8` |
|
- *gradient accumulation steps*: `1` |
|
- *maximum gradient norm*: `0.3`
|
- *learning rate*: `2e-4` |
|
- *weight decay*: `0.001` |
|
- *optimizer*: `paged_adamw_32bit` |
|
- *learning rate schedule*: `cosine` |
|
- *warmup ratio (linear)*: `0.03` |
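The hyperparameters above can be collected as keyword arguments in the style of `transformers.TrainingArguments` (the key names follow the `transformers` convention; that this exact trainer was used is an assumption):

```python
# Training hyperparameters from the list above, keyed by their
# transformers.TrainingArguments names (assumed, for illustration).
training_args = {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 1,
    "max_grad_norm": 0.3,
    "learning_rate": 2e-4,
    "weight_decay": 0.001,
    "optim": "paged_adamw_32bit",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.03,
}
```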
|
|
|
## Models in this series |
|
| Model | Train time | Size (in params) | Base Model | |
|
|---|---|---|---|
|
| [llama-2-7b-langchain-chat](https://huggingface.co/Photolens/llama-2-7b-langchain-chat/) | 1:14:16 | 7 billion | [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf) | |
|
| [llama-2-13b-langchain-chat](https://huggingface.co/Photolens/llama-2-13b-langchain-chat/) | 2:50:27 | 13 billion | [TheBloke/Llama-2-13B-Chat-fp16](https://huggingface.co/TheBloke/Llama-2-13B-Chat-fp16) | |
|
| [Photolens/OpenOrcaxOpenChat-2-13b-langchain-chat](https://huggingface.co/Photolens/OpenOrcaxOpenChat-2-13b-langchain-chat/) | 2:56:54 | 13 billion | [Open-Orca/OpenOrcaxOpenChat-Preview2-13B](https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-13B) | |