---
language:
- eng
tags:
- llama-2
- sft
license: mit
---

![puffin](https://i.imgur.com/R2xTHMb.png)

## **Redmond-Puffin-13B-V1.3**

**The first commercially available language model released by Nous Research!**

Redmond-Puffin-13B is one of the world's first Llama-2-based fine-tuned language models, leveraging a hand-curated set of 3K high-quality examples, many of which take full advantage of the 4096-token context length of Llama 2. This model was fine-tuned by Nous Research, with LDJ leading the training and dataset curation, along with significant dataset formation contributions by J-Supha.

Special thanks to Redmond AI for sponsoring the compute.

Special thanks to Emozilla for assisting with training experiments and the many issues encountered during training.

Notable mentions for assisting with some of the training issues go to Caseus and Teknium.

## Model Training

Redmond-Puffin-13B-V1.3 is a new model trained for multiple epochs on a dataset of 3,000 carefully curated GPT-4 examples, most of which are long-context conversations between a real human and GPT-4.

Additional data came from carefully curated subsections of datasets such as CamelAI's Physics, Chemistry, Biology, and Math.

## Prompt Format

The model follows the Vicuna ShareGPT prompt format:

```
### human:

### gpt:
```
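
For reference, here is a minimal sketch of single-turn inference with this format using Hugging Face `transformers`. The repo id and generation settings shown are assumptions for illustration, not official usage guidance.

```python
# Minimal inference sketch for the prompt format above. The repo id and
# sampling settings are assumptions, not official guidance.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Redmond-Puffin-13B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a single-turn prompt; the model continues after "### gpt:".
prompt = "### human: Explain entropy in one paragraph.\n\n### gpt:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```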

## Improvements over previous version:

The original Puffin model was loved by many; however, it was quickly discovered to contain dataset errors in a significant number of the conversations.

The Puffin-V1.3 dataset resolves these issues, and the resulting fixed model has now fully finished training!

## Notable Features:

- The first Llama-2-based fine-tuned model released by Nous Research.

- Ability to recall information up to 2023 without internet access (ChatGPT's knowledge cutoff is in 2021).

- Pretrained on 2 trillion tokens of text (double the amount of most open LLMs).

- Pretrained with a context length of 4096 tokens, and fine-tuned on a significant amount of multi-turn conversations reaching that full token limit (see the token-budgeting sketch after this list).

- The first commercially available language model released by Nous Research.
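
Because the fine-tuning data includes multi-turn conversations that fill the full 4096-token window, it can help to budget tokens when assembling long conversations. Below is a minimal sketch; the `build_prompt` helper, its `reserve` parameter, and the repo id are illustrative assumptions rather than part of this release.

```python
# Hypothetical helper (not part of this release) that assembles a multi-turn
# conversation in the "### human:" / "### gpt:" format and checks that it
# fits within the 4096-token context window.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Redmond-Puffin-13B")  # assumed repo id

def build_prompt(turns, max_context=4096, reserve=512):
    """turns: list of (speaker, text) pairs, speaker being "human" or "gpt"."""
    prompt = "\n\n".join(f"### {speaker}: {text}" for speaker, text in turns)
    prompt += "\n\n### gpt:"  # cue the model to produce the next reply
    n_tokens = len(tokenizer(prompt)["input_ids"])
    if n_tokens > max_context - reserve:
        raise ValueError(f"Prompt is {n_tokens} tokens; leave headroom for the reply.")
    return prompt

prompt = build_prompt([
    ("human", "Summarize the Llama 2 paper."),
    ("gpt", "Llama 2 is a family of open models pretrained on 2T tokens..."),
    ("human", "How long is its context window?"),
])
```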

## Current Limitations

Some token mismatch problems and formatting issues have been identified; these may well affect the current output quality.

We plan to have these solved in an updated Puffin model in the very near future, please stay tuned!

## Future Plans

This is a relatively early build amongst the grand plans for the future of Puffin!

The token mismatch problems noted above will be addressed in Puffin V2, along with other improvements.

## How you can help!

In the near future we plan on leveraging the help of domain-specific expert volunteers to eliminate any mathematically or verifiably incorrect answers from our training curations.

If you have at least a bachelor's degree in mathematics, physics, biology, or chemistry and would like to volunteer even just 30 minutes of your expertise, please contact ldj on Discord!

## Benchmarks coming soon

Benchmarks coming soon!