---
license: apache-2.0
datasets:
- berkeley-nest/Nectar
language:
- en
library_name: transformers
tags:
- reward model
- RLHF
- RLAIF
quantized_by: bartowski
pipeline_tag: text-generation
lm_studio:
  param_count: 7b
  use_case: general
  release_date: 19-03-2024
  model_creator: Nexusflow
  prompt_template: OpenChat
  system_prompt: none
  base_model: mistral
  original_repo: Nexusflow/Starling-LM-7B-beta
---
## 💫 Community Model> Starling-LM-7B-beta by Nexusflow
*👾 [LM Studio](https://lmstudio.ai) Community models highlights program. Highlighting new & noteworthy models by the community. Join the conversation on [Discord](https://discord.gg/aPQfnNkxGC)*.
- **Model creator:** [Nexusflow](https://huggingface.co/Nexusflow)
- **Original model:** [Starling-LM-7B-beta](https://huggingface.co/Nexusflow/Starling-LM-7B-beta)
- **GGUF quantization:** provided by [bartowski](https://huggingface.co/bartowski) based on `llama.cpp` release [b2440](https://github.com/ggerganov/llama.cpp/releases/tag/b2440)
## Model Summary:
Starling-LM-7B-beta is the next iteration in the Starling series of models. It builds on the work of [berkeley-nest](https://huggingface.co/berkeley-nest), who released [Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha).
This model is finetuned from [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106), which is itself based on [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1).
Training used [Nexusflow/Starling-RM-34B](https://huggingface.co/Nexusflow/Starling-RM-34B) as the reward model and PPO, the policy optimization method from [Fine-Tuning Language Models from Human Preferences](https://arxiv.org/abs/1909.08593), yielding strong results for a 7B-parameter model.
## Prompt Template:
Choose the 'OpenChat' preset in LM Studio.
Under the hood, the model will see a prompt that's formatted like so:
```
GPT4 Correct User: {prompt}<|end_of_turn|>GPT4 Correct Assistant:
```
You can also choose the 'OpenChatCode' preset for a coding assistant.
With that preset, the model will see a prompt that's formatted like so:
```
Code User: {prompt}<|end_of_turn|>Code Assistant:
```
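If you build prompts yourself instead of relying on an LM Studio preset, the two templates above can be reproduced with a small helper. This is a minimal sketch, not part of LM Studio or the model repository: `build_prompt`, `ROLE_PREFIXES`, and the `mode` and `history` parameters are hypothetical names, and the way earlier turns are chained (each turn closed with `<|end_of_turn|>`) is an assumption extrapolated from the single-turn templates.

```python
from typing import Sequence, Tuple

END_OF_TURN = "<|end_of_turn|>"

# Role prefixes copied from the 'OpenChat' and 'OpenChatCode' templates above.
ROLE_PREFIXES = {
    "chat": ("GPT4 Correct User", "GPT4 Correct Assistant"),
    "code": ("Code User", "Code Assistant"),
}


def build_prompt(
    prompt: str,
    history: Sequence[Tuple[str, str]] = (),
    mode: str = "chat",
) -> str:
    """Format a prompt in the OpenChat style.

    `history` holds earlier (user, assistant) turns. Chaining them this way
    is an assumption based on the single-turn templates.
    """
    user, assistant = ROLE_PREFIXES[mode]
    parts = []
    for past_user, past_assistant in history:
        parts.append(f"{user}: {past_user}{END_OF_TURN}")
        parts.append(f"{assistant}: {past_assistant}{END_OF_TURN}")
    # The final user turn is followed by an open assistant turn for the model to complete.
    parts.append(f"{user}: {prompt}{END_OF_TURN}")
    parts.append(f"{assistant}:")
    return "".join(parts)


print(build_prompt("Why is the sky blue?"))
print(build_prompt("Implement quicksort using C++", mode="code"))
```

When calling the model outside LM Studio, it is usually sensible to treat `<|end_of_turn|>` as a stop sequence so that generation ends with the assistant's turn.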
## Use case and examples
Starling-LM-7B-beta is tuned for multi-turn chat. Its base model was designed as a generalist model that is also tuned for coding, so Starling-LM-7B-beta should work well for general tasks as well as coding.
### General knowledge:
GPT4 Correct User: Why is the sky blue?<|end_of_turn|>
GPT4 Correct Assistant: The sky appears blue due to a phenomenon called Rayleigh scattering, named after British scientist Lord Rayleigh who first studied it in detail. Here's a simplified explanation of why this occurs:
1. Sunlight is composed of various colors of light, each with different wavelengths. These colors include red, orange, yellow, green, blue, indigo, and violet, which together form the visible spectrum.
2. When sunlight enters Earth's atmosphere, it interacts with molecules of air (mainly nitrogen and oxygen) and small particles like dust and water droplets. These interactions cause the different colors of light to scatter in various directions.
3. Rayleigh scattering is more effective at shorter wavelengths, meaning that blue and violet light are scattered more than red and yellow light. However, our eyes are more sensitive to blue light than violet, so we perceive the sky as blue rather than a mixture of blue and violet.
4. At sunrise and sunset, the sun is lower in the sky, and its light has to travel through more atmosphere. This causes even more scattering of shorter wavelengths (blue and violet), leaving longer wavelengths (red, orange, and yellow) to dominate, resulting in the warm colors we see during these times.
5. In some cases, such as after a rainstorm or in areas with high pollution levels, the sky may appear more grayish or hazy. This is because the scattering of light is less selective, and more of the shorter wavelengths (blue) are scattered out of our line of sight, along with the longer wavelengths (red).
In summary, the sky appears blue due to the selective scattering of sunlight by the atmosphere, with shorter wavelengths (blue) being scattered more than longer wavelengths (red), and our eyes being more sensitive to blue light.
### Coding:
Code User: Implement quicksort using C++<|end_of_turn|>
Code Assistant: Here is an example of how to implement the quicksort algorithm in C++:
```cpp
#include <iostream>
#include <vector>
using namespace std;

void swap(int* a, int* b) {
    int t = *a;
    *a = *b;
    *b = t;
}

int partition(vector<int>& arr, int low, int high) {
    int pivot = arr[high];
    int i = (low - 1);
    for (int j = low; j <= high - 1; j++) {
        if (arr[j] < pivot) {
            i++;
            swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[i + 1], &arr[high]);
    return (i + 1);
}

void quickSort(vector<int>& arr, int low, int high) {
    if (low < high) {
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);
        quickSort(arr, pi + 1, high);
    }
}

int main() {
    vector<int> arr = {10, 7, 8, 9, 1, 5};
    int n = arr.size();
    quickSort(arr, 0, n - 1);
    cout << "Sorted array: \n";
    for (int i = 0; i < n; i++) {
        cout << arr[i] << " ";
    }
    cout << endl;
    return 0;
}
```
This code defines a `quickSort` function that takes a vector of integers, a starting index (low), and an ending index (high) as parameters. The function uses the partition method to find the pivot element's correct position in the sorted array and then recursively sorts the sub-arrays before and after the pivot.
The `main` function demonstrates how to use the `quickSort` function by sorting an example vector of integers and printing the sorted result.
## Technical Details
Starling-LM-7B-beta uses [Nexusflow/Starling-RM-34B](https://huggingface.co/Nexusflow/Starling-RM-34B) as the reward model for PPO.
Nexusflow/Starling-RM-34B was trained on the following dataset:
- [berkeley-nest/Nectar](https://huggingface.co/datasets/berkeley-nest/Nectar)
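To illustrate how a reward model score typically enters PPO, the sketch below computes the KL-penalized reward from the paper [Fine-Tuning Language Models from Human Preferences](https://arxiv.org/abs/1909.08593), R(x, y) = r(x, y) - β (log π(y|x) - log ρ(y|x)). It is an illustration only: this card does not state whether Starling-LM-7B-beta's training used this exact penalty, and the function name, the argument names, and the `beta` value are hypothetical.

```python
from typing import Sequence


def kl_penalized_reward(
    rm_score: float,
    policy_logprobs: Sequence[float],
    reference_logprobs: Sequence[float],
    beta: float = 0.1,  # hypothetical penalty weight, not taken from the model card
) -> float:
    """Return r(x, y) - beta * (log pi(y|x) - log rho(y|x)).

    rm_score           -- scalar score from the reward model for the full response
    policy_logprobs    -- per-token log-probs of the response under the policy being trained
    reference_logprobs -- per-token log-probs of the same response under the frozen reference model
    """
    if len(policy_logprobs) != len(reference_logprobs):
        raise ValueError("both models must score the same response tokens")
    # Summing per-token log-probs gives the log-probability of the whole response.
    log_ratio = sum(policy_logprobs) - sum(reference_logprobs)
    return rm_score - beta * log_ratio


# The policy assigns the response a higher probability than the reference,
# so the penalty lowers the reward slightly.
print(kl_penalized_reward(2.0, [-1.0, -0.5], [-1.5, -1.0], beta=0.1))
```

The penalty discourages the policy from drifting far from the reference model while it chases higher reward model scores.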
## Special thanks
🙏 Special thanks to [Georgi Gerganov](https://github.com/ggerganov) and the whole team working on [llama.cpp](https://github.com/ggerganov/llama.cpp/) for making all of this possible.
## Disclaimers
TBD