BLOOMChat V1.0

BLOOMChat is the BigScience Group's BLOOM model, instruction-tuned on a subset of 100k datapoints per data source from the OIG dataset provided by OpenChatKit, then aligned using Dolly 2.0 and OASST1.

Model Details

Model Description

Additional Information

  • Blog Post: [More Information Needed]
  • Discord: Link
  • HF Hosting: [More Information Needed]
  • Github: Link

Licensing

To increase accessibility and to support the open-source community, SambaNova is releasing BLOOMChat under a modified version of the Apache 2.0 license, which includes use-based restrictions from BLOOM’s RAIL license. While use-based restrictions are necessarily passed through, there are no blanket restrictions on reuse, distribution, commercialization or adaptation. Please review SambaNova’s BLOOMChat-176B License.

Uses

Direct Use

This model is intended for commercial and research use.

Out-of-Scope Use

BLOOMChat should NOT be used for:

  • Mission-critical applications
  • Applications that involve the safety of others
  • Making highly important decisions
  • Important automated pipelines

This model is still in early development and is prone to mistakes and hallucinations, so there is still room for improvement. It is intended to provide the community with a good baseline.

Recommendations

Users should be made aware of the risks, biases, limitations, and restrictions of the model, which are listed at the bottom of this page.


How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sambanovasystems/BLOOMChat-176B-v1")
model = AutoModelForCausalLM.from_pretrained("sambanovasystems/BLOOMChat-176B-v1", device_map="auto", torch_dtype="auto")
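Once the model and tokenizer are loaded, a single reply can be generated roughly as follows. This is a minimal sketch, not the official inference path: `generate_reply` is a hypothetical helper name, and the prompt is assumed to already follow the `<human>:` / `<bot>:` format described further down this page. It uses only the standard `transformers` tokenizer call, `generate`, and `decode`.

```python
def generate_reply(model, tokenizer, prompt, max_new_tokens=512):
    """Generate a completion for an already formatted prompt (hypothetical helper)."""
    # Tokenize and move the tensors to the device that holds the model's first weights.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # generate() returns prompt + continuation, so slice off the prompt tokens.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Usage, with `model` and `tokenizer` loaded as shown above:
# print(generate_reply(model, tokenizer, "<human>: What is BLOOM?\n<bot>:"))
```

Loading a 176B-parameter model this way still requires enough accelerator memory for `device_map="auto"` to place all of the weights.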

Tutorial on using the model for text generation

This tutorial from Hugging Face is the basis for running our model. The tutorial is intended for BLOOM; however, since our model is based on BLOOM, we can repurpose it.

For setup instructions, follow the Hugging Face tutorial.

NOTE: We had to make the following changes for BLOOMChat to work:

  • Install transformers version 4.27.0
    • pip install transformers==4.27.0
  • Change the model name from bigscience/bloom to sambanovasystems/BLOOMChat-176B-v1
  • Modify inference_server/models/hf_accelerate.py
    • We tested this repo on 4 80GB A100 GPUs and ran into memory issues, so we cap each device at 85% of its reported maximum memory
  • Modify inference_server/cli.py
    • The model was trained using specific <human> and <bot> tags, so the input must be wrapped in them
    • Trailing spaces may lead to subpar performance

Modifications for inference_server/models/hf_accelerate.py:

from accelerate.utils.modeling import get_max_memory
...
class HFAccelerateModel(Model):
    def __init__(self, args: Namespace) -> None:
        ...
        original_max_memory_dict = get_max_memory()

        reduce_max_memory_dict = {device_key: int(original_max_memory_dict[device_key] * 0.85) for device_key in original_max_memory_dict}

        kwargs["max_memory"] = reduce_max_memory_dict
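The idea behind this change can be shown in isolation. The sketch below scales every per-device limit by the same fraction. `scale_max_memory` and the example limits are hypothetical; in the real script the input dictionary comes from `get_max_memory()`.

```python
def scale_max_memory(max_memory, fraction=0.85):
    """Return a copy of a per-device memory map with every limit scaled down."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # int() keeps the limits as whole byte counts, as in the modification above.
    return {device: int(limit * fraction) for device, limit in max_memory.items()}


# Hypothetical limits in bytes for four 80 GB GPUs (illustration only).
example_limits = {gpu_index: 80 * 1024**3 for gpu_index in range(4)}
reduced_limits = scale_max_memory(example_limits)
```

Leaving headroom on each device gives activations and generation buffers room to grow after the weights are placed. The 0.85 factor is the value used above and may need tuning on other hardware.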

Modifications for inference_server/cli.py:

def main() -> None:
    ...
    while True:
        input_text = input("Input text: ")

        input_text = input_text.strip()
        modified_input_text = f"<human>: {input_text}\n<bot>:"

Running command for bf16:

python -m inference_server.cli --model_name sambanovasystems/BLOOMChat-176B-v1 --model_class AutoModelForCausalLM --dtype bf16 --deployment_framework hf_accelerate --generate_kwargs '{"do_sample": false, "temperature": 0.8, "repetition_penalty": 1.2, "top_p": 0.9, "max_new_tokens": 512}'

Running command for int8 (suboptimal performance, but fast inference time):

python -m inference_server.cli --model_name sambanovasystems/BLOOMChat-176B-v1 --model_class AutoModelForCausalLM --dtype int8 --deployment_framework hf_accelerate --generate_kwargs '{"do_sample": false, "temperature": 0.8, "repetition_penalty": 1.2, "top_p": 0.9, "max_new_tokens": 512}'

DISCLAIMER: When using int8, the results will be subpar compared to bf16 as the model is being quantized.

Suggested Inference Parameters

  • Temperature: 0.8
  • Repetition penalty: 1.2
  • Top-p: 0.9
  • Max generated tokens: 512
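For scripted runs, the `--generate_kwargs` argument used in the commands above can be built from these values instead of being typed by hand. The sketch below is illustrative: `generate_kwargs_arg` is a hypothetical helper name, and `do_sample` defaults to `False` only to match the commands on this page. In Hugging Face `generate`, `temperature` and `top_p` take effect only when `do_sample` is true, so pass `do_sample=True` if you want those two settings to influence the output.

```python
import json
import shlex

# Suggested values from the list above.
SUGGESTED_PARAMS = {
    "temperature": 0.8,
    "repetition_penalty": 1.2,
    "top_p": 0.9,
    "max_new_tokens": 512,
}


def generate_kwargs_arg(do_sample=False):
    """Return a shell-quoted JSON string suitable for --generate_kwargs."""
    kwargs = {"do_sample": do_sample, **SUGGESTED_PARAMS}
    return shlex.quote(json.dumps(kwargs))


print(generate_kwargs_arg())
```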

Suggested Prompts To Try

<human>: Write a script in which Bob accidentally breaks his dad's guitar
<bot>:
<human>: give a python code to open a http server in 8080 port using python 3.7
<bot>:
<human>: Create an itemized list of tasks to complete to start a clothing brand
<bot>:

Some example completions for English

<human>: What does it mean for the Fed to raise or lower interest rates?
<bot>: The Federal Reserve (also known as the Fed) is the central banking system of the United States.  It was created in 1913, and is an independent agency of the federal government.  The Fed has a dual mandate of maximizing employment and stabilizing prices.  The Fed sets interest rates, and can influence the money supply and the credit markets through its actions.

Raising interest rates means that the Fed is trying to reduce the money supply in the economy, and is trying to discourage borrowing and encourage saving.  Lowering interest rates means that the Fed is trying to stimulate economic growth by encouraging borrowing and reducing the cost of credit.

The Fed raises or lowers interest rates in order to influence the economy and achieve its goals for employment and price stability.  Interest rates are one of the primary tools that the Fed uses to achieve its goals.
<human>: Write an engaging and well-written property listing description for selling a house. Address of the house and some of the details are given to you. Fill in the information gap with hallucinations if needed.

Property Address: 412 Monterey Ave, Capitola, CA 95010
Parking: 3 cars + a finished single car garage
Details:
- open floorplan
- window with views to park/mountains
- Kitchen with shaker cabinetry and white Calcatta quartz counters
<bot>: This home at 412 Monterey Ave, Capitola, CA 95010 is the perfect place to call home. With its open floor plan, it provides the perfect space for entertaining guests. The kitchen is equipped with shaker cabinetry and white Calcatta quartz counters, providing the perfect space to create culinary masterpieces. The window with views to the park and mountains provide a beautiful view of the outdoors, making this home feel like a piece of paradise. The three-car garage and finished single car garage provide ample parking space, making this home a must-see for anyone looking for a new home.
<human>: What is artificial intelligence? Explain it so an 8 year old will understand.
<bot>: Artificial Intelligence (AI) is a computer program or machine that can do things that would normally require human intelligence, such as recognizing patterns, making decisions, and learning from experience. AI can be found in many places, such as self-driving cars, smartphones, and even robots. AI is used to make things easier for people, like when your smartphone can translate text into another language or when your voice-controlled speaker can play music for you.
AI works by using algorithms and big data to learn and improve over time. For example, self-driving cars use AI to analyze the environment around them and make decisions about how to drive. The car gathers data from its sensors, processes it using algorithms, and adjusts its behavior accordingly. This process is called machine learning.
AI is still a developing technology, and there are still many things that AI programs cannot do yet. However, as AI continues to evolve, it is likely that it will become more and more capable of performing tasks that were once thought to require human intelligence.

Some example completions for Multilingual

Arabic

<human>: هل العمل نصف دوام فكرة جيدة لمراهقين بين عمر ١٥ و ١٩ عام
<bot>: نعم، يمكن أن تكون فكرة جيدة لبعض المراهقين في هذا العمر. هناك العديد من الفوائد التي قد يحققها المراهقون إذا عملوا بنصف دوام:

- يساعدهم على تعلم المسؤولية وإدارة الوقت: عندما يعمل المراهق بنصف دوام، فإنه يتعلم كيفية إدارة وقته بشكل فعال وكيفية تقسيم ساعات عمله إلى مهام أصغر.
- يحسن مهارات التواصل والعمل الجماعي: غالبًا ما يكون مكان العمل بيئة متعددة الثقافات حيث يلتقي المراهقون بمجموعة متنوعة من الأشخاص ويتعلمون كيفية التعاون معهم وتبادل الأفكار.
- يوفر فرصة للتعرف على عالم العمل: يتيح العمل للمراهق التعرف على عالم العمل واكتساب خبرة عملية قيمة ستساعده لاحقًا عند البحث عن وظيفة أو بدء حياته المهنية.
- يعزز الشعور بالمسؤولية والثقة بالنفس: عندما يعمل المراهق بنصف دوام ويحصل على دخل خاص به، فإنه يكتسب شعورًا أكبر بالمسؤولية والثقة بالنفس.

ومع ذلك، لا ينبغي إجبار المراهقين على العمل بنصف دوام دون موافقتهم المسبقة. يجب أيضًا مراعاة أعمارهم وقدراتهم البدنية والعقلية قبل توظيفهم.

DISCLAIMER: The text ordering for Arabic may be off because the code block cannot render it properly.

Chinese

<human>: 请向我推荐一部周末看的电影并说明原因。
<bot>: “强烈向您推荐《美丽人生》这部影片。该片由罗伯托·贝尼尼自导自演,讲述了二战时期一对犹太父子被送进了纳粹集中营,父亲用自己的想象力为儿子编织了一个美好童真的世界,让儿子的心灵没有受到丝毫伤害的故事。

该片被誉为“最伟大的战争片”、“人类有史以来最佳影片之一”,豆瓣评分高达9.6分。它不仅是一部感人的亲情片,更是一部富含哲理的励志片和人生启示录。片中父亲的乐观豁达、幽默风趣的人生态度,对任何事都抱有积极热情的态度,即使在残酷无比的集中营里也是如此,深深地感染着观众。同时,这也是一部关于爱与希望的电影,让我们更加珍惜当下所拥有的一切,学会如何去爱身边的每一个人。

总之,这是一部值得一看再看的好电影,相信您一定会喜欢的!

French

<human>: Il y a t’il un petit-déjeuner de 700-1000 calories sans oeuf mais qui inclut des protéines.
<bot>: Oui, voici une recette de petit déjeuner à base d'avoine et de fromage blanc :
Ingrédients :
- 1 tasse (250 ml) de lait d'amande ou autre lait végétal
- 2 cuillères à soupe bombées (30 g) de flocons d'avoine
- 1/2 banane mûre écrasée
- 1/4 de pomme fraîche épluchée et coupée en dés
- 1 cuillère à soupe de beurre d'amande ou autre purée d'oléagineux
- 1 cuillère à café de cannelle moulue
- 1 pincée de sel
Préparation :
1. Dans une petite casserole, faites chauffer le lait jusqu'à ce qu'il soit chaud.
2. Ajoutez les flocons d'avoine, la banane, la pomme, le beurre d'amande, la cannelle et le sel dans le bol du mixeur et mélangez bien pour obtenir une pâte homogène.
3. Versez le mélange dans le lait chaud et remuez rapidement avec une spatule en bois pour éviter que les flocons d'avoine ne s'agglutinent au fond de la casserole.
4. Laissez mijoter sur feu doux pendant environ 5 minutes tout en continuant de remuer régulièrement.
5. Transférez ensuite le porridge dans un bol et servez immédiatement. Vous pouvez ajouter quelques fruits rouges frais pour décorer et apporter encore plus de gourmandise à votre petit déjeuner ! Bon appétit !

Hindi

<human>: दिल्ली से जयपुर कैसे जाये? बस से या कार से? दोनो के बारे मे विस्तार मे बताये.
<bot>: दिल्ली से जयपुर जाने का सबसे आसान और तेज़ तरीका है कि आप अपनी खुद की कार लेकर जाएं, क्योंकि यह आपको अपने समय पर निर्भर रहने की स्वतंत्रता देता है। हालांकि, यदि आपके पास अपनी कार नहीं है, तो आप दिल्ली से जयपुर तक एक निजी वाहन किराए पर ले सकते हैं। यदि आप सार्वजनिक परिवहन (बस) का उपयोग करना चाहते हैं, तो दिल्ली में कई सरकारी और निजी बस सेवाएं उपलब्ध हैं जो जयपुर को जोड़ती हैं। दिल्ली से जयपुर जाने वाली कुछ प्रमुख बसें हैं-
1. राजस्थान रोडवेज: ये राज्य सरकार द्वारा संचालित बसें हैं जो दिल्ली से जयपुर के बीच चलती हैं। ये नियमित रूप से चलती हैं और इनके लिए अग्रिम आरक्षण आवश्यक नहीं है।
2. ALSA: यह एक निजी कंपनी है जो दिल्ली से जयपुर के बीच नियमित रूप से चलती है। इसके टिकटों की बुकिंग ऑनलाइन और ऑफलाइन दोनों माध्यमों से की जा सकती है।
3. RedBus: यह एक ऑनलाइन बस बुकिंग वेबसाइट है जहां से आप दिल्ली से जयपुर के बीच चलनेवाली विभिन्न कंपनियों की बसों में सीट आरक्षित करा सकते हैं।
4. Ola and Uber: इन ऐप्स का उपयोग करके आप दिल्ली से जयपुर तक कैब राइड बुक कर सकते हैं, लेकिन ध्यान रखें कि यह विकल्प सस्ता नहीं होगा। दिल्ली से जयपुर जाने का सबसे अच्छा विकल्प आपकी व्यक्तिगत आवश्यकताओं, बजट, और उपलब्धता पर निर्भर करता है। यदि आप अपनी खुद की कार रखते हैं, तो इसका लाभ उठाएं अन्यथा किसी भी सार्वजनिक परिवहन साधन का चयन करें जो आपकी आवश्यकताओं को पूरा करे।

Spanish

<human>: ¿Me puedes dar algunas recomendaciones de juegos? Estoy buscando juegos de rol con un sistema de magia único, un sistema de construcción único y una buena historia. Por favor limítate a juegos que se puedan jugar en PC directamente, o por emulación.
<bot>: Claro! Aquí te dejo tres recomendaciones:
1. The Outer Worlds: es un juego de rol de ciencia ficción con un sistema de magia basado en la tecnología alienígena que permite al jugador personalizar sus habilidades para adaptarse a su estilo de juego.
2. Divinity: Original Sin 2 - Definitive Edition: este juego combina elementos de RPG con estrategia en tiempo real (RTS), lo que le da al jugador un control total sobre cómo construir y mejorar su personaje y el mundo que lo rodea.
3. Pillars of Eternity: esta aventura de fantasía oscura tiene lugar en un vasto mundo abierto donde los jugadores pueden elegir entre diferentes clases y desarrollar relaciones con otros personajes del juego. La mecánica de combate está basada en turnos y utiliza hechizos y armas cuerpo a cuerpo.
Estos son solo algunos ejemplos de juegos que podrían interesarte según tus criterios. Hay muchos más títulos disponibles en el mercado que podrían ajustarse a tu gusto, así que no dudes en explorar otras opciones si estos juegos no cumplen con tus expectativas.

Evaluation Graphs

Human evaluation

BLOOMChat vs. Baseline Models in Human Preference Rankings

Human evaluation against GPT4

BLOOMChat vs GPT-4 in Human Preference Ranking

Multilingual evaluation

BLOOMChat surpasses other BLOOM variants and state-of-the-art open-source chat models in translation tasks. [NOTE: The evaluation of BLOOM and BLOOMZ on WMT18 en->zh and zh->en used (human, bot) ChatML tags due to an unintentional configuration, so their results might be suboptimal.]


Training Details

Training Data

Training Procedure

We trained BLOOMChat with SambaStudio, a platform built on SambaNova's in-house Reconfigurable Dataflow Unit (RDU). We started from BLOOM, an open-source multilingual 176B GPT model pretrained by the BigScience group. We also preprocessed the training datasets.

Prompting Style Used For Training

<human>: {input that the user wants from the bot}
<bot>:

<human>: {fewshot1 input}
<bot>: {fewshot1 response}
<human>: {fewshot2 input}
<bot>: {fewshot2 response}
<human>: {input that the user wants from the bot}
<bot>:
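The prompting style above can be produced programmatically for both the zero-shot and the few-shot case. This is a minimal sketch: `build_prompt` is a hypothetical helper, and stripping whitespace from each piece is an assumption that follows the note about trailing spaces earlier on this page.

```python
def build_prompt(query, fewshot=()):
    """Format a query, plus optional (input, response) examples, with the training tags."""
    lines = []
    for example_input, example_response in fewshot:
        lines.append(f"<human>: {example_input.strip()}")
        lines.append(f"<bot>: {example_response.strip()}")
    lines.append(f"<human>: {query.strip()}")
    # The prompt ends with the bare bot tag and no trailing space.
    lines.append("<bot>:")
    return "\n".join(lines)


zero_shot = build_prompt("Create an itemized list of tasks to complete to start a clothing brand")
few_shot = build_prompt(
    "Translate 'good night' to French",
    fewshot=[("Translate 'hello' to French", "bonjour")],
)
```

Keeping the formatting in one function makes it harder to introduce a stray trailing space after `<bot>:` by accident.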

Hyperparameters

Instruction-tuned Training on OIG

  • Hardware: SambaNova Reconfigurable Dataflow Unit (RDU)
  • Optimizer: AdamW
  • Grad accumulation: 1
  • Epochs: 1
  • Global Batch size: 128
  • Batch tokens: 128 * 2048 = 262,144 tokens
  • Learning Rate: 1e-5
  • Learning Rate Scheduler: Cosine Schedule with Warmup
  • Warmup Steps: 0
  • End Learning Ratio: 0.1
  • Weight decay: 0.1

Instruction-tuned Training on Dolly 2.0 and OASST1

  • Hardware: SambaNova Reconfigurable Dataflow Unit (RDU)
  • Optimizer: AdamW
  • Grad accumulation: 1
  • Epochs: 3
  • Global Batch size: 128
  • Batch tokens: 128 * 2048 = 262,144 tokens
  • Learning Rate: 1e-5
  • Learning Rate Scheduler: Cosine Schedule with Warmup
  • Warmup Steps: 0
  • End Learning Ratio: 0.1
  • Weight decay: 0.1
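To make the learning-rate settings concrete, the sketch below evaluates a cosine schedule with the listed values. It rests on assumptions not stated in this card: "End Learning Ratio: 0.1" is read as a final learning rate of 0.1 times the peak, the sequence length of 2048 is taken from the "Batch tokens" line, and `total_steps` is a hypothetical value, since the number of training steps is not given here.

```python
import math

GLOBAL_BATCH_SIZE = 128
SEQUENCE_LENGTH = 2048  # taken from the "Batch tokens" line
TOKENS_PER_STEP = GLOBAL_BATCH_SIZE * SEQUENCE_LENGTH  # 262,144 tokens


def cosine_lr(step, total_steps, base_lr=1e-5, end_ratio=0.1, warmup_steps=0):
    """Cosine decay from base_lr to base_lr * end_ratio after optional linear warmup."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    floor = base_lr * end_ratio
    return floor + 0.5 * (base_lr - floor) * (1.0 + math.cos(math.pi * progress))


# total_steps=1000 is a placeholder, not the real training length.
for step in (0, 500, 1000):
    print(step, cosine_lr(step, total_steps=1000))
```

With zero warmup steps the schedule starts at the peak rate of 1e-5 and, under the assumption above, ends at 1e-6.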

RAIL Restrictions

Because the original BLOOM model is released under the RAIL License, the same restrictions apply to BLOOMChat.

You agree not to use the Model or Derivatives of the Model:

  1. In any way that violates any applicable national, federal, state, local or international law or regulation;
  2. For the purpose of exploiting, harming or attempting to exploit or harm minors in any way;
  3. To generate or disseminate verifiably false information with the purpose of harming others;
  4. To generate or disseminate personal identifiable information that can be used to harm an individual;
  5. To generate or disseminate information or content, in any context (e.g. posts, articles, tweets, chatbots or other kinds of automated bots) without expressly and intelligibly disclaiming that the text is machine generated;
  6. To defame, disparage or otherwise harass others;
  7. To impersonate or attempt to impersonate others;
  8. For fully automated decision making that adversely impacts an individual’s legal rights or otherwise creates or modifies a binding, enforceable obligation;
  9. For any use intended to or which has the effect of discriminating against or harming individuals or groups based on online or offline social behavior or known or predicted personal or personality characteristics
  10. To exploit any of the vulnerabilities of a specific group of persons based on their age, social, physical or mental characteristics, in order to materially distort the behavior of a person pertaining to that group in a manner that causes or is likely to cause that person or another person physical or psychological harm;
  11. For any use intended to or which has the effect of discriminating against individuals or groups based on legally protected characteristics or categories;
  12. To provide medical advice and medical results interpretation;
  13. To generate or disseminate information for the purpose to be used for administration of justice, law enforcement, immigration or asylum processes, such as predicting an individual will commit fraud/crime commitment (e.g. by text profiling, drawing causal relationships between assertions made in documents, indiscriminate and arbitrarily-targeted use).

Bias, Risks, and Limitations

Like all LLMs, BLOOMChat has certain limitations:

  • Hallucination: BLOOMChat may sometimes generate responses that contain plausible-sounding but factually incorrect or irrelevant information.
  • Code Switching: The model might unintentionally switch between languages or dialects within a single response, affecting the coherence and understandability of the output.
  • Repetition: BLOOMChat may produce repetitive phrases or sentences, leading to less engaging and informative responses.
  • Coding and Math: The model's performance in generating accurate code or solving complex mathematical problems may be limited.
  • Toxicity: BLOOMChat may inadvertently generate responses containing inappropriate or harmful content.

Acknowledgment

We would like to extend our gratitude to Together for their insightful technical discussions on overall project planning, data processing, model training, human evaluation experiment design, open-source endeavors, and their contributions on data processing code on OpenChatKit, OASST1, and Dolly 2.0.

We are grateful to the various researchers and open-source projects that have contributed to the development of BLOOMChat. We thank BigScience for providing the BLOOM model, which served as the base for our instruction tuning. We also thank LAION for their OIG dataset and OpenAssistant Conversations Dataset (OASST1), and Databricks for Dolly 2.0, which together provided the data we instruction-tuned on.

We appreciate lm-eval-harness and BigScience for their essential benchmarking contributions, which were very helpful in evaluating BLOOMChat's performance. We appreciate the inspiration from the wave of recent open-source chat models, including OpenAssistant-30B, LLaMA-Adapter-V2-65B, Vicuna-13b, Koala-13b, OASST-Pythia-12b, Alpaca-13b, ChatGLM-6b, FastChat-T5-3b, Dolly-v2-12b, LLaMA-13b, StableLM-Tuned-Alpha-7b, RedPajama-INCITE-Chat-7B-v0.1, RedPajama-INCITE-Chat-3B-v1, MPT-7B-Chat and so on. We look forward to witnessing the continued growth and success of open-source chat-based models.

We highly appreciate the hard work and dedication of these researchers and organizations towards the advancement of the open-source community. Their contributions were invaluable in the development of BLOOMChat, and we hope that our model can contribute to further advancements in the field.