Files changed (2)
  1. README.md +17 -59
  2. config.json +2 -2
README.md CHANGED
@@ -1,30 +1,18 @@
1
  ---
2
  license: apache-2.0
3
- pipeline_tag: text-generation
4
- datasets:
5
- - aiplanet/buddhi-dataset
6
- language:
7
- - en
8
  ---
9
 
10
- <p align="center" style="font-size:34px;"><b>Buddhi-128K-Chat</b></p>
11
 
12
- # Buddhi-128K-Chat (7B) vLLM Inference: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11_8W8FpKK-856QdRVJLyzbu9g-DMxNfg?usp=sharing)
13
 
14
- # Read release article: [🔗 Introducing Buddhi: Open-Source Chat Model with a 128K Context Window 🔗 ](https://medium.aiplanet.com/introducing-buddhi-open-source-chat-model-with-a-128k-context-window-06a1848121d0)
15
 
16
- ![4.png](https://cdn-uploads.huggingface.co/production/uploads/630f3058236215d0b7078806/VUY0c4xOGpH9jTNmf6XNU.png)
17
 
18
- ## Model Description
19
-
20
- Buddhi-128k-Chat is a general-purpose first chat model with 128K context length window. It is meticulously fine-tuned on the Mistral 7B Instruct, and optimised to handle an extended context length of up to 128,000 tokens using the innovative YaRN (Yet another Rope Extension) Technique. This enhancement allows Buddhi to maintain a deeper understanding of context in long documents or conversations, making it particularly adept at tasks requiring extensive context retention, such as comprehensive document summarization, detailed narrative generation, and intricate question-answering.
21
 
22
  ## Architecture
23
- The Buddhi-128K-Chat model is fine-tuned on the Mistral-7B Instruct base model. We selected the Mistral 7B Instruct v0.2 as the parent model due to its superior reasoning capabilities. The architecture of the Mistral-7B model includes features like Grouped-Query Attention and Byte-fallback BPE tokenizer. Originally, this model has 32,768 maximum position embeddings. To increase the context size to 128K, we needed to modify the positional embeddings, which is where YaRN comes into play.
24
-
25
- In our approach, we used the NTK-aware technique, which proposes alternative interpolation schemes to plain positional interpolation. One experiment involved Dynamic-YaRN, which uses a dynamic value for the scale factor 's', because during inference the sequence length grows by 1 after every predicted token. By integrating these position embeddings with the Mistral-7B Instruct base model, we obtained the 128K model.
26
-
27
- Additionally, we fine-tuned the model on our dataset to contribute one of the very few 128K chat-based models available in the open-source community, with greater reasoning capabilities than the others.
28
 
29
  ### Hardware requirements:
30
  > For 128k Context Length
@@ -45,17 +33,13 @@ Please check out [Flash Attention 2](https://github.com/Dao-AILab/flash-attentio
45
 
46
  **Implementation**:
47
 
48
- > Note: Running the model at the full context length requires roughly 70 GB of VRAM. For experimentation, we limit the context length to 75K instead of 128K, which makes the model suitable for testing on 30-35 GB of VRAM.
49
-
50
  ```python
51
  from vllm import LLM, SamplingParams
52
 
53
  llm = LLM(
54
- model='aiplanet/buddhi-128k-chat-7b',
55
- trust_remote_code=True,
56
- dtype = 'bfloat16',
57
- gpu_memory_utilization=1,
58
- max_model_len= 75000
59
  )
60
 
61
  prompts = [
@@ -76,12 +60,8 @@ for output in outputs:
76
  generated_text = output.outputs[0].text
77
  print(generated_text)
78
  print("\n\n")
79
-
80
- # we have also attached a colab notebook, that contains: 2 more experimentations: Long Essay and Entire Book
81
  ```
82
 
83
- For Output, do check out the colab notebook: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11_8W8FpKK-856QdRVJLyzbu9g-DMxNfg?usp=sharing)
84
-
85
  ### Transformers - Basic Implementation
86
 
87
  ```python
@@ -134,8 +114,7 @@ Why don't scientists trust atoms?
134
  Because they make up everything.
135
  ```
136
 
137
-
138
- ## Prompt Template for Buddhi-128K-Chat
139
 
140
  In order to leverage instruction fine-tuning, your prompt should be surrounded by [INST] and [/INST] tokens. The very first instruction should begin with a begin of sentence id. The next instructions should not. The assistant generation will be ended by the end-of-sentence token id.
141
 
@@ -145,39 +124,18 @@ In order to leverage instruction fine-tuning, your prompt should be surrounded b
145
  "[INST] Do you have mayonnaise recipes? [/INST]"
146
 
147
  ```
 
148
 
149
- # Benchmarks
150
-
151
- ### Long Context Benchmark
152
 
153
- <strong>LongICLBench Banking77</strong>
154
- <div>
155
-
156
- | Model | 1R/2k | 2R/4K | 3R/7K | 4R/9K | 5R/14K |
157
- |-----------------------------------------|-------|-------|-------|-------|--------|
158
- | aiplanet/buddhi-128k-chat-7b | 47.8 | 60.8 | 57.8 | 62.4 | 57.2 |
159
- | NousResearch/Yarn-Mistral-7b-128k | 31.6 | 68.6 | 68 | 47 | 65.6 |
160
- | CallComply/zephyr-7b-beta-128k | 40.2 | 41.2 | 33.6 | 03 | 0 |
161
- | Eric111/Yarn-Mistral-7b-128k-DPO | 28.6 | 62.8 | 58 | 41.6 | 59.8 |
162
 
163
- </div>
164
 
165
- <strong>Short Context Benchmark</strong>
166
- <div>
167
-
168
- | Model | # Params | Average | ARC (25-shot) | HellaSwag (10-shot) | Winogrande (5-shot) | TruthfulQA (0-shot) | MMLU (5-shot) |
169
- |-----------------------------------|----------|---------|---------------|---------------------|---------------------|---------------------|---------------|
170
- | aiplanet/buddhi-128k-chat-7b | 7B | 64.42 | 60.84 | 84 | 77.27 | 65.72 | 60.42 |
171
- | migtissera/Tess-XS-vl-3-yarn-128K | 7B | 62.66 | 61.09 | 82.95 | 74.43 | 50.13 | 62.15 |
172
- | migtissera/Tess-XS-v1-3-yarn-128K | 7B | 62.49 | 61.6 | 82.96 | 74.74 | 50.2 | 62.1 |
173
- | Eric111/Yarn-Mistral-7b-128k-DPO | 7B | 60.15 | 60.84 | 82.99 | 78.3 | 43.55 | 63.09 |
174
- | NousResearch/Yarn-Mistral-7b-128k | 7B | 59.42 | 59.64 | 82.5 | 76.95 | 41.78 | 63.02 |
175
- | CallComply/openchat-3.5-0106-128k | 7B | 59.38 | 64.25 | 77.31 | 77.66 | 46.5 | 57.58 |
176
- | CallComply/zephyr-7b-beta-128k | 7B | 54.45 | 58.28 | 81 | 74.74 | 46.1 | 53.57 |
177
 
178
- </div>
179
 
180
- ## Get in Touch
181
 
182
  You can schedule a 1:1 meeting with our DevRel & Community Team to get started with AI Planet Open Source LLMs and GenAI Stack. Schedule the call here: [https://calendly.com/jaintarun](https://calendly.com/jaintarun)
183
 
@@ -195,8 +153,8 @@ In order to leverage instruction fine-tuning, your prompt should be surrounded b
195
  ### Citation
196
 
197
  ```
198
- @misc {Chaitanya890, lucifertrj ,
199
- author = { Chaitanya Singhal, Tarun Jain },
200
  title = { Buddhi-128k-Chat by AI Planet},
201
  year = 2024,
202
  url = { https://huggingface.co/aiplanet//Buddhi-128K-Chat },
 
1
  ---
2
  license: apache-2.0
 
 
 
 
 
3
  ---
4
 
5
+ <p align="center" style="font-size:34px;"><b>Buddhi 7B</b></p>
6
 
7
+ # Buddhi-7B vLLM Inference: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11_8W8FpKK-856QdRVJLyzbu9g-DMxNfg?usp=sharing)
8
 
9
+ # Model Description
10
 
11
+ <!-- Provide a quick summary of what the model is/does. -->
12
 
13
+ Buddhi is a general-purpose chat model, meticulously fine-tuned on the Mistral 7B Instruct, and optimised to handle an extended context length of up to 128,000 tokens using the innovative YaRN [(Yet another Rope Extension)](https://arxiv.org/abs/2309.00071) Technique. This enhancement allows Buddhi to maintain a deeper understanding of context in long documents or conversations, making it particularly adept at tasks requiring extensive context retention, such as comprehensive document summarization, detailed narrative generation, and intricate question-answering.
 
 
14
 
15
  ## Architecture
 
 
 
 
 
16
 
17
  ### Hardware requirements:
18
  > For 128k Context Length
 
33
 
34
  **Implementation**:
35
 
 
 
36
  ```python
37
  from vllm import LLM, SamplingParams
38
 
39
  llm = LLM(
40
+ model='aiplanet/Buddhi-128K-Chat',
41
+ gpu_memory_utilization=0.99,
42
+ max_model_len=131072
 
 
43
  )
44
 
45
  prompts = [
 
60
  generated_text = output.outputs[0].text
61
  print(generated_text)
62
  print("\n\n")
 
 
63
  ```
64
 
 
 
65
  ### Transformers - Basic Implementation
66
 
67
  ```python
 
114
  Because they make up everything.
115
  ```
116
 
117
+ ## Prompt Template for Panda Coder 13B
 
118
 
119
  In order to leverage instruction fine-tuning, your prompt should be surrounded by [INST] and [/INST] tokens. The very first instruction should begin with a begin of sentence id. The next instructions should not. The assistant generation will be ended by the end-of-sentence token id.
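
The template described above can be sketched as a small helper. This is a minimal illustration, not code from the repository: the function name `build_prompt` is hypothetical, the literal strings `<s>` and `</s>` are assumed to be the begin- and end-of-sentence tokens (as in the Mistral tokenizer), and the exact whitespace between turns is also an assumption.

```python
from typing import List, Optional, Tuple

# Assumed literal forms of the begin/end-of-sentence tokens (Mistral-style).
BOS = "<s>"
EOS = "</s>"


def build_prompt(turns: List[Tuple[str, Optional[str]]]) -> str:
    """Build an [INST]-style prompt from (user, assistant) turns.

    The last turn normally has assistant=None, leaving the prompt open
    for the model to generate the next reply.
    """
    prompt = BOS  # only the very first instruction is preceded by BOS
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            # A completed assistant reply is closed with the EOS token.
            prompt += f"{assistant}{EOS} "
    return prompt


if __name__ == "__main__":
    print(build_prompt([
        ("What is your favourite condiment?", "Lemon juice."),
        ("Do you have mayonnaise recipes?", None),
    ]))
```

If the tokenizer already prepends the begin-of-sentence id during encoding, the leading `BOS` string should be dropped to avoid duplicating it.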
120
 
 
124
  "[INST] Do you have mayonnaise recipes? [/INST]"
125
 
126
  ```
127
+ ## 🔗 Key Features:
128
 
129
+ 🎯 Precision and Efficiency: The model is tailored for accuracy, ensuring your code is not just functional but also efficient.
 
 
130
 
131
+ ✨ Unleash Creativity: Whether you're a novice or an expert coder, Panda-Coder is here to support your coding journey, offering creative solutions to your programming challenges.
 
 
 
 
 
 
 
 
132
 
133
+ 📚 Evol Instruct Code: It's built on the robust Evol Instruct Code 80k-v1 dataset, guaranteeing top-notch code generation.
134
 
135
+ 📒 What's Next?: We believe in continuous improvement and are excited to announce that in our next release, Panda-Coder will be enhanced with a custom dataset. This dataset will not only expand the language support but also include hardware programming languages like MATLAB, Embedded C, and Verilog. 🧰💡
 
 
 
 
 
 
 
 
 
 
 
136
 
 
137
 
138
+ ## Get in Touch
139
 
140
  You can schedule a 1:1 meeting with our DevRel & Community Team to get started with AI Planet Open Source LLMs and GenAI Stack. Schedule the call here: [https://calendly.com/jaintarun](https://calendly.com/jaintarun)
141
 
 
153
  ### Citation
154
 
155
  ```
156
+ @misc {Chaitanya890,
157
+ author = { {Chaitanya Singhal} },
158
  title = { Buddhi-128k-Chat by AI Planet},
159
  year = 2024,
160
  url = { https://huggingface.co/aiplanet//Buddhi-128K-Chat },
config.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "_name_or_path": "aiplanet/buddhi-128k-chat-7b",
3
  "architectures": [
4
  "MistralForCausalLM"
5
  ],
@@ -25,7 +25,7 @@
25
  "factor": 4.0,
26
  "finetuned": true,
27
  "original_max_position_embeddings": 32768,
28
- "type": "yarn"
29
  },
30
  "rope_theta": 1000000.0,
31
  "sliding_window": null,
 
1
  {
2
+ "_name_or_path": "aiplanet/Buddhi-128K-Chat",
3
  "architectures": [
4
  "MistralForCausalLM"
5
  ],
 
25
  "factor": 4.0,
26
  "finetuned": true,
27
  "original_max_position_embeddings": 32768,
28
+ "type": "dynamic-yarn"
29
  },
30
  "rope_theta": 1000000.0,
31
  "sliding_window": null,
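
As a quick sanity check on the `rope_scaling` values shown in this diff, the sketch below multiplies the scale factor by the original position-embedding count. It assumes the usual YaRN convention that the extended context equals `factor` times `original_max_position_embeddings`; the helper name `extended_context_length` is hypothetical and is not part of the repository or of any library.

```python
# rope_scaling values copied from the config.json hunk above.
rope_scaling = {
    "factor": 4.0,
    "finetuned": True,
    "original_max_position_embeddings": 32768,
    "type": "dynamic-yarn",
}


def extended_context_length(scaling: dict) -> int:
    """Return factor * original_max_position_embeddings as an integer.

    Assumption: the scaled context window is the product of the two values.
    """
    return int(scaling["factor"] * scaling["original_max_position_embeddings"])


if __name__ == "__main__":
    print(extended_context_length(rope_scaling))  # 131072
```

Under that assumption the result, 131072, matches the `max_model_len=131072` passed to `LLM(...)` in the updated vLLM snippet of the README.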