HALU-HAL committed on
Commit
58193c1
·
1 Parent(s): 0370fa7

[docs] Documentation fixes

- Improved the badge layout in docs/page_front.md
- Placed each badge on its own line for readability
- Added a GitHub Tag badge
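
As a rough illustration of the one-badge-per-line layout, the sketch below assembles two of the badge lines that appear in the docs/page_front.md diff. The helper `badge` and the constants `REPO` and `REPO_URL` are hypothetical names introduced only for this example; the badge URLs themselves are copied from the diff.

```python
def badge(alt: str, image_url: str, link_url: str) -> str:
    """Return one Markdown badge: an image wrapped in a link."""
    return f"[![{alt}]({image_url})]({link_url})"


# Hypothetical constants for this sketch; the values come from the diff.
REPO = "Sunwood-ai-labs/CodeLumia"
REPO_URL = f"https://github.com/{REPO}"

badges = [
    badge(
        "GitHub Release",
        f"https://img.shields.io/github/v/release/{REPO}?sort=date&color=red",
        REPO_URL,
    ),
    badge(
        "GitHub Tag",
        f"https://img.shields.io/github/v/tag/{REPO}?color=orange",
        REPO_URL,
    ),
]

# One badge per source line, instead of one long concatenated line.
markdown = "\n".join(badges)
print(markdown)
```

Keeping each badge on its own line means a later change to one badge touches a single line in the diff.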

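[feat] Add preview feature

Added the following to app.py:

- File tree preview
- Added a `preview_tree` checkbox
- When the box is checked, the repository's file tree is displayed

Also changed the status bar that shows repository scan progress so that it is always expanded.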
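
The preview logic can be sketched without Streamlit. In the diff, the app passes `file_tree` to `st.code(...)` when `preview_tree` is checked. The version below prints the tree instead, so it runs with the standard library alone. `build_file_tree` is a hypothetical helper written for this example and is not CodeLumia's own tree builder. The `max_depth` parameter is assumed to correspond to the sidebar's maximum-depth input.

```python
import tempfile
from pathlib import Path


def build_file_tree(root: Path, max_depth: int = 2) -> str:
    """Return an indented listing of `root`, descending at most `max_depth` levels."""
    lines = [f"{root.name}/"]

    def walk(directory: Path, depth: int) -> None:
        if depth > max_depth:
            return
        # Sort by name so the output order is stable across platforms.
        for entry in sorted(directory.iterdir(), key=lambda p: p.name):
            suffix = "/" if entry.is_dir() else ""
            lines.append("    " * depth + entry.name + suffix)
            if entry.is_dir():
                walk(entry, depth + 1)

    walk(root, 1)
    return "\n".join(lines)


# Stands in for the sidebar checkbox, which defaults to checked in the diff.
preview_tree = True

with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny throwaway repository layout to scan.
    repo = Path(tmp) / "demo-repo"
    (repo / "docs").mkdir(parents=True)
    (repo / "app.py").write_text("print('hello')\n")
    (repo / "docs" / "page_front.md").write_text("# demo\n")
    file_tree = build_file_tree(repo, max_depth=2)

if preview_tree:
    print(file_tree)
```

Because the flag is read after the scan, unchecking it only skips the display step and the tree is still built.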

[chore] Update .CodeLumiaignore

Added the following extensions to .CodeLumiaignore:

- *.zip
- *.svg
- *.jpeg

As a result, files with these extensions are now excluded from CodeLumia's processing.
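
The effect of these patterns can be illustrated with glob-style matching from Python's standard library. This is a minimal sketch that assumes the patterns are matched against file names the way `fnmatch` does; how CodeLumia itself applies them is not shown in this commit. `is_ignored` and `IGNORE_PATTERNS` are hypothetical names for the example.

```python
from fnmatch import fnmatchcase

# The three extensions named in the commit message.
IGNORE_PATTERNS = ["*.zip", "*.svg", "*.jpeg"]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Return True if the file name of `path` matches any ignore pattern."""
    name = path.rsplit("/", 1)[-1]
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
    return any(fnmatchcase(name, pattern) for pattern in patterns)


files = ["docs/logo.svg", "app.py", "assets/photo.jpeg", "dist/release.zip"]
kept = [f for f in files if not is_ignored(f, IGNORE_PATTERNS)]
print(kept)  # ['app.py']
```

Note that `*.jpeg` does not match `photo.jpg`. The `*.jpg` pattern already present in the file covers that spelling.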

Files changed (4)
  1. .CodeLumiaignore +5 -1
  2. DeepSeek-Math.md +0 -259
  3. app.py +4 -1
  4. docs/page_front.md +6 -2
.CodeLumiaignore CHANGED
@@ -171,4 +171,8 @@ LICENSE
171
  *.sqlite
172
  *.jpg
173
  requirements.txt
174
- LICENSE*
171
  *.sqlite
172
  *.jpg
173
  requirements.txt
174
+ LICENSE*
175
+ *.zip
176
+ environment.yml
177
+ *.svg
178
+ *.jpeg
DeepSeek-Math.md DELETED
@@ -1,259 +0,0 @@
1
- # << DeepSeek-Math>>
2
- ## DeepSeek-Math File Tree
3
-
4
- ```
5
- DeepSeek-Math/
6
- cog.yaml
7
- README.md
8
-
9
- ```
10
-
11
- ## cog.yaml
12
-
13
- ```yaml
14
- # Configuration for Cog ⚙️
15
- # Reference: https://github.com/replicate/cog/blob/main/docs/yaml.md
16
-
17
- build:
18
- gpu: true
19
- python_version: "3.11"
20
- python_packages:
21
- - torch==2.0.1
22
- - torchvision==0.15.2
23
- - transformers==4.37.2
24
- - accelerate==0.27.0
25
- - hf_transfer
26
-
27
- # predict.py defines how predictions are run on your model
28
- predict: "replicate/predict.py:Predictor"
29
-
30
- ```
31
-
32
- ## README.md
33
-
34
- ```markdown
35
-
36
- <!-- markdownlint-disable first-line-h1 -->
37
- <!-- markdownlint-disable html -->
38
- <!-- markdownlint-disable no-duplicate-header -->
39
-
40
- <div align="center">
41
- <img src="images/logo.svg" width="60%" alt="DeepSeek LLM" />
42
- </div>
43
- <hr>
44
- <div align="center">
45
-
46
- <a href="https://www.deepseek.com/" target="_blank">
47
- <img alt="Homepage" src="images/badge.svg" />
48
- </a>
49
- <a href="https://chat.deepseek.com/" target="_blank">
50
- <img alt="Chat" src="https://img.shields.io/badge/🤖%20Chat-DeepSeek%20LLM-536af5?color=536af5&logoColor=white" />
51
- </a>
52
- <a href="https://huggingface.co/deepseek-ai" target="_blank">
53
- <img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&logoColor=white" />
54
- </a>
55
- <a href="https://replicate.com/cjwbw/deepseek-math-7b-base" target="_parent"><img src="https://replicate.com/cjwbw/deepseek-math-7b-base/badge" alt="Replicate"/></a>
56
- </div>
57
-
58
- <div align="center">
59
-
60
- <a href="https://discord.gg/Tc7c45Zzu5" target="_blank">
61
- <img alt="Discord" src="https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&logoColor=white&color=7289da" />
62
- </a>
63
- <a href="images/qr.jpeg" target="_blank">
64
- <img alt="Wechat" src="https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&logoColor=white" />
65
- </a>
66
- <a href="https://twitter.com/deepseek_ai" target="_blank">
67
- <img alt="Twitter Follow" src="https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&logoColor=white" />
68
- </a>
69
-
70
- </div>
71
-
72
- <div align="center">
73
-
74
- <a href="LICENSE-CODE">
75
- <img alt="Code License" src="https://img.shields.io/badge/Code_License-MIT-f5de53?&color=f5de53">
76
- </a>
77
- <a href="LICENSE-MODEL">
78
- <img alt="Model License" src="https://img.shields.io/badge/Model_License-Model_Agreement-f5de53?&color=f5de53">
79
- </a>
80
- </div>
81
-
82
-
83
- <p align="center">
84
- <a href="#4-model-downloads">Model Download</a> |
85
- <a href="#2-evaluation-results">Evaluation Results</a> |
86
- <a href="#5-quick-start">Quick Start</a> |
87
- <a href="#6-license">License</a> |
88
- <a href="#7-citation">Citation</a>
89
- </p>
90
-
91
- <p align="center">
92
- <a href="https://arxiv.org/pdf/2402.03300.pdf"><b>Paper Link</b>👁️</a>
93
- </p>
94
-
95
-
96
- ## 1. Introduction
97
-
98
- DeepSeekMath is initialized with [DeepSeek-Coder-v1.5 7B](https://huggingface.co/deepseek-ai/deepseek-coder-7b-base-v1.5) and continues pre-training on math-related tokens sourced from Common Crawl, together with natural language and code data for 500B tokens. DeepSeekMath 7B has achieved an impressive score of **51.7%** on the competition-level MATH benchmark without relying on external toolkits and voting techniques, approaching the performance level of Gemini-Ultra and GPT-4. For research purposes, we release [checkpoints](#4-model-downloads) of base, instruct, and RL models to the public.
99
-
100
- <p align="center">
101
- <img src="images/math.png" alt="table" width="70%">
102
- </p>
103
-
104
- ## 2. Evaluation Results
105
-
106
- ### DeepSeekMath-Base 7B
107
-
108
- We conduct a comprehensive assessment of the mathematical capabilities of DeepSeekMath-Base 7B, focusing on its ability to produce self-contained mathematical solutions without relying on external tools, solve math problems using tools, and conduct formal theorem proving. Beyond mathematics, we also provide a more general profile of the base model, including its performance of natural language understanding, reasoning, and programming skills.
109
-
110
- - **Mathematical problem solving with step-by-step reasoning**
111
-
112
- <p align="center">
113
- <img src="images/base_results_1.png" alt="table" width="70%">
114
- </p>
115
-
116
- - **Mathematical problem solving with tool use**
117
-
118
- <p align="center">
119
- <img src="images/base_results_2.png" alt="table" width="50%">
120
- </p>
121
-
122
- - **Natural Language Understanding, Reasoning, and Code**
123
- <p align="center">
124
- <img src="images/base_results_3.png" alt="table" width="50%">
125
- </p>
126
-
127
- The evaluation results from the tables above can be summarized as follows:
128
- - **Superior Mathematical Reasoning:** On the competition-level MATH dataset, DeepSeekMath-Base 7B outperforms existing open-source base models by more than 10% in absolute terms through few-shot chain-of-thought prompting, and also surpasses Minerva 540B.
129
- - **Strong Tool Use Ability:** Continuing pre-training with DeepSeekCoder-Base-7B-v1.5 enables DeepSeekMath-Base 7B to more effectively solve and prove mathematical problems by writing programs.
130
- - **Comparable Reasoning and Coding Performance:** DeepSeekMath-Base 7B achieves performance in reasoning and coding that is comparable to that of DeepSeekCoder-Base-7B-v1.5.
131
-
132
- ### DeepSeekMath-Instruct and -RL 7B
133
-
134
- DeepSeekMath-Instruct 7B is a mathematically instructed tuning model derived from DeepSeekMath-Base 7B, while DeepSeekMath-RL 7B is trained on the foundation of DeepSeekMath-Instruct 7B, utilizing our proposed Group Relative Policy Optimization (GRPO) algorithm.
135
-
136
- We evaluate mathematical performance both without and with tool use, on 4 quantitative reasoning benchmarks in English and Chinese. As shown in Table, DeepSeekMath-Instruct 7B demonstrates strong performance of step-by-step reasoning, and DeepSeekMath-RL 7B approaches an accuracy of 60% on MATH with tool use, surpassing all existing open-source models.
137
-
138
- <p align="center">
139
- <img src="images/instruct_results.png" alt="table" width="50%">
140
- </p>
141
-
142
-
143
- ## 3. Data Collection
144
-
145
- - Step 1: Select [OpenWebMath](https://arxiv.org/pdf/2310.06786.pdf), a collection of high-quality mathematical web texts, as our initial seed corpus for training a FastText model.
146
- - Step 2: Use the FastText model to retrieve mathematical web pages from the deduplicated Common Crawl database.
147
- - Step 3: Identify potential math-related domains through statistical analysis.
148
- - Step 4: Manually annotate URLs within these identified domains that are associated with mathematical content.
149
- - Step 5: Add web pages linked to these annotated URLs, but not yet collected, to the seed corpus. Jump to step 1 until four iterations.
150
-
151
-
152
- <p align="center">
153
- <img src="images/data_pipeline.png" alt="table" width="80%">
154
- </p>
155
-
156
- After four iterations of data collection, we end up with **35.5M** mathematical web pages, totaling **120B** tokens.
157
-
158
- ## 4. Model Downloads
159
-
160
- We release the DeepSeekMath 7B, including base, instruct and RL models, to the public. To support a broader and more diverse range of research within both academic and commercial communities. Please **note** that the use of this model is subject to the terms outlined in [License section](#6-license). Commercial usage is permitted under these terms.
161
-
162
- ### Huggingface
163
-
164
- | Model | Sequence Length | Download |
165
- | :----------------------- | :-------------: | :----------------------------------------------------------: |
166
- | DeepSeekMath-Base 7B | 4096 | 🤗 [HuggingFace](https://huggingface.co/deepseek-ai/deepseek-math-7b-base) |
167
- | DeepSeekMath-Instruct 7B | 4096 | 🤗 [HuggingFace](https://huggingface.co/deepseek-ai/deepseek-math-7b-instruct) |
168
- | DeepSeekMath-RL 7B | 4096 | 🤗 [HuggingFace](https://huggingface.co/deepseek-ai/deepseek-math-7b-rl) |
169
-
170
- ## 5. Quick Start
171
-
172
- You can directly employ [Huggingface's Transformers](https://github.com/huggingface/transformers) for model inference.
173
-
174
- **Text Completion**
175
-
176
- ```python
177
- import torch
178
- from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
179
-
180
- model_name = "deepseek-ai/deepseek-math-7b-base"
181
- tokenizer = AutoTokenizer.from_pretrained(model_name)
182
- model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
183
- model.generation_config = GenerationConfig.from_pretrained(model_name)
184
- model.generation_config.pad_token_id = model.generation_config.eos_token_id
185
-
186
- text = "The integral of x^2 from 0 to 2 is"
187
- inputs = tokenizer(text, return_tensors="pt")
188
- outputs = model.generate(**inputs.to(model.device), max_new_tokens=100)
189
-
190
- result = tokenizer.decode(outputs[0], skip_special_tokens=True)
191
- print(result)
192
- ```
193
-
194
- **Chat Completion**
195
-
196
- ```python
197
- import torch
198
- from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
199
-
200
- model_name = "deepseek-ai/deepseek-math-7b-instruct"
201
- tokenizer = AutoTokenizer.from_pretrained(model_name)
202
- model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
203
- model.generation_config = GenerationConfig.from_pretrained(model_name)
204
- model.generation_config.pad_token_id = model.generation_config.eos_token_id
205
-
206
- messages = [
207
- {"role": "user", "content": "what is the integral of x^2 from 0 to 2?\nPlease reason step by step, and put your final answer within \boxed{}."}
208
- ]
209
- input_tensor = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
210
- outputs = model.generate(input_tensor.to(model.device), max_new_tokens=100)
211
-
212
- result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)
213
- print(result)
214
- ```
215
-
216
- Avoiding the use of the provided function `apply_chat_template`, you can also interact with our model following the sample template. Note that `messages` should be replaced by your input.
217
-
218
- ```
219
- User: {messages[0]['content']}
220
-
221
- Assistant: {messages[1]['content']}<|end▁of▁sentence|>User: {messages[2]['content']}
222
-
223
- Assistant:
224
- ```
225
-
226
- **Note:** By default (`add_special_tokens=True`), our tokenizer automatically adds a `bos_token` (`<|begin▁of▁sentence|>`) before the input text. Additionally, since the system prompt is not compatible with this version of our models, we DO NOT RECOMMEND including the system prompt in your input.
227
-
228
- ❗❗❗ **Please use chain-of-thought prompt to test DeepSeekMath-Instruct and DeepSeekMath-RL:**
229
-
230
- - English questions: **{question}\nPlease reason step by step, and put your final answer within \\boxed{}.**
231
-
232
- - Chinese questions: **{question}\n请通过逐步推理来解答问题,并把最终答案放置于\\boxed{}中。**
233
-
234
-
235
- ## 6. License
236
- This code repository is licensed under the MIT License. The use of DeepSeekMath models is subject to the Model License. DeepSeekMath supports commercial use.
237
-
238
- See the [LICENSE-CODE](LICENSE-CODE) and [LICENSE-MODEL](LICENSE-MODEL) for more details.
239
-
240
- ## 7. Citation
241
-
242
- ```
243
- @misc{deepseek-math,
244
- author = {Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y.K. Li, Y. Wu, Daya Guo},
245
- title = {DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models},
246
- journal = {CoRR},
247
- volume = {abs/2402.03300},
248
- year = {2024},
249
- url = {https://arxiv.org/abs/2402.03300},
250
- }
251
- ```
252
-
253
-
254
- ## 8. Contact
255
-
256
- If you have any questions, please raise an issue or contact us at [[email protected]](mailto:[email protected]).
257
-
258
- ```
259
-
app.py CHANGED
@@ -36,11 +36,12 @@ max_depth = st.sidebar.number_input("探索の最大深度:", min_value=1, value
36
 
37
  preview_markdown = st.sidebar.checkbox('preview markdown', value=False)
38
  preview_plaintext = st.sidebar.checkbox('preview plaintext', value=False)
 
39
 
40
  if st.button("CodeLumia Run ...", type="primary"):
41
  if repo_url:
42
  repo_name = repo_url.split("/")[-1].split(".")[0]
43
- with st.status("Scaning repository...", expanded=False):
44
  st.write("clone repository...")
45
  repo_path = clone_repository(repo_url, repo_name, tmp_dir=tmp_dir)
46
  st.write("get file tree...")
@@ -50,6 +51,8 @@ if st.button("CodeLumia Run ...", type="primary"):
50
 
51
  # Save the Markdown file
52
  save_markdown_file(repo_name, markdown_content)
 
 
53
 
54
  # Build the Streamlit application
55
  if(preview_markdown):
 
36
 
37
  preview_markdown = st.sidebar.checkbox('preview markdown', value=False)
38
  preview_plaintext = st.sidebar.checkbox('preview plaintext', value=False)
39
+ preview_tree = st.sidebar.checkbox('preview tree', value=True)
40
 
41
  if st.button("CodeLumia Run ...", type="primary"):
42
  if repo_url:
43
  repo_name = repo_url.split("/")[-1].split(".")[0]
44
+ with st.status("Scaning repository...", expanded=True):
45
  st.write("clone repository...")
46
  repo_path = clone_repository(repo_url, repo_name, tmp_dir=tmp_dir)
47
  st.write("get file tree...")
 
51
 
52
  # Save the Markdown file
53
  save_markdown_file(repo_name, markdown_content)
54
+ if(preview_tree):
55
+ st.code(f"{file_tree}")
56
 
57
  # Build the Streamlit application
58
  if(preview_markdown):
docs/page_front.md CHANGED
@@ -5,8 +5,12 @@
5
  <h3 align="center">
6
  ~Learn to Code, Step by Step~
7
 
8
- [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/OFA-Sys/OFA-Image_Caption)[![](https://img.shields.io/github/stars/Sunwood-ai-labs/CodeLumia)](https://github.com/Sunwood-ai-labs/CodeLumia)[![](https://img.shields.io/github/last-commit/Sunwood-ai-labs/CodeLumia)](https://github.com/Sunwood-ai-labs/CodeLumia)[![](https://img.shields.io/github/languages/top/Sunwood-ai-labs/CodeLumia)](https://github.com/Sunwood-ai-labs/CodeLumia)[![GitHub Release](https://img.shields.io/github/v/release/Sunwood-ai-labs/CodeLumia?sort=date&color=red)
9
- ](https://github.com/Sunwood-ai-labs/CodeLumia)
10
 
11
  </h3>
12
 
 
5
  <h3 align="center">
6
  ~Learn to Code, Step by Step~
7
 
8
+ [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/OFA-Sys/OFA-Image_Caption)
9
+ [![](https://img.shields.io/github/stars/Sunwood-ai-labs/CodeLumia)](https://github.com/Sunwood-ai-labs/CodeLumia)
10
+ [![](https://img.shields.io/github/last-commit/Sunwood-ai-labs/CodeLumia)](https://github.com/Sunwood-ai-labs/CodeLumia)
11
+ [![](https://img.shields.io/github/languages/top/Sunwood-ai-labs/CodeLumia)](https://github.com/Sunwood-ai-labs/CodeLumia)
12
+ [![GitHub Release](https://img.shields.io/github/v/release/Sunwood-ai-labs/CodeLumia?sort=date&color=red)](https://github.com/Sunwood-ai-labs/CodeLumia)
13
+ [![GitHub Tag](https://img.shields.io/github/v/tag/Sunwood-ai-labs/CodeLumia?color=orange)](https://github.com/Sunwood-ai-labs/CodeLumia)
14
 
15
  </h3>
16