K00B404 committed on
Commit 7f5b650
1 Parent(s): d6f1c10

Update README.md

Files changed (1)
  1. README.md +248 -23
README.md CHANGED
@@ -1,23 +1,248 @@
- ---
- language:
- - en
- license: wtfpl
- tags:
- - python
- - coder
- datasets:
- - flytech/python-codes-25k
- - espejelomar/code_search_net_python_10000_examples
- #!Work in progress!#
- base model:
- - gpt2-medium coding model
- fine tuned:
- - Epochs 3
- - flytech/python-codes-25k (4600)
- - Training Loss: 0.4007
- - Validation Loss: 0.5526
-
- - Epochs 3
- - espejelomar/code_search_net_python_10000_examples (4800)
- - Training Loss: 1.5355
- - Validation Loss: 1.1723
+ # Model Card for GPT_2_CODE
+
+ <!-- Provide a quick summary of what the model is/does. [Optional] -->
+ Work in progress: the goal is to create a small GPT-2-based Python code generator.
+
+
+ # Table of Contents
+
+ - [Model Card for GPT_2_CODE](#model-card-for-gpt_2_code)
+ - [Table of Contents](#table-of-contents)
+ - [Model Details](#model-details)
+ - [Model Description](#model-description)
+ - [Uses](#uses)
+ - [Direct Use](#direct-use)
+ - [Downstream Use [Optional]](#downstream-use-optional)
+ - [Out-of-Scope Use](#out-of-scope-use)
+ - [Bias, Risks, and Limitations](#bias-risks-and-limitations)
+ - [Recommendations](#recommendations)
+ - [Training Details](#training-details)
+ - [Training Data](#training-data)
+ - [Training Procedure](#training-procedure)
+ - [Preprocessing](#preprocessing)
+ - [Speeds, Sizes, Times](#speeds-sizes-times)
+ - [Evaluation](#evaluation)
+ - [Testing Data, Factors & Metrics](#testing-data-factors--metrics)
+ - [Testing Data](#testing-data)
+ - [Factors](#factors)
+ - [Metrics](#metrics)
+ - [Results](#results)
+ - [Model Examination](#model-examination)
+ - [Environmental Impact](#environmental-impact)
+ - [Technical Specifications [optional]](#technical-specifications-optional)
+ - [Model Architecture and Objective](#model-architecture-and-objective)
+ - [Compute Infrastructure](#compute-infrastructure)
+ - [Hardware](#hardware)
+ - [Software](#software)
+ - [Citation](#citation)
+ - [Glossary [optional]](#glossary-optional)
+ - [More Information [optional]](#more-information-optional)
+ - [Model Card Authors [optional]](#model-card-authors-optional)
+ - [Model Card Contact](#model-card-contact)
+ - [How to Get Started with the Model](#how-to-get-started-with-the-model)
+
+
+ # Model Details
+
+ ## Model Description
+
+ <!-- Provide a longer summary of what this model is/does. -->
+ Work in progress: the goal is to create a small GPT-2-based Python code generator.
+
+ - **Developed by:** CodeMonkey
+ - **Shared by [Optional]:** More information needed
+ - **Model type:** Language model
+ - **Language(s) (NLP):** English
+ - **License:** wtfpl
+ - **Parent Model:** gpt2-medium
+ - **Resources for more information:** More information needed
+   - [GitHub Repo](None)
+   - [Associated Paper](None)
+
+ # Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ## Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+ <!-- If the user enters content, print that. If not, but they enter a task in the list, use that. If neither, say "more info needed." -->
+
+ Generating Python code snippets.
+
+
+ ## Downstream Use [Optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+ <!-- If the user enters content, print that. If not, but they enter a task in the list, use that. If neither, say "more info needed." -->
+
+ A semi-automatic coding assistant.
+
+
+ ## Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+ <!-- If the user enters content, print that. If not, but they enter a task in the list, use that. If neither, say "more info needed." -->
+
+ Describing or explaining existing code.
+
+
+ # Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
+
+
+ ## Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Keep fine-tuning on question/Python-code datasets.
+
+
+ # Training Details
+
+ ## Training Data
+
+ <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ - flytech/python-codes-25k
+ - espejelomar/code_search_net_python_10000_examples
+
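+ As a minimal sketch (not part of the published training code), these datasets could be loaded from the Hugging Face Hub with the `datasets` library; the `train` split name is an assumption, so check each dataset card.
+
+ ```python
+ # Hypothetical loading sketch; the split name is an assumption.
+ from datasets import load_dataset
+
+ python_codes = load_dataset("flytech/python-codes-25k", split="train")
+ code_search_net = load_dataset(
+     "espejelomar/code_search_net_python_10000_examples", split="train"
+ )
+
+ print(python_codes)
+ print(code_search_net)
+ ```
+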
+ ## Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ ### Preprocessing
+
+ More information needed
+
+ ### Speeds, Sizes, Times
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ - flytech/python-codes-25k (4600), 3 epochs
+   - Training Loss: 0.4007
+   - Validation Loss: 0.5526
+
+ - espejelomar/code_search_net_python_10000_examples (4800), 3 epochs
+   - Training Loss: 1.5355
+   - Validation Loss: 1.1723
+
+ # Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ## Testing Data, Factors & Metrics
+
+ ### Testing Data
+
+ <!-- This should link to a Data Card if possible. -->
+
+ - flytech/python-codes-25k
+ - espejelomar/code_search_net_python_10000_examples
+
+ ### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ An 80/20 train/validation split of each dataset, as sketched below.
+
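+ A hedged sketch of producing such a split with the `datasets` library (the dataset id and seed are illustrative assumptions):
+
+ ```python
+ # Hypothetical 80/20 train/validation split; dataset id and seed are illustrative.
+ from datasets import load_dataset
+
+ dataset = load_dataset("flytech/python-codes-25k", split="train")
+ splits = dataset.train_test_split(test_size=0.2, seed=42)
+ train_ds, val_ds = splits["train"], splits["test"]
+ print(len(train_ds), len(val_ds))
+ ```
+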
+ ### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ Training and validation loss, with learning-rate scheduling during fine-tuning (see the sketch below).
+
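+ Since the training script is not published in this card, the sketch below only illustrates one common way to attach a learning-rate schedule to GPT-2 fine-tuning with `transformers` and PyTorch; the hyperparameters and step counts are assumptions, not the values used for this model.
+
+ ```python
+ # Hypothetical LR-scheduling sketch; lr, warmup and step counts are assumptions.
+ import torch
+ from transformers import GPT2LMHeadModel, get_linear_schedule_with_warmup
+
+ model = GPT2LMHeadModel.from_pretrained("gpt2-medium")
+ optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
+
+ num_epochs = 3
+ steps_per_epoch = 1000  # assumed; depends on batch size and dataset size
+ scheduler = get_linear_schedule_with_warmup(
+     optimizer,
+     num_warmup_steps=100,
+     num_training_steps=num_epochs * steps_per_epoch,
+ )
+
+ # In the training loop, step the scheduler right after the optimizer:
+ #   loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
+ ```
+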
+ ## Results
+
+ Generates Python code better than the base gpt2-medium model.
+
+ # Model Examination
+
+ More information needed
+
+ # Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** CPU and Colab T4 GPU
+ - **Hours used:** 4
+ - **Cloud Provider:** Google Colab
+ - **Compute Region:** NL
+ - **Carbon Emitted:** ???
+
+ # Technical Specifications [optional]
+
+ ## Model Architecture and Objective
+
+ GPT-2 (fine-tuned from gpt2-medium) with a causal language-modeling objective.
+
+ ## Compute Infrastructure
+
+ More information needed
+
+ ### Hardware
+
+ CPU and Colab T4 GPU
+
+ ### Software
+
+ PyTorch with custom Python training scripts
+
+ # Citation
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ More information needed
+
+ **APA:**
+
+ More information needed
+
+ # Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ More information needed
+
+ # More Information [optional]
+
+ Experimental
+
+ # Model Card Authors [optional]
+
+ <!-- This section provides another layer of transparency and accountability. Whose views is this model card representing? How many voices were included in its construction? Etc. -->
+
+ CodeMonkeyXL
+
+ # Model Card Contact
+
+ K00B404 on Hugging Face
+
+ # How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ <details>
+ <summary> Click to expand </summary>
+
+ More information needed
+
+ </details>
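+
+ Pending the author's own example, the following is a minimal, unofficial usage sketch with the `transformers` library. The repository id `K00B404/GPT_2_CODE` is an assumption based on the card title and author, and the prompt and generation settings are illustrative only.
+
+ ```python
+ # Hypothetical usage sketch; the repo id is an assumption, not a confirmed model id.
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ repo_id = "K00B404/GPT_2_CODE"  # replace with the actual model repository id
+ tokenizer = AutoTokenizer.from_pretrained(repo_id)
+ model = AutoModelForCausalLM.from_pretrained(repo_id)
+
+ prompt = "# Write a Python function that reverses a string\n"
+ inputs = tokenizer(prompt, return_tensors="pt")
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=64,
+     do_sample=True,
+     top_p=0.95,
+     pad_token_id=tokenizer.eos_token_id,
+ )
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```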