update
evaluation/intro.txt CHANGED (+2 -2)
@@ -16,8 +16,8 @@ In most papers, 200 candidate program completions are sampled, and pass@1, pass@
 |GPT-neo (1.5B)| 4.79% | 7.47% | 16.30% |
 |GPT-J (6B)| 11.62% | 15.74% | 27.74% |
 
-<br/>
 To better understand how pass@k metric works, we will illustrate it with some examples. We select two problems from the HumanEval dataset and see how the model performs and which code completions pass the unit tests. We will use CodeParrot 🦜 (110M) with the two problems below:
+### Problem 1:
 
 ```python
 
@@ -33,7 +33,7 @@ def separate_paren_groups(paren_string: str) -> List[str]:
 ['()', '(())', '(()())']
 """
 ````
-
+### Problem 1:
 ```python
 
 def truncate_number(number: float) -> float: