Commit 69f9684
1 parent: dd8cb56
Improved grader prompt
tests/testing_prompts.py CHANGED (+11 -0)
@@ -31,6 +31,9 @@ You should evaluate the following aspects and return a JSON with these keys:
 "problem_statement_solvability": "The problem could be solved within the allotted 30-minute time frame.",
 "problem_statement_relevance": "The problem was pertinent to the specific type of interview.",
 "problem_statement_mistakes": "The problem statement contained no errors or inaccuracies (e.g., in provided examples).",
+"problem_statement_solution": "The problem statement doesn't leak an expected solution.",
+"problem_statement_hints": "The problem statement doesn't give big hints regarding the solution.",
+"problem_statement_answer_plan": "The problem statement doesn't contain the expected for the answer.",
 
 "interviewer_solution": "The interviewer didn't provide the solutions and avoided offering unnecessary hints during the interview.",
 "interviewer_mistakes": "The interviewer didn't make any errors in code, computation, or logical reasoning.",
@@ -56,9 +59,17 @@ You should evaluate the following aspects and return a JSON with these keys:
 "feedback_solution": "The feedback included the correct solution if the candidate was unable to solve the problem.",
 "feedback_result": "The feedback accurately reflected the candidate's performance.",
 "feedback_hallucinations": "The feedback didn't contain any non-relevant information.",
+"feedback_focus": "The feedback was concise and didn't contain too many general comments.",
+"feedback_completeness": "The feedback covered all important aspects (inc. mistakes) of the candidate performance.",
+"feedback_examples": "The feedback illustrated all main point with specific examples from the interview.",
 
 "comments": "Provide examples of mistakes made by the interviewer or areas for improvement, if there are some. List only bad things, don't list good. Keep it very short, or even empty"
 
+
+All keys starting from "problem_" should evaluate only the initial part of the interview - the problem generation.
+All keys starting from "interviewer_" should evaluate only the transcript of the interview - when the interviewer was communicating withe the candidate.
+All keys starting from "feedback_" should evaluate only the last part of the interview - the feedback provided to the candidate in the very end.
+
 Return just True, False, or None (if no info was provided) for each key except "comments", "comments" is string.
 True is always a positive score, False is negative.
 Keep comments empty if there are not huge mistakes or issues.
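
The rules this prompt gives the grader (a boolean or None for every key, a string for "comments", and key prefixes that map to interview stages) could be checked on the consumer side roughly as follows. This is only a sketch, not code from the repository: `validate_grade`, `group_by_stage`, and `STAGE_PREFIXES` are hypothetical names, and it assumes the grader's reply arrives as valid JSON, so True/False/None appear as `true`/`false`/`null`.

```python
import json

# Prefixes come from the prompt text; the stage labels are illustrative.
STAGE_PREFIXES = {
    "problem_": "problem generation",
    "interviewer_": "interview transcript",
    "feedback_": "final feedback",
}


def validate_grade(raw: str) -> dict:
    """Parse a grader reply and enforce the value rules stated in the prompt."""
    grade = json.loads(raw)  # raises ValueError (JSONDecodeError) on bad JSON
    if not isinstance(grade, dict):
        raise ValueError("grader reply must be a JSON object")
    for key, value in grade.items():
        if key == "comments":
            if not isinstance(value, str):
                raise ValueError('"comments" must be a string')
        elif value is not None and not isinstance(value, bool):
            # Reject numbers and strings; only true/false/null are allowed.
            raise ValueError(f"{key!r} must be true, false, or null")
    return grade


def group_by_stage(grade: dict) -> dict:
    """Bucket scored keys by the interview stage their prefix refers to."""
    stages = {name: {} for name in STAGE_PREFIXES.values()}
    for key, value in grade.items():
        for prefix, name in STAGE_PREFIXES.items():
            if key.startswith(prefix):
                stages[name][key] = value
    return stages


if __name__ == "__main__":
    reply = (
        '{"problem_statement_hints": true, "interviewer_mistakes": false, '
        '"feedback_focus": null, "comments": ""}'
    )
    print(group_by_stage(validate_grade(reply)))
```

Grouping by prefix mirrors the three "All keys starting from ..." rules added in this commit, so a per-stage score can be computed without hard-coding the individual key names.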