Commit 69f9684 · 1 Parent(s): dd8cb56
Author: IliaLarchenko

Improved grader prompt

Files changed (1)
  1. tests/testing_prompts.py +11 -0
tests/testing_prompts.py CHANGED
@@ -31,6 +31,9 @@ You should evaluate the following aspects and return a JSON with these keys:
 "problem_statement_solvability": "The problem could be solved within the allotted 30-minute time frame.",
 "problem_statement_relevance": "The problem was pertinent to the specific type of interview.",
 "problem_statement_mistakes": "The problem statement contained no errors or inaccuracies (e.g., in provided examples).",
+"problem_statement_solution": "The problem statement doesn't leak an expected solution.",
+"problem_statement_hints": "The problem statement doesn't give big hints regarding the solution.",
+"problem_statement_answer_plan": "The problem statement doesn't contain the expected plan for the answer.",
 
 "interviewer_solution": "The interviewer didn't provide the solutions and avoided offering unnecessary hints during the interview.",
 "interviewer_mistakes": "The interviewer didn't make any errors in code, computation, or logical reasoning.",
@@ -56,9 +59,17 @@ You should evaluate the following aspects and return a JSON with these keys:
 "feedback_solution": "The feedback included the correct solution if the candidate was unable to solve the problem.",
 "feedback_result": "The feedback accurately reflected the candidate's performance.",
 "feedback_hallucinations": "The feedback didn't contain any non-relevant information.",
+"feedback_focus": "The feedback was concise and didn't contain too many general comments.",
+"feedback_completeness": "The feedback covered all important aspects (inc. mistakes) of the candidate's performance.",
+"feedback_examples": "The feedback illustrated all main points with specific examples from the interview.",
 
 "comments": "Provide examples of mistakes made by the interviewer or areas for improvement, if there are some. List only bad things, don't list good. Keep it very short, or even empty"
 
+
+All keys starting from "problem_" should evaluate only the initial part of the interview - the problem generation.
+All keys starting from "interviewer_" should evaluate only the transcript of the interview - when the interviewer was communicating with the candidate.
+All keys starting from "feedback_" should evaluate only the last part of the interview - the feedback provided to the candidate at the very end.
+
 Return just True, False, or None (if no info was provided) for each key except "comments", "comments" is string.
 True is always a positive score, False is negative.
 Keep comments empty if there are not huge mistakes or issues.
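
The prefix rule added in this commit (keys starting with "problem_", "interviewer_", and "feedback_" each score one stage of the interview) suggests one way a caller might aggregate the grader's reply. The sketch below is illustrative only and is not part of the commit: `summarize_grades` and `STAGE_PREFIXES` are hypothetical names, and it assumes the grader's reply has already been normalized to valid JSON (`true`, `false`, `null`), which the prompt itself does not guarantee.

```python
import json
from collections import defaultdict

# Prefixes named in the prompt; each maps to one stage of the interview.
STAGE_PREFIXES = ("problem_", "interviewer_", "feedback_")


def summarize_grades(raw_json: str) -> dict:
    """Group boolean grades by stage prefix and count the outcomes.

    Hypothetical helper: assumes a flat JSON object whose values are
    true, false, or null, plus a "comments" string.
    """
    grades = json.loads(raw_json)
    summary = defaultdict(lambda: {"passed": 0, "failed": 0, "no_info": 0})
    for key, value in grades.items():
        if key == "comments":
            continue  # free-text field, not a score
        stage = next((p for p in STAGE_PREFIXES if key.startswith(p)), None)
        if stage is None:
            continue  # ignore keys outside the three documented groups
        if value is True:
            summary[stage]["passed"] += 1
        elif value is False:
            summary[stage]["failed"] += 1
        else:
            # null (or any non-boolean value) is treated as "no info"
            summary[stage]["no_info"] += 1
    return dict(summary)


example = json.dumps({
    "problem_statement_solution": True,
    "problem_statement_hints": False,
    "interviewer_mistakes": True,
    "feedback_focus": None,
    "comments": "",
})
result = summarize_grades(example)
```

Counting "no info" separately from failures follows the prompt's instruction that None means no information was provided rather than a negative score.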