codegen

community

authored a paper over 1 year ago

PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

Paper • 2502.01584 • Published Feb 3, 2025 • 9

authored a paper over 1 year ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24, 2025 • 77

mohit-raghavendra

authored 2 papers over 1 year ago

Representation Learning in Continuous-Time Dynamic Signed Networks

Paper • 2207.03408 • Published Jul 7, 2022

Revisiting the Superficial Alignment Hypothesis

Paper • 2410.03717 • Published Sep 27, 2024

updated 4 datasets over 1 year ago

codegenning/B_i_filterio24_v3

Viewer • Updated Sep 27, 2024 • 409 • 16

codegenning/B_i_filterio24_v2

Viewer • Updated Sep 27, 2024 • 409 • 8

codegenning/B_i_filterio24

Viewer • Updated Sep 27, 2024 • 409 • 29

codegenning/B_filterio24

Viewer • Updated Sep 26, 2024 • 8 • 9

authored 4 papers over 1 year ago

Chain-of-Thought Reasoning is a Policy Improvement Operator

Paper • 2309.08589 • Published Sep 15, 2023 • 2

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Paper • 2402.14688 • Published Feb 22, 2024

NATURAL PLAN: Benchmarking LLMs on Natural Language Planning

Paper • 2406.04520 • Published Jun 6, 2024 • 13

LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet

Paper • 2408.15221 • Published Aug 27, 2024

updated 7 datasets over 1 year ago

codegenning/B_mbpp_plus_v2

Viewer • Updated Aug 22, 2024 • 756 • 20

codegenning/F_mbpp_plus

Viewer • Updated Aug 22, 2024 • 378 • 17

codegenning/B_human_eval_plus_v2

Viewer • Updated Aug 22, 2024 • 328 • 22

codegenning/B_human_eval_plus

Viewer • Updated Aug 22, 2024 • 328 • 18

codegenning/B_livecodebench_lite_v3_C

Viewer • Updated Aug 19, 2024 • 876 • 40

codegenning/B_livecodebench_lite_v3

Viewer • Updated Aug 19, 2024 • 348 • 38

codegenning/B_mbpp_plus

Viewer • Updated Aug 19, 2024 • 756 • 12

updated a dataset over 1 year ago

codegenning/B_livecodebench_C

Viewer • Updated Aug 16, 2024 • 174 • 33