Patrick Haller PRO

PatrickHaller

·

HallerPatrick

AI & ML interests

NLP, Language Models, Autoregressive Models

Recent Activity

upvoted a collection 16 days ago

Kimi-Linear-A3B

authored a paper about 1 month ago

Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling

upvoted a paper about 1 month ago

Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling

Organizations

Posts 1

Post

2168

How Robust Is Your Model in Complex Code Generation Tasks? 🤔

We've launched the PECC benchmark to challenge chat models in code generation, drawing from the Advent of Code for programming tasks and Project Euler for math-heavy challenges. This new task tests models with problems presented in both detailed prose and concise "leet code" styles, evaluating their ability to understand and solve complex coding issues and math problems in chat-based interactions.

It seems that the Claude 3 models outperform ChatGPT:
Model / Avg. (pass@3)
Claude 3 Haiku / 27.67
GPT-3.5-Turbo / 23.75
Mixtral-8x22B-Instruct-v0.1 / 8.35
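
For readers unfamiliar with the pass@3 column, here is a minimal sketch of how a pass@k score can be computed. It assumes the commonly used unbiased estimator (1 - C(n-c, k) / C(n, k) for n samples with c correct); the post does not say how PECC computes its numbers, so the function names and the example counts below are hypothetical illustrations, not the benchmark's actual code or data.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: total samples generated for a problem
    c: number of those samples that passed
    k: number of attempts allowed (k <= n)
    Assumption: the standard unbiased estimator, not necessarily PECC's method.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed,
    # so the estimate is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, returned as a percentage.

    results: one (n, c) pair per problem (hypothetical input format).
    """
    scores = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up counts for three problems, three samples each.
    example = [(3, 1), (3, 0), (3, 2)]
    print(f"pass@3: {mean_pass_at_k(example, k=3):.2f}")
```

When n equals k, as in the example, the estimator reduces to "did any of the k attempts pass", which is the simplest reading of a pass@3 score.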

Read our Preprint📃: PECC: Problem Extraction and Coding Challenges (2404.18766)
Look at the dataset🔎: PatrickHaller/pecc

We also got accepted at LREC-COLING '24 🎉

Collections 5

Papers 11

arxiv:2604.28075

arxiv:2511.05560

arxiv:2504.14366

arxiv:2503.05891

Spaces 1

Pecc Leaderboard

Models 28

PatrickHaller/gdn-midtraining-sft

Text Generation • 2B • Updated Jan 27 • 3

PatrickHaller/gla-350M-10B

Text Generation • 0.4B • Updated Aug 19, 2025 • 1

PatrickHaller/babylm_2025_submission_strict-small2

0.3B • Updated Aug 16, 2025 • 3.69k

PatrickHaller/babylm_2025_submission_strict

0.3B • Updated Aug 15, 2025 • 3.72k

PatrickHaller/snowflake-arctic-embed-m-v2.0

Sentence Similarity • 0.3B • Updated Jul 10, 2025 • 56

PatrickHaller/hgrn2_pile_10M_distill_babylm

Updated Dec 18, 2024 • 17

PatrickHaller/hgrn2_pile_100m_distill_babylm

Text Generation • Updated Dec 17, 2024 • 24 • 1

PatrickHaller/babylm_transformer_strict_small_comparison

Text Generation • 0.4B • Updated Oct 9, 2024 • 8

PatrickHaller/hgrn2_de_wiki

Text Generation • Updated Sep 30, 2024 • 7

PatrickHaller/xlstm_pile_10m

1B • Updated Sep 17, 2024 • 7 • 1

Datasets 25

PatrickHaller/BabyLM2025-Strict-Dataset

Viewer • Updated Aug 15, 2025 • 1.64M • 8

PatrickHaller/BabyLM2025-Strict-Small-Dataset

Viewer • Updated Aug 15, 2025 • 255k • 14

PatrickHaller/fineweb-edu-plus

Viewer • Updated Jul 16, 2025 • 1.93M • 38

PatrickHaller/fineweb-10B

Viewer • Updated Jun 14, 2025 • 14.2M • 74

PatrickHaller/fineweb-1B

Viewer • Updated May 28, 2025 • 1.38M • 67 • 1

PatrickHaller/blimp_synth

Viewer • Updated May 27, 2025 • 75.8k • 42

PatrickHaller/fineweb-2-de-10B

Viewer • Updated May 27, 2025 • 15.3M • 997

PatrickHaller/fineweb-3B

Viewer • Updated May 21, 2025 • 4.07M • 45 • 2

PatrickHaller/fineweb-edu-3B

Viewer • Updated Feb 27, 2025 • 2.87M • 771

PatrickHaller/fineweb-2-de-3B

Updated Feb 19, 2025 • 2