metaeval

university

Recent Activity

sileod authored a paper 1 day ago

Bridging the Data Provenance Gap Across Text, Speech and Video

sileod authored a paper 1 day ago

Saturation-Driven Dataset Generation for LLM Mathematical Reasoning in the TPTP Ecosystem

sileod authored a paper 1 day ago

Reasoning Core: A Scalable RL Environment for LLM Symbolic Reasoning

authored 6 papers 1 day ago

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 11

Saturation-Driven Dataset Generation for LLM Mathematical Reasoning in the TPTP Ecosystem

Paper • 2509.06809 • Published Sep 8, 2025 • 3

Reasoning Core: A Scalable RL Environment for LLM Symbolic Reasoning

Paper • 2509.18083 • Published Sep 22, 2025 • 5

MortalMATH: Evaluating the Conflict Between Reasoning Objectives and Emergency Contexts

Paper • 2601.18790 • Published Jan 26 • 2

Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization

Paper • 2602.20743 • Published 8 days ago • 2

Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training

Paper • 2603.02208 • Published 1 day ago • 4

submitted a paper to Daily Papers 1 day ago

Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training

Paper • 2603.02208 • Published 1 day ago • 4

submitted a paper to Daily Papers about 1 month ago

MortalMATH: Evaluating the Conflict Between Reasoning Objectives and Emergency Contexts

Paper • 2601.18790 • Published Jan 26 • 2

authored a paper 6 months ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24, 2025 • 77

authored 10 papers over 1 year ago

TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods

Paper • 2407.21630 • Published Jul 31, 2024 • 8

Consent in Crisis: The Rapid Decline of the AI Data Commons

Paper • 2407.14933 • Published Jul 20, 2024 • 14

Generating multiple-choice questions for medical question answering with distractors and cue-masking

Paper • 2303.07069 • Published Mar 13, 2023

Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation

Paper • 2407.13481 • Published Jul 18, 2024 • 10

Mining Discourse Markers for Unsupervised Sentence Representation Learning

Paper • 1903.11850 • Published Mar 28, 2019

tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation

Paper • 2301.05948 • Published Jan 14, 2023 • 3

MindGames: Targeting Theory of Mind in Large Language Models with Dynamic Epistemic Modal Logic

Paper • 2305.03353 • Published May 5, 2023

Probing neural language models for understanding of words of estimative probability

Paper • 2211.03358 • Published Nov 7, 2022 • 1

The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI

Paper • 2310.16787 • Published Oct 25, 2023 • 5

Scaling Synthetic Logical Reasoning Datasets with Context-Sensitive Declarative Grammars

Paper • 2406.11035 • Published Jun 16, 2024 • 1

updated a dataset over 2 years ago

metaeval/syntactic-augmentation-nli

Viewer • Updated Jun 13, 2023 • 12.2k • 40 • 2