Minghui Jia

Maxwell-Jia

AI & ML interests

None yet

Recent Activity

upvoted a paper 24 days ago

Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?

updated a collection 24 days ago

Daily arXiv

updated a collection 25 days ago

Daily arXiv

Maxwell-Jia's activity

upvoted a paper 24 days ago

Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?

Paper • 2504.00509 • Published 25 days ago • 21

upvoted an article about 1 month ago

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Article • Mar 12 • 400

upvoted a paper about 1 month ago

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Paper • 2503.16419 • Published Mar 20 • 72

upvoted a paper about 2 months ago

Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published Feb 25 • 48

upvoted 4 papers 2 months ago

upvoted 2 papers 3 months ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 73

O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning

Paper • 2501.12570 • Published Jan 22 • 28

upvoted a collection 3 months ago

DeepSeek-R1

Collection • 8 items • Updated Jan 21 • 618

upvoted a paper 3 months ago

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16 • 41

upvoted 2 papers 4 months ago

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

Paper • 2412.21187 • Published Dec 30, 2024 • 42

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 44

upvoted a collection 4 months ago

LLaMA-O1-1129 Datasets, Models, Codes and Papers

Collection • 8 items • Updated Dec 3, 2024 • 18

upvoted 3 papers 4 months ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 101

Phi-4 Technical Report

Paper • 2412.08905 • Published Dec 12, 2024 • 116

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published Dec 12, 2024 • 99

upvoted 2 papers 5 months ago

ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?

Paper • 2411.06469 • Published Nov 10, 2024 • 17

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 50