Elijah Wilt

ooj

AI & ML interests

None yet

Recent Activity

liked a model 14 days ago

unsloth/DeepSeek-R1-GGUF

liked a model 14 days ago

deepseek-ai/DeepSeek-R1

upvoted a paper 15 days ago

LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models

Organizations

None yet

ooj's activity

upvoted 13 papers 15 days ago

LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models

Paper • 2411.09595 • Published Nov 14, 2024 • 72

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 89

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 125

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published 26 days ago • 90

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 27 days ago • 252

Entropy-Guided Attention for Private LLMs

Paper • 2501.03489 • Published 28 days ago • 14

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published 25 days ago • 87

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published 25 days ago • 66

Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published 24 days ago • 80

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 18 days ago • 104

upvoted a collection 16 days ago

Qwen2.5-Math

Collection

Math-specific model series based on Qwen2.5 • 11 items • Updated 21 days ago • 67

upvoted 5 papers 16 days ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published 22 days ago • 89

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 20 days ago • 271

Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography

Paper • 2501.08970 • Published 19 days ago • 6

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 18 days ago • 67

The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models

Paper • 2501.09653 • Published 18 days ago • 12

upvoted a paper 7 months ago

Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117