Collections

Collections including paper arxiv:2411.17116

Pending Classification

about 3 hours ago

Video Creation by Demonstration

Paper • 2412.09551 • Published 13 days ago • 8
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published 16 days ago • 45
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

Paper • 2412.06531 • Published 17 days ago • 71
APOLLO: SGD-like Memory, AdamW-level Performance

Paper • 2412.05270 • Published 19 days ago • 38

efficient inference

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published about 1 month ago • 47

ShowAndTell-2024-12-03

Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS

Paper • 2411.18478 • Published 29 days ago • 32
o1-Coder: an o1 Replication for Coding

Paper • 2412.00154 • Published 27 days ago • 40
A Simple and Provable Scaling Law for the Test-Time Compute of Large Language Models

Paper • 2411.19477 • Published 27 days ago • 5
Reverse Thinking Makes LLMs Stronger Reasoners

Paper • 2411.19865 • Published 27 days ago • 19

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published about 1 month ago • 47

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published about 1 month ago • 47

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published about 1 month ago • 47

Natural Language Reinforcement Learning

Paper • 2411.14251 • Published Nov 21 • 26
The Impossible Test: A 2024 Unsolvable Dataset and A Chance for an AGI Quiz

Paper • 2411.14486 • Published Nov 20 • 7
Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published about 1 month ago • 47
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS

Paper • 2411.18478 • Published 29 days ago • 32

about 18 hours ago

Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering

Paper • 2411.11504 • Published Nov 18 • 19
Top-nσ: Not All Logits Are You Need

Paper • 2411.07641 • Published Nov 12 • 18
Adaptive Decoding via Latent Preference Optimization

Paper • 2411.09661 • Published Nov 14 • 10
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

Paper • 2411.13476 • Published Nov 20 • 15

Selective Attention Improves Transformer

Paper • 2410.02703 • Published Oct 3 • 23
Differential Transformer

Paper • 2410.05258 • Published Oct 7 • 168
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention

Paper • 2410.05076 • Published Oct 7 • 7
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Paper • 2410.13276 • Published Oct 17 • 25

Interesting Papers

These papers are interesting (to me)

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Paper • 2410.02740 • Published Oct 3 • 52
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Paper • 2410.01215 • Published Oct 2 • 30
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25 • 104
EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24 • 25
