Collections including paper arxiv:2412.15115
-
GenEx: Generating an Explorable World
Paper • 2412.09624 • Published • 89
IamCreateAI/Ruyi-Mini-7B
Image-to-Video • Updated • 6.21k • 585
Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation
Paper • 2412.06016 • Published • 20
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published • 89
-
Phi-4 Technical Report
Paper • 2412.08905 • Published • 106
Evaluating and Aligning CodeLLMs on Human Preference
Paper • 2412.05210 • Published • 47
Evaluating Language Models as Synthetic Data Generators
Paper • 2412.03679 • Published • 46
Yi-Lightning Technical Report
Paper • 2412.01253 • Published • 26
-
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 75
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published • 89
Qwen2.5 Technical Report
Paper • 2412.15115 • Published • 341
YuLan-Mini: An Open Data-efficient Language Model
Paper • 2412.17743 • Published • 64
-
LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification
Paper • 2411.19638 • Published • 6
Word Sense Linking: Disambiguating Outside the Sandbox
Paper • 2412.09370 • Published • 9
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Paper • 2412.13663 • Published • 125
Qwen2.5 Technical Report
Paper • 2412.15115 • Published • 341
-
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
Paper • 2411.11504 • Published • 20
Top-nσ: Not All Logits Are You Need
Paper • 2411.07641 • Published • 20
Adaptive Decoding via Latent Preference Optimization
Paper • 2411.09661 • Published • 10
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Paper • 2411.13476 • Published • 15
-
Qwen2.5 Technical Report
Paper • 2412.15115 • Published • 341
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Paper • 2412.11605 • Published • 17
483📈
Scaling test-time compute
-
Reverse Thinking Makes LLMs Stronger Reasoners
Paper • 2411.19865 • Published • 20
-
Differential Transformer
Paper • 2410.05258 • Published • 169
PaliGemma 2: A Family of Versatile VLMs for Transfer
Paper • 2412.03555 • Published • 124
VisionZip: Longer is Better but Not Necessary in Vision Language Models
Paper • 2412.04467 • Published • 105
o1-Coder: an o1 Replication for Coding
Paper • 2412.00154 • Published • 43