Collections
Collections including paper arxiv:2410.13166
-
VILA^2: VILA Augmented VILA
Paper • 2407.17453 • Published • 40
Octopus v4: Graph of language models
Paper • 2404.19296 • Published • 117
Octo-planner: On-device Language Model for Planner-Action Agents
Paper • 2406.18082 • Published • 48
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Paper • 2408.15518 • Published • 43
-
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
Paper • 2404.14047 • Published • 45
LiteSearch: Efficacious Tree Search for LLM
Paper • 2407.00320 • Published • 38
Cut Your Losses in Large-Vocabulary Language Models
Paper • 2411.09009 • Published • 44
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models
Paper • 2411.09595 • Published • 72
-
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Paper • 2404.11912 • Published • 17
SnapKV: LLM Knows What You are Looking for Before Generation
Paper • 2404.14469 • Published • 24
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper • 2312.11514 • Published • 258
An Evolved Universal Transformer Memory
Paper • 2410.13166 • Published • 3
-
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper • 2404.07143 • Published • 106
RULER: What's the Real Context Size of Your Long-Context Language Models?
Paper • 2404.06654 • Published • 35
An Evolved Universal Transformer Memory
Paper • 2410.13166 • Published • 3
-
MLP Can Be A Good Transformer Learner
Paper • 2404.05657 • Published • 1
Toward a Better Understanding of Fourier Neural Operators: Analysis and Improvement from a Spectral Perspective
Paper • 2404.07200 • Published • 1
An inclusive review on deep learning techniques and their scope in handwriting recognition
Paper • 2404.08011 • Published • 1
Long-form music generation with latent diffusion
Paper • 2404.10301 • Published • 25