Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2410.13166

Papers - Training - Dataset Selection - Spectrogram Features

An Evolved Universal Transformer Memory

Paper • 2410.13166 • Published Oct 17, 2024 • 3

Papers - Training - Backward Masking

An Evolved Universal Transformer Memory

Paper • 2410.13166 • Published Oct 17, 2024 • 3

Papers - Training - Feature Extraction - Frequency - STFT

An Evolved Universal Transformer Memory

Paper • 2410.13166 • Published Oct 17, 2024 • 3

Papers - KV Cache - Spectrogram

An Evolved Universal Transformer Memory

Paper • 2410.13166 • Published Oct 17, 2024 • 3

Papers - Attention - Spectrogram - KV Cache

An Evolved Universal Transformer Memory

Paper • 2410.13166 • Published Oct 17, 2024 • 3

Perception and abstraction. Each modality is tokenized and embedded into vectors for model to comprehend.

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24, 2024 • 40
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117
Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published Jun 26, 2024 • 48
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Paper • 2408.15518 • Published Aug 28, 2024 • 43

Papers - Llama 3 - Fine-tuning

How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published Apr 22, 2024 • 45
LiteSearch: Efficacious Tree Search for LLM

Paper • 2407.00320 • Published Jun 29, 2024 • 38
Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 44
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models

Paper • 2411.09595 • Published Nov 14, 2024 • 72

Papers - KV Cache

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Paper • 2404.11912 • Published Apr 18, 2024 • 17
SnapKV: LLM Knows What You are Looking for Before Generation

Paper • 2404.14469 • Published Apr 22, 2024 • 24
LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 258
An Evolved Universal Transformer Memory

Paper • 2410.13166 • Published Oct 17, 2024 • 3

Papers - Training - Long Context

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Paper • 2404.07143 • Published Apr 10, 2024 • 106
RULER: What's the Real Context Size of Your Long-Context Language Models?

Paper • 2404.06654 • Published Apr 9, 2024 • 35
An Evolved Universal Transformer Memory

Paper • 2410.13166 • Published Oct 17, 2024 • 3

Papers - Custom Layers - MLP

MLP Can Be A Good Transformer Learner

Paper • 2404.05657 • Published Apr 8, 2024 • 1
Toward a Better Understanding of Fourier Neural Operators: Analysis and Improvement from a Spectral Perspective

Paper • 2404.07200 • Published Apr 10, 2024 • 1
An inclusive review on deep learning techniques and their scope in handwriting recognition

Paper • 2404.08011 • Published Apr 10, 2024 • 1
Long-form music generation with latent diffusion

Paper • 2404.10301 • Published Apr 16, 2024 • 25

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs