Collections

Collections including paper arxiv:2503.18908

Perception and abstraction. Each modality is tokenized and embedded into vectors for the model to comprehend.

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24, 2024 • 41
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 118
Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published Jun 26, 2024 • 48
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Paper • 2408.15518 • Published Aug 28, 2024 • 43

attention parallel

FFN Fusion: Rethinking Sequential Computation in Large Language Models

Paper • 2503.18908 • Published 10 days ago • 17

Papers to review

Just an EZ way to collect papers on HF

about 19 hours ago

Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21, 2024 • 32
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation

Paper • 2503.04872 • Published 28 days ago • 14
FFN Fusion: Rethinking Sequential Computation in Large Language Models

Paper • 2503.18908 • Published 10 days ago • 17

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Paper • 2408.12528 • Published Aug 22, 2024 • 51
FFN Fusion: Rethinking Sequential Computation in Large Language Models

Paper • 2503.18908 • Published 10 days ago • 17

Graph insights with Ai and LLM

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Paper • 2310.17796 • Published Oct 26, 2023 • 18
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster

Paper • 2311.08263 • Published Nov 14, 2023 • 16
Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 112
ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning

Paper • 2502.04689 • Published Feb 7 • 7
