Collections

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Paper • 2410.02740 • Published Oct 3, 2024 • 52

From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Paper • 2410.01215 • Published Oct 2, 2024 • 30

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 106

EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24, 2024 • 26


stepfun-ai/GOT-OCR2_0

Midi Music Generator

OpenGVLab/InternVL2_5-78B-MPO

OpenGVLab/InternVL2_5-38B-MPO-AWQ

VILA^2: VILA Augmented VILA

Octopus v4: Graph of language models

Octo-planner: On-device Language Model for Planner-Action Agents

Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Human-like Episodic Memory for Infinite Context LLMs

MUSCLE: A Model Update Strategy for Compatible LLM Evolution

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training

ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

Symbolic Learning Enables Self-Evolving Agents

Agent Laboratory: Using LLM Agents as Research Assistants

MotionLLM: Understanding Human Behaviors from Human Motions and Videos

Spectrally Pruned Gaussian Fields with Neural Compensation

Paint by Inpaint: Learning to Add Image Objects by Removing Them First

LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

More Agents Is All You Need

OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

Generative Agents: Interactive Simulacra of Human Behavior

Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

LoRA+: Efficient Low Rank Adaptation of Large Models

The FinBen: An Holistic Financial Benchmark for Large Language Models

TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

TrustLLM: Trustworthiness in Large Language Models

DocGraphLM: Documental Graph Language Model for Information Extraction

Understanding LLMs: A Comprehensive Overview from Training to Inference

DocLLM: A layout-aware generative language model for multimodal document understanding

Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration