University of Illinois at Urbana-Champaign

university

Recent Activity

emrecanacikgoz authored a paper 4 days ago

Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model

UIUC-CS's activity

Minjia

authored 11 papers 20 days ago

Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs

Paper • 2310.01801 • Published Oct 3, 2023 • 3

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Paper • 2310.04610 • Published Oct 6, 2023 • 1

ZeRO-Offload: Democratizing Billion-Scale Model Training

Paper • 2101.06840 • Published Jan 18, 2021

Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native

Paper • 2401.12230 • Published Jan 17, 2024 • 1

DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales

Paper • 2308.01320 • Published Aug 2, 2023 • 45

DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale

Paper • 2201.05596 • Published Jan 14, 2022 • 2

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Paper • 2211.05100 • Published Nov 9, 2022 • 29

Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers

Paper • 2211.11586 • Published Nov 17, 2022 • 1

Universal Checkpointing: Efficient and Flexible Checkpointing for Large Scale Distributed Training

Paper • 2406.18820 • Published Jun 27, 2024

Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks

Paper • 2407.08454 • Published Jul 11, 2024

DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale

Paper • 2207.00032 • Published Jun 30, 2022

Minjia

authored a paper about 1 month ago

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published Dec 16, 2024 • 55

yanQval

authored a paper 2 months ago

Context-Aware Sparse Deep Coordination Graphs

Paper • 2106.02886 • Published Jun 5, 2021

shaoweiliu

authored a paper 5 months ago

PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

Paper • 2409.18964 • Published Sep 27, 2024 • 26

Jessemel

authored 2 papers 5 months ago

SingleInsert: Inserting New Concepts from a Single Image into Text-to-Image Models for Flexible Editing

Paper • 2310.08094 • Published Oct 12, 2023 • 1

CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer

Paper • 2207.04808 • Published Jul 11, 2022

Minjia

authored a paper 8 months ago

UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

Paper • 2407.05282 • Published Jul 7, 2024 • 13

MatouK98

authored a paper 8 months ago

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

Paper • 2407.00617 • Published Jun 30, 2024 • 7

MatouK98

authored a paper 9 months ago

Offline Learning in Markov Games with General Function Approximation

Paper • 2302.02571 • Published Feb 6, 2023

Deema

authored a paper about 1 year ago

CIDAR: Culturally Relevant Instruction Dataset For Arabic

Paper • 2402.03177 • Published Feb 5, 2024 • 6