OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer Paper • 2401.16658 • Published Jan 30 • 13
LongAlign: A Recipe for Long Context Alignment of Large Language Models Paper • 2401.18058 • Published Jan 31 • 21
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities Paper • 2401.15071 • Published Jan 26 • 34
Deconstructing Denoising Diffusion Models for Self-Supervised Learning Paper • 2401.14404 • Published Jan 25 • 16
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities Paper • 2401.14405 • Published Jan 25 • 11
MM-LLMs: Recent Advances in MultiModal Large Language Models Paper • 2401.13601 • Published Jan 24 • 44
Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI Paper • 2401.14019 • Published Jan 25 • 19