JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models Paper • 2311.05997 • Published Nov 10, 2023
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning Paper • 2310.09478 • Published Oct 14, 2023
Adaptive Frequency Filters As Efficient Global Token Mixers Paper • 2307.14008 • Published Jul 26, 2023