A collection of ConvLLaVA models.
- ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models
  Paper • 2405.15738 • Published • 47
- ConvLLaVA/ConvLLaVA-sft-768
  Text Generation • Updated • 105 • 1
- ConvLLaVA/ConvLLaVA-sft-1024
  Text Generation • Updated • 46
- ConvLLaVA/ConvLLaVA-sft-1536
  Text Generation • Updated • 105