microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated about 16 hours ago • 1.01M • 1.28k
Running 543 Vision Arena (Testing VLMs side-by-side) 🖼️ Analyze images to detect and label objects