PixMo Collection • A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 10 items • Updated 26 days ago • 68
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • Updated 12 days ago • 1.08M • 1.28k