Sparse Mixture of Experts Language Model from Scratch: Extending makeMoE with Expert Capacity • Mar 18 • 8
microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224 • Zero-Shot Image Classification • Updated Sep 27 • 119k • 239