google/switch-base-256
Text2Text Generation
This release included various MoE (Mixture of Experts) models based on the T5 architecture. The base models use from 8 to 256 experts.
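
To make the idea of "experts" concrete, here is a minimal, dependency-free sketch of top-1 routing, the scheme generally associated with Switch-style MoE layers: a router scores each expert for a token, the token is sent to the single highest-scoring expert, and that expert's output is scaled by the router probability. Everything here is an illustrative assumption rather than the released models' actual code: `make_expert`, `route_top1`, the toy dimension, and the random router weights are all hypothetical stand-ins.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical stand-in for an expert feed-forward network."""
    return lambda token: [scale * x for x in token]


def route_top1(token, router_weights, experts):
    """Send the token to the single expert with the highest router probability."""
    # One router logit per expert: dot product of the token with that expert's row.
    logits = [sum(t * w for t, w in zip(token, row)) for row in router_weights]
    probs = softmax(logits)
    best = max(range(len(experts)), key=lambda i: probs[i])
    # Scale the chosen expert's output by its router probability (the gate value).
    gated = [probs[best] * y for y in experts[best](token)]
    return best, gated


NUM_EXPERTS = 8  # smallest expert count mentioned above; toy value for the sketch
DIM = 4          # toy hidden size (assumption, far smaller than a real model)

rng = random.Random(0)
router_weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [make_expert(i + 1) for i in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
expert_index, output = route_top1(token, router_weights, experts)
print("routed to expert", expert_index, "output", output)
```

Because only one expert runs per token, adding more experts in a design like this grows the parameter count without a matching increase in per-token compute.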