Are you comparing Jamba Mini 1.6 (12B active/52B total) with 7~8B models?
#3, opened by win10
Are you comparing Jamba Mini 1.6 (12B active/52B total) with 7~8B models?
Don't you think it's stupid?
The architecture is completely different, and it's 12B active parameters vs. 8B active parameters. That's not that big of a gap.
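For reference, here is a quick sketch of the arithmetic behind both sides of this. It uses only the figures from the thread title, with 8B standing in for the "7~8B" dense models. Treating active parameters as a rough proxy for per-token compute and total parameters as a proxy for memory footprint is an assumption, not something measured here.

```python
# Parameter counts in billions, taken from the thread title.
# dense_baseline_b = 8 is an assumed stand-in for the "7~8B" dense models.
jamba_active_b = 12
jamba_total_b = 52
dense_baseline_b = 8

# Active parameters: rough proxy for per-token compute (assumption).
active_ratio = jamba_active_b / dense_baseline_b
# Total parameters: rough proxy for weights held in memory (assumption).
total_ratio = jamba_total_b / dense_baseline_b
# Share of the model's weights used for any single token.
active_fraction = jamba_active_b / jamba_total_b

print(f"active vs dense: {active_ratio:.2f}x")
print(f"total vs dense:  {total_ratio:.2f}x")
print(f"active share of total: {active_fraction:.0%}")
```

So the gap looks like 1.5x if you count active parameters and 6.5x if you count total parameters, which is probably why the two views disagree.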
Is this a 52B MoE model, with the active 12B portion being the experts?