How does this model compare to e.g. gpt2?
#1, opened by julien-c (HF staff)
In terms of generation capabilities.
The MBZUAI team have a nice repo where they showcase and evaluate performance across a set of architectures.
As their project name ("LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions") suggests, they train these models on output from gpt-3.5-turbo (see the data section of their README).
The original (Python) model can be found here.