Details about expansion

#9
by icoderzqliu - opened

Hello, in the blog you mentioned that 'After width expansion, there was a significant decline in the model's performance'. Could you share some details about the width expansion? Was it achieved by expanding the dimensions of the hidden layer, or by some other method? Thank you!

You may find useful ideas in these two papers: https://arxiv.org/abs/2112.11446 and https://arxiv.org/abs/2110.07143
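For readers wondering what "expanding the dimensions of the hidden layer" can look like in practice, here is a minimal sketch of one common approach: function-preserving widening, often called Net2Net-style. It is not confirmed to be the method used for this model. The two-layer ReLU MLP, the function names `widen_hidden` and `mlp`, and the `seed` parameter are all assumptions made for illustration.

```python
import random


def widen_hidden(w1, b1, w2, new_width, seed=0):
    """Widen the hidden layer of a 2-layer MLP without changing its output.

    Hypothetical helper for illustration only.
    w1: [d_in][d_hidden], b1: [d_hidden], w2: [d_hidden][d_out]
    """
    old_width = len(b1)
    if new_width < old_width:
        raise ValueError("new_width must be >= the current hidden width")
    rng = random.Random(seed)
    # Each new hidden unit copies a randomly chosen existing unit.
    mapping = list(range(old_width)) + [
        rng.randrange(old_width) for _ in range(new_width - old_width)
    ]
    # How many copies of each original unit exist after widening.
    counts = [mapping.count(j) for j in range(old_width)]
    # Incoming weights and biases are duplicated as-is.
    w1_new = [[row[j] for j in mapping] for row in w1]
    b1_new = [b1[j] for j in mapping]
    # Outgoing weights are divided by the copy count so the sum is unchanged.
    w2_new = [[v / counts[j] for v in w2[j]] for j in mapping]
    return w1_new, b1_new, w2_new


def mlp(x, w1, b1, w2, b2):
    """Forward pass: ReLU(x @ w1 + b1) @ w2 + b2, for a single input vector."""
    hidden = [
        max(0.0, sum(xi * w1[i][j] for i, xi in enumerate(x)) + b1[j])
        for j in range(len(b1))
    ]
    return [
        sum(h * w2[j][k] for j, h in enumerate(hidden)) + b2[k]
        for k in range(len(b2))
    ]
```

Calling `widen_hidden(w1, b1, w2, new_width)` and running `mlp` with the old and the new weights should give the same output up to floating-point error, since every duplicated unit's contribution is split evenly among its copies. Whether performance holds up during continued training is a separate question, which is what the original post is asking about.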

itsliupeng changed discussion status to closed
