A new idea to improve training and inference performance
#82 opened by lijip26313
Hello Google Team, I have an idea that could significantly improve LLM training and inference performance: https://www.kaggle.com/code/vasilypodorov/fast-language-modelling-with-un-formers
There I have trained a 0.7B-parameter LLM with a new architecture, reaching a throughput of approximately 0.7B tokens per hour on a TPU v3-8. The details are in the notebook linked above.
Could you read it and tell me whether it makes sense? I would like you to try training a small LLM based on this technique to decide whether it is useful. This would take only a few TPU-days, as the rough estimate below suggests.
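To make "a few TPU-days" concrete, here is a back-of-envelope sketch. Only the 0.7B parameter count and the ~0.7B tokens/hour throughput come from my notebook; the token budget (a Chinchilla-style ~20 tokens per parameter) is an assumption for the sake of the estimate:

```python
# Rough sanity check of the "a few TPU-days" claim.
# Parameter count and throughput are from the notebook;
# the training-token budget is an assumption.

params = 0.7e9            # model size from the notebook
tokens_per_hour = 0.7e9   # measured throughput on TPU v3-8
tokens_per_param = 20     # assumed Chinchilla-style budget

token_budget = params * tokens_per_param   # ~14B tokens
hours = token_budget / tokens_per_hour     # ~20 hours
print(f"~{token_budget / 1e9:.0f}B tokens -> ~{hours:.0f} h "
      f"(~{hours / 24:.1f} TPU v3-8 days)")
```

Even with a token budget several times larger than this assumption, the run stays within a few TPU-days, since the estimate scales linearly with the budget.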