---
license: bsd-3-clause
language:
- vi
---

# PhoGPT: Generative Pre-training for Vietnamese

We release a 7.5B-parameter generative model series named PhoGPT for Vietnamese, which includes the base pre-trained monolingual model **PhoGPT-7B5** and its instruction-following variant **PhoGPT-7B5-Instruct**. PhoGPT-7B5 is a Transformer decoder-based model that incorporates flash attention (Triton) and ALiBi for context-length extrapolation. We pre-trained PhoGPT-7B5 from scratch on a 41GB pre-training corpus of Vietnamese texts. We then fine-tuned the pre-trained PhoGPT-7B5 for instruction following on a dataset of 150K Vietnamese prompt-response pairs, resulting in PhoGPT-7B5-Instruct.

We recommend using the latest PhoGPT versions, PhoGPT-4B and PhoGPT-4B-Instruct, for better performance and efficiency.

For further information or requests, please visit [PhoGPT's homepage](https://github.com/VinAIResearch/PhoGPT)!
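
As a quick start, the sketch below shows one way the instruction-following model could be loaded and queried with Hugging Face `transformers`. It is a minimal sketch under stated assumptions: that the checkpoint is published under the hub id `vinai/PhoGPT-7B5-Instruct`, that the repository ships its custom model code (loaded via `trust_remote_code=True`), and that a "### Câu hỏi / ### Trả lời" ("Question / Answer") prompt template is appropriate; check the homepage for the exact template used during instruction fine-tuning.

```python
# Minimal usage sketch. Assumptions: hub id "vinai/PhoGPT-7B5-Instruct",
# custom model code loaded via trust_remote_code, and an illustrative
# "### Câu hỏi / ### Trả lời" prompt template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "vinai/PhoGPT-7B5-Instruct"  # assumed Hugging Face hub id

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,   # half precision keeps the 7.5B model within ~15GB
    trust_remote_code=True,       # the repo provides its own modeling code
)
model.eval()

# Illustrative instruction prompt ("Write a short essay about Hanoi.").
prompt = "### Câu hỏi: Viết bài văn ngắn về Hà Nội.\n### Trả lời:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```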