aisingapore/llama3.1-8b-cpt-sea-lionv3-base

tainc committed about 1 month ago

Commit 5e2ad22 · verified · 1 Parent(s): 09bca8b

Update README.md

Files changed (1)

README.md +1 -1

@@ -33,7 +33,7 @@ SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.
 ## Model Details
 ### Model Description
-The continued pre-training data for Llama3.1 8B CPT SEA-LIONv3 Base encompasses approximately 200B tokens and includes the 11 official Southeast Asian languages: English, Chinese, Vietnamese, Indonesian, Thai, Tamil, Filipino, Malay, Khmer, Lao, Burmese.
+The continued pre-training data for Llama3.1 8B CPT SEA-LIONv3 Base encompasses approximately 200B tokens across the 11 official Southeast Asian languages: English, Chinese, Vietnamese, Indonesian, Thai, Tamil, Filipino, Malay, Khmer, Lao, Burmese.
 For tokenisation, the model employs the default tokenizer used in Llama3.1 8B Instruct.