zirui3/llm-multilingual-tokenizer
# summary
A multilingual tokenizer trained on multilingual data using the SentencePiece library with the BPE algorithm.
* vocab size: 100k
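
A minimal sketch of how a tokenizer like this could be trained and loaded with the SentencePiece Python package. This is not the author's actual training setup: the corpus path, the model prefix, and the `character_coverage` value are assumptions chosen for illustration. Only the BPE model type and the 100k vocabulary size come from the summary above.

```python
from typing import Any, Dict


def build_train_args(
    corpus_path: str,
    model_prefix: str,
    vocab_size: int = 100_000,
) -> Dict[str, Any]:
    """Collect keyword arguments for SentencePieceTrainer.train."""
    return {
        "input": corpus_path,          # hypothetical path, one sentence per line
        "model_prefix": model_prefix,  # hypothetical prefix; writes <prefix>.model and <prefix>.vocab
        "vocab_size": vocab_size,      # 100k, as stated in the summary
        "model_type": "bpe",           # BPE algorithm, as stated in the summary
        "character_coverage": 0.9995,  # assumption: a common choice for multilingual corpora
    }


def train_and_load(corpus_path: str, model_prefix: str):
    """Train a BPE model on the corpus and return a loaded processor."""
    # Imported lazily so the argument builder can be used without the package installed.
    import sentencepiece as spm

    spm.SentencePieceTrainer.train(**build_train_args(corpus_path, model_prefix))
    return spm.SentencePieceProcessor(model_file=f"{model_prefix}.model")


# Example usage (requires `pip install sentencepiece` and a real corpus file):
#   sp = train_and_load("corpus.txt", "multilingual_bpe")
#   pieces = sp.encode("Hello, world!", out_type=str)
```

Training with a 100k vocabulary needs a corpus large enough to support that many merges; on a small corpus SentencePiece will report an error asking for a smaller `vocab_size`.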