ali619
Update tokenizer with fa-en corpus dataset
42655c6 verified