NLLB-200 Distilled-350M_en2ko
The NLLB-200 models showed outstanding performance on translation tasks and contributed to solving problems with low-resource languages. Despite these efforts, it is still hard to run the 600M or 1B+ models for those without a capable computing environment. So I made a much smaller model specialized in translating English to Korean. You can also run it on CPU (no mixed precision, no quantization).
Model
Base model: NLLB-200 600M (the reduced configuration below can be verified as sketched after the list)
- Parameters: 350,537,728 (350M)
- Encoder layers: 12 -> 3
- Decoder layers: 12 -> 3
- FFN dimension: 4096 (same)
- Embed dimension: 1024 (same)
- Vocab size: 256206 (same)
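These sizes can be checked directly from the checkpoint. A minimal sketch, assuming the standard transformers config fields exposed by NLLB's M2M100-based architecture:

```python
from transformers import AutoConfig, AutoModelForSeq2SeqLM

# Inspect the distilled checkpoint's architecture (field names follow
# transformers' M2M100 config, which NLLB checkpoints are based on)
config = AutoConfig.from_pretrained('dhtocks/nllb-200-distilled-350M_en-ko')
print(config.encoder_layers, config.decoder_layers)   # expected: 3 3
print(config.encoder_ffn_dim, config.d_model)         # expected: 4096 1024
print(config.vocab_size)                               # expected: 256206

# Count parameters; expected roughly 350,537,728
model = AutoModelForSeq2SeqLM.from_pretrained('dhtocks/nllb-200-distilled-350M_en-ko')
print(sum(p.numel() for p in model.parameters()))
```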
License: CC-BY-NC
Data
- Training Data: NLLB dataset
- Evaluation Data: Flores-200 dataset
Metric
- CPU: Intel(R) Xeon(R) CPU @ 2.20GHz (16 cores)
- GPU: NVIDIA L4 24GB
| Model | #Params | chrF(++) | GPU inference time (s) | CPU inference time (s) |
|---|---|---|---|---|
| NLLB-200 3.3B | 3.3B | 34.3 | 0.98 | 4.65 |
| NLLB-200 1.3B | 1.3B | 32.1 | 0.89 | 2.46 |
| NLLB-200 600M | 600M | 32.0 | 0.43 | 1.52 |
| NLLB-200 350M (ours) | 350M | 24.6 | 0.24 | 1.43 |
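The benchmarking script itself is not part of this card; below is a minimal sketch of how the CPU timing and chrF++ columns could be reproduced, assuming sacrebleu for the metric and placeholder `sources`/`references` lists drawn from Flores-200:

```python
import time

from sacrebleu.metrics import CHRF
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained(
    'dhtocks/nllb-200-distilled-350M_en-ko', forced_bos_token_id=256098
)
tokenizer = AutoTokenizer.from_pretrained(
    'dhtocks/nllb-200-distilled-350M_en-ko', src_lang='eng_Latn', tgt_lang='kor_Hang'
)

sources = ['...']     # placeholder: Flores-200 English sentences
references = ['...']  # placeholder: Flores-200 Korean references

hypotheses, elapsed = [], 0.0
for src in sources:
    inputs = tokenizer(src, return_tensors='pt')
    start = time.perf_counter()
    output = model.generate(**inputs)   # timed: generation only
    elapsed += time.perf_counter() - start
    hypotheses.append(tokenizer.decode(output[0], skip_special_tokens=True))

print(f'avg inference time: {elapsed / len(sources):.2f} s')
# word_order=2 enables the word n-grams of chrF++
print(CHRF(word_order=2).corpus_score(hypotheses, [references]))
```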
Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# 256098 is the id of the kor_Hang language token, so decoding always
# starts in Korean
model = AutoModelForSeq2SeqLM.from_pretrained(
    'dhtocks/nllb-200-distilled-350M_en-ko', forced_bos_token_id=256098
)
tokenizer = AutoTokenizer.from_pretrained(
    'dhtocks/nllb-200-distilled-350M_en-ko', src_lang='eng_Latn', tgt_lang='kor_Hang'
)

inputs = tokenizer('[YOUR_INPUT]', return_tensors='pt')
output = model.generate(**inputs)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
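If you prefer not to hard-code 256098, the target-language token id can be looked up from the tokenizer instead; a short sketch, assuming the NLLB language codes are present in the vocabulary:

```python
# Look up the kor_Hang token id instead of hard-coding 256098
kor_id = tokenizer.convert_tokens_to_ids('kor_Hang')
output = model.generate(**inputs, forced_bos_token_id=kor_id)
```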
Citation
@misc{oh2024nllb200distilled350Menko,
  title={NLLB-200 distilled_350M_en-ko},
  author={Saechan Oh},
  year={2024}
}