---
license: cc-by-sa-4.0
language:
- ko
tags:
- korean
---
# KoBigBird-RoBERTa-large

This is a large-sized Korean BigBird model introduced in our paper (publication link pending; to be presented at IJCNLP-AACL 2023). The model draws heavily on the parameters of klue/roberta-large to ensure high performance. By employing the BigBird architecture and incorporating the newly proposed TAPER, the language model accommodates even longer input lengths.
## How to Use

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the tokenizer and masked-language-modeling checkpoint from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")
```
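As a quick sanity check, the checkpoint can be used to fill in a masked token. The sketch below is illustrative and not part of the original card; the Korean example sentence and the top-1 decoding step are our own assumptions.

```python
import torch

# Illustrative input: insert the tokenizer's mask token into a Korean sentence
text = f"한국의 수도는 {tokenizer.mask_token}이다."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and decode the highest-scoring token
mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```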
## Hyperparameters
## Results

Measured on the validation sets of the KLUE benchmark datasets.
## Limitations

While our model achieves strong results even without additional pretraining, further pretraining could refine the positional representations even more.
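For readers who want to attempt such additional pretraining themselves, a minimal sketch of continued masked-language-model training with the Transformers Trainer is shown below. The corpus file, maximum sequence length, and training hyperparameters are placeholders we assume for illustration, not values recommended by the authors.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")

# Placeholder corpus: one Korean document per line in a local text file
dataset = load_dataset("text", data_files={"train": "korean_corpus.txt"})

def tokenize(batch):
    # Long sequences are the point of the BigBird architecture; 4096 here is an assumption
    return tokenizer(batch["text"], truncation=True, max_length=4096)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Standard MLM objective with 15% random masking
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="kobigbird-roberta-large-continued",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    learning_rate=1e-5,
)

Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator).train()
```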
## Citation Information
To Be Announced