πŸ“ˆ Financial Korean ELECTRA model

Pretrained ELECTRA Language Model for Korean (finance-koelectra-small-discriminator)

ELECTRA is a method for self-supervised language representation learning. It can be used to pre-train transformer networks using relatively little compute. ELECTRA models are trained to distinguish "real" input tokens from "fake" input tokens generated by another neural network, similar to the discriminator of a GAN.

More details about ELECTRA can be found in the ICLR paper or in the official ELECTRA repository on GitHub.
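
As a rough sketch of that objective, ElectraForPreTraining also accepts per-token labels (0 = original token, 1 = replaced token) and returns the replaced token detection loss. The snippet below is only an illustration under simplifying assumptions, not the actual pre-training script: the corrupted token is drawn at random here, whereas during pre-training a small generator network proposes the replacements.

from transformers import ElectraForPreTraining, ElectraTokenizer
import torch

model_name = "krevas/finance-koelectra-small-discriminator"
discriminator = ElectraForPreTraining.from_pretrained(model_name)
tokenizer = ElectraTokenizer.from_pretrained(model_name)

# encode a sentence and corrupt one position with a random vocabulary id
# (a hand-made stand-in for the generator used during real pre-training)
orig_ids = tokenizer.encode("내일 ν•΄λ‹Ή μ’…λͺ©μ΄ λŒ€ν­ μƒμŠΉν•  것이닀", return_tensors="pt")
fake_ids = orig_ids.clone()
corrupt_pos = orig_ids.shape[1] // 2
fake_ids[0, corrupt_pos] = torch.randint(5, tokenizer.vocab_size, (1,)).item()

# per-token labels: 1 where the input token was replaced, 0 elsewhere
labels = (fake_ids != orig_ids).long()

outputs = discriminator(fake_ids, labels=labels)
print(outputs[0])  # binary cross-entropy loss of the replaced-token-detection head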

Stats

The current version of the model is trained on financial news data from Naver News.

The final training corpus has a size of 25GB and contains 2.3B tokens.

This model was trained as a cased model on a TITAN RTX GPU for 500k steps.

Usage

from transformers import ElectraForPreTraining, ElectraTokenizer
import torch

discriminator = ElectraForPreTraining.from_pretrained("krevas/finance-koelectra-small-discriminator")
tokenizer = ElectraTokenizer.from_pretrained("krevas/finance-koelectra-small-discriminator")

# original sentence ("The stock will rise sharply tomorrow") and a corrupted copy
# in which "λŒ€ν­" (sharply) is replaced with "λ§›μžˆκ²Œ" (deliciously)
sentence = "내일 ν•΄λ‹Ή μ’…λͺ©μ΄ λŒ€ν­ μƒμŠΉν•  것이닀"
fake_sentence = "내일 ν•΄λ‹Ή μ’…λͺ©μ΄ λ§›μžˆκ²Œ μƒμŠΉν•  것이닀"

fake_tokens = tokenizer.tokenize(fake_sentence)
fake_inputs = tokenizer.encode(fake_sentence, return_tensors="pt")
discriminator_outputs = discriminator(fake_inputs)

# map each logit to 0 (original token) or 1 (replaced token)
predictions = torch.round((torch.sign(discriminator_outputs[0]) + 1) / 2)
# drop the [CLS]/[SEP] positions so the scores line up with fake_tokens
token_predictions = predictions[0].tolist()[1:-1]

[print("%7s" % token, end="") for token in fake_tokens]
print()
[print("%7s" % int(prediction), end="") for prediction in token_predictions]
print()
print("fake token : %s" % fake_tokens[token_predictions.index(1)])

Huggingface model hub

All models are available on the Huggingface model hub.
