recruit-jp/japanese-typo-detector-roberta-base


keisuke-kiryu committed on Nov 17, 2023

Commit ba37e17 · 1 Parent(s): ba95365

Update README.md

Files changed (1)

README.md +20 -1

README.md CHANGED

@@ -23,7 +23,26 @@ widget:
 # How to use the model
   ```python
-    import transformers
   ```
 # Training data

 # How to use the model
   ```python
+    import numpy as np
+    import torch
+    from transformers import AutoTokenizer, AutoModelForTokenClassification
+    model_name = 'recruit-jp/japanese-typo-detector-roberta-base'
+    tokenizer = AutoTokenizer.from_pretrained(model_name)
+    model = AutoModelForTokenClassification.from_pretrained(model_name)
+    device = "cuda:0" if torch.cuda.is_available() else "cpu"
+    model = model.to(device)
+    # Sample input; "真相学習" appears to be a deliberate typo for "深層学習" (deep learning).
+    in_text = "これは日本語の誤植を検出する真相学習モデルです。"
+    test_inputs = tokenizer(in_text, return_tensors='pt').get('input_ids')
+    test_outputs = model(test_inputs.to(torch.device(device)))
+    # Assumes one token per character, with [CLS] and [SEP] added around the text.
+    for chara, logit in zip(["[CLS]"] + list(in_text) + ["[SEP]"], test_outputs.logits.squeeze().tolist()):
+        err_type_ind = np.argmax(logit)
+        err_name = model.config.id2label[err_type_ind]
+        # Index 0 is treated as "no error"; any other index is reported as a typo.
+        err_desc = f"★typo(err_index={err_type_ind}, err_name={err_name})" if err_type_ind > 0 else ""
+        print(f"{chara} : {err_desc}")
   ```
 # Training data