---
license: apache-2.0
datasets:
- CodeTed/CGEDit_dataset
language:
- zh
metrics:
- accuracy
library_name: transformers
tags:
- CGED
- CSC
pipeline_tag: text2text-generation
---

# CGEDit - Chinese Grammatical Error Diagnosis by Task-Specific Instruction Tuning

Try the model in the [Chinese Grammarly](https://huggingface.co/spaces/CodeTed/Chinese-Grammarly) Space.

This model was obtained by fine-tuning `ClueAI/PromptCLUE-base-v1-5` on the `CodeTed/CGEDit_dataset` dataset.

## Model Details

### Model Description

- Language(s) (NLP): `Chinese`
- Finetuned from model: `ClueAI/PromptCLUE-base-v1-5`

### Model Sources

- Repository: [https://github.com/TedYeh/Chinese_spelling_Correction](https://github.com/TedYeh/Chinese_spelling_Correction)

## Usage

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("CodeTed/Chinese_Grammarly")
model = T5ForConditionalGeneration.from_pretrained("CodeTed/Chinese_Grammarly")

# The input is a task instruction followed by the sentence to edit.
# Instruction: "Correct the typos in the sentence"; the sentence misspells 文章 as 文張.
input_text = '糾正句子裡的錯字: 看完那段文張,我是反對的!'
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=256)
edited_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(edited_text)
```
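
To see which characters a correction changed, the input sentence can be compared against the returned text. The sketch below is not part of the model or its repository: `list_edits` is a hypothetical helper built on Python's standard `difflib`, and `hypothetical_output` is an assumed corrected sentence used for illustration, not a recorded model output.

```python
from difflib import SequenceMatcher


def list_edits(original: str, corrected: str):
    """Return (operation, position, old_text, new_text) for each changed span.

    Hypothetical helper for illustration; positions are character offsets
    into `original`.
    """
    matcher = SequenceMatcher(None, original, corrected, autojunk=False)
    edits = []
    for op, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if op != "equal":
            edits.append(
                (op, a_start, original[a_start:a_end], corrected[b_start:b_end])
            )
    return edits


# The sentence from the usage example, without the instruction prefix.
source = "看完那段文張,我是反對的!"
# Assumed correction for illustration only; substitute the model's `edited_text`.
hypothetical_output = "看完那段文章,我是反對的!"

for edit in list_edits(source, hypothetical_output):
    print(edit)
```

In practice, pass the sentence without the instruction prefix as `original` and the decoded `edited_text` as `corrected`, so that the prefix itself is not reported as a deletion.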