---
license: apache-2.0
datasets:
- CodeTed/CGEDit_dataset
language:
- zh
metrics:
- accuracy
library_name: transformers
tags:
- CGED
- CSC
pipeline_tag: text2text-generation
---
# CGEDit - Chinese Grammatical Error Diagnosis by Task-Specific Instruction Tuning

Try the model in the "[Chinese Grammarly](https://huggingface.co/spaces/CodeTed/Chinese-Grammarly)" Space.

This model was obtained by fine-tuning `ClueAI/PromptCLUE-base-v1-5` on the CGEDit dataset (`CodeTed/CGEDit_dataset`).

## Model Details
### Model Description
- Language(s) (NLP): `Chinese`
- Finetuned from model: `ClueAI/PromptCLUE-base-v1-5`
### Model Sources
- Repository: [https://github.com/TedYeh/Chinese_spelling_Correction](https://github.com/TedYeh/Chinese_spelling_Correction)

## Usage
```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("CodeTed/Chinese_Grammarly")
model = T5ForConditionalGeneration.from_pretrained("CodeTed/Chinese_Grammarly")

# Prompt = task instruction + sentence to correct.
# Instruction: "Correct the wrong characters in the sentence";
# the sentence contains the typo 文張 (for 文章, "article").
input_text = '糾正句子裡的錯字: 看完那段文張,我是反對的!'
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=256)
edited_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(edited_text)
```
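
The usage snippet returns only the corrected sentence, so it can help to see which characters were changed. The following is a minimal sketch using Python's standard `difflib`. The helper name `diff_edits` is hypothetical and not part of this repository, and the `corrected` string is an assumed correction written by hand for illustration, not a recorded model output.

```python
from difflib import SequenceMatcher


def diff_edits(source: str, corrected: str):
    """Return (operation, source_span, corrected_span) for each changed span."""
    # autojunk=False keeps the comparison exact at character level.
    matcher = SequenceMatcher(a=source, b=corrected, autojunk=False)
    return [
        (tag, source[i1:i2], corrected[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# The sentence from the usage example, without the instruction prefix.
source = "看完那段文張,我是反對的!"
# Assumed correction for illustration; in practice pass `edited_text` here.
corrected = "看完那段文章,我是反對的!"

edits = diff_edits(source, corrected)
print(edits)
```

In practice, `corrected` would be the `edited_text` produced by the model, compared against the sentence that followed the instruction prefix in the prompt.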