---
language: 
- zh
tags:
- chatglm
- pytorch
- zh
- Text2Text-Generation
license: bigscience-bloom-rail-1.0
widget:
- text: "对下面中文拼写纠错:\n少先队员因该为老人让坐。\n答:"

---

# Chinese Language Model (kenlm)
Two kenlm language models are provided:

- big model: zh_giga.no_cna_cmn.prune01244.klm
- small model: people2014corpus_chars.klm

## Usage

These models are released as part of the open-source [pycorrector](https://github.com/shibing624/pycorrector) project, which supports kenlm models. Use them as follows:

Install package:
```shell
pip install -U pycorrector
```

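```python
from pycorrector import Corrector

# Point language_model_path at a local copy of the small kenlm model
model = Corrector(language_model_path='people2014corpus_chars.klm')
# Expected correction: 少先队员应该为老人让座 (the exact return format depends on the pycorrector version)
print(model.correct('少先队员因该为老人让坐'))
```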
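To score sentences with the language model directly rather than through pycorrector, something like the following may work. It is a minimal sketch that assumes the `kenlm` Python bindings are installed (`pip install kenlm`) and that `people2014corpus_chars.klm` is a character-level model, so input is split into space-separated characters before scoring. The helper names `to_char_tokens`, `load_kenlm_scorer`, and `rank_candidates` are illustrative and are not part of pycorrector or kenlm.

```python
import os


def to_char_tokens(sentence):
    """Split a sentence into space-separated characters, dropping whitespace.

    Assumption: the model was trained on character tokens.
    """
    return ' '.join(ch for ch in sentence if not ch.isspace())


def load_kenlm_scorer(model_path):
    """Return a function mapping a sentence to its log10 probability."""
    import kenlm  # imported lazily so the helpers above work without it

    lm = kenlm.Model(model_path)
    return lambda sentence: lm.score(to_char_tokens(sentence), bos=True, eos=True)


def rank_candidates(score_fn, candidates):
    """Sort candidate sentences from most to least likely under score_fn."""
    return sorted(candidates, key=score_fn, reverse=True)


# Assumed local path to the small model file.
MODEL_PATH = 'people2014corpus_chars.klm'

if os.path.exists(MODEL_PATH):
    score = load_kenlm_scorer(MODEL_PATH)
    # Compare the misspelled sentence with its intended correction.
    for s in rank_candidates(score, ['少先队员因该为老人让坐', '少先队员应该为老人让座']):
        print(s, score(s))
```

A higher (less negative) score means the model considers the sentence more likely, which is the signal a kenlm-based corrector can use to prefer one candidate over another.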

To train a text error correction model, see [https://github.com/shibing624/pycorrector](https://github.com/shibing624/pycorrector).



## Citation

```latex
@software{pycorrector,
  author = {Ming Xu},
  title = {pycorrector: Text Error Correction Tool},
  year = {2023},
  url = {https://github.com/shibing624/pycorrector},
}
```