---
license: apache-2.0
pipeline_tag: fill-mask
---

# manchuBERT

manchuBERT is a BERT-base model trained from scratch on romanized Manchu data.
ManNER and ManPOS are manchuBERT models fine-tuned for named entity recognition and part-of-speech tagging, respectively.
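Since the model is published with the `fill-mask` pipeline tag, it can be loaded with the `transformers` library. The snippet below is a minimal sketch, assuming the Hub model id `seemdog/manchuBERT` (taken from this card's citation URL); the exact predictions depend on the tokenizer and model weights.

```python
# Minimal fill-mask sketch for manchuBERT (assumes `transformers` and a
# backend such as PyTorch are installed; model id from this card's URL).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="seemdog/manchuBERT")

# Manchu here is romanized, so inputs are plain Latin-script text.
# The mask token comes from the tokenizer (BERT models use "[MASK]").
results = fill_mask(f"ilan gurun i {fill_mask.tokenizer.mask_token}")

for r in results:
    print(r["token_str"], round(r["score"], 4))
```

Each entry in `results` is a dict with the predicted token (`token_str`), its probability (`score`), and the filled sequence.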

## Data

manchuBERT utilizes the data augmentation method from *Mergen: The First Manchu-Korean Machine Translation Model Trained on Augmented Data*.

| Data | Number of Sentences (before augmentation) |
|---|---|
| Manwén Lǎodàng – Taizong | 2,220 |
| Ilan gurun i bithe | 41,904 |
| Gin ping mei bithe | 21,376 |
| Yùzhì Qīngwénjiàn | 11,954 |
| Yùzhì Zēngdìng Qīngwénjiàn | 18,420 |
| Manwén Lǎodàng – Taizu | 22,578 |
| Manchu-Korean Dictionary | 40,583 |

## Citation

```bibtex
@misc{jean_seo_2024,
    author       = { Jean Seo },
    title        = { manchuBERT (Revision 64133be) },
    year         = 2024,
    url          = { https://huggingface.co/seemdog/manchuBERT },
    doi          = { 10.57967/hf/1599 },
    publisher    = { Hugging Face }
}
```