---
library_name: tokenizers
tags: [Danish, Morphological Tokenization, CerebrasGPT]
---
### DA-MORPH-CEREBRAS-TOKEN
DA-MORPH-CEREBRAS-TOKEN is a morphological tokenizer for Danish, designed for use with the CerebrasGPT architecture. It segments Danish text according to linguistic principles, so that the resulting subword tokens correspond more closely to meaningful units of the language.
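
To give a rough idea of what morphologically motivated segmentation looks like, here is a minimal toy sketch in plain Python. It is not this tokenizer's actual algorithm or vocabulary: the `MORPHEMES` inventory and the `segment` function are hypothetical and exist only for illustration, and the greedy longest-match strategy is an assumption chosen for simplicity.

```python
# Hypothetical toy morpheme inventory (NOT the tokenizer's real vocabulary).
MORPHEMES = {"bil", "hus", "er", "ne", "et", "en"}


def segment(word, morphemes=MORPHEMES):
    """Split a word into known morphemes using greedy longest-match.

    Characters that cannot be matched to any morpheme are emitted
    one at a time as a fallback.
    """
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in morphemes:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No known morpheme starts here: fall back to a single character.
            pieces.append(word[i])
            i += 1
    return pieces


print(segment("bilerne"))  # ['bil', 'er', 'ne']  (stem + plural + definite)
print(segment("huset"))    # ['hus', 'et']        (stem + definite)
```

A purely greedy matcher like this can pick linguistically wrong boundaries on ambiguous words, which is one reason a real morphological tokenizer needs more than a lookup table.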