metadata
license: apache-2.0
language:
  - mk
  - en
tags:
  - axolotl

MKLLM-7B

MKLLM-7B is an open-source Large Language Model for the Macedonian language. The model is built on top of the amazing Mistral-7B-v0.1 model through continued pretraining on a mix of Macedonian and English text. The training corpus contains around 300M tokens and was repeated for 2 epochs. Although this is small compared to other similar projects, the resulting model is very capable of understanding and processing the Macedonian language.

We have built two instruction models on top of the base model to showcase its potential.

  1. MKLLM-7B-Instruct: An instruction-tuned model that performs better than leading models of the same size:

[Figure: comparison of MKLLM-7B-Instruct with other models of the same size]

  2. MKLLM-7B-Translate: An LLM-as-a-translator implementation with quite impressive performance:

[Figure: translation performance of MKLLM-7B-Translate]

Notes

  • MKLLM-7B is a base model and is not intended for deployment without fine-tuning. The model has no moderation mechanisms.
  • MKLLM-7B can hallucinate and produce factually incorrect output. This is especially pronounced when discussing Macedonian topics, because the training dataset is relatively small.
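
Since MKLLM-7B is a base model, it is suited to plain text continuation rather than chat-style prompting. The following is a minimal sketch of loading it with the Hugging Face `transformers` library. It assumes the weights are published under the repository id `trajkovnikola/MKLLM-7B` (an assumption, adjust it to the actual location) and that `torch`, `transformers`, and `accelerate` are installed. The helper name `generate_continuation` is hypothetical and used only for illustration.

```python
# Assumed repository id; replace with the actual location of the weights.
MODEL_ID = "trajkovnikola/MKLLM-7B"


def generate_continuation(prompt: str, max_new_tokens: int = 64) -> str:
    """Continue `prompt` with the base model (plain completion, no chat template)."""
    # Heavy dependencies are imported lazily, so this module can be imported
    # without loading torch or downloading any weights.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # half precision to reduce memory use
        device_map="auto",           # requires the `accelerate` package
    )

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Greedy decoding; no gradients are needed for inference.
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # The decoded text contains the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_continuation("Скопје е главен град на")` would download the weights on first use and return the prompt followed by the model's continuation. As noted above, the output is unmoderated and may be factually incorrect.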

Built with Axolotl