---
license: cc-by-nc-3.0
language:
- da
pipeline_tag: fill-mask
tags:
- bert
- danish
widget:
- text: Hvide blodlegemer beskytter kroppen mod [MASK]
---

# Danish medical BERT

MeDa-BERT was initialized with weights from a [pretrained Danish BERT model](https://huggingface.co/Maltehb/danish-bert-botxo) and pretrained for 48 epochs using the masked language modeling (MLM) objective on a Danish medical corpus of 123M tokens.

The development of the corpus and model is described further in [this paper](https://aclanthology.org/2023.nodalida-1.31/).

Here is an example of how to load the model in PyTorch using the [🤗Transformers](https://github.com/huggingface/transformers) library:

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Download (or load from the local cache) the tokenizer and the MLM model weights
tokenizer = AutoTokenizer.from_pretrained("indsigt-ai/MeDa-BERT")
model = AutoModelForMaskedLM.from_pretrained("indsigt-ai/MeDa-BERT")
```

### Citing

```
@inproceedings{pedersen-etal-2023-meda,
    title = "{M}e{D}a-{BERT}: A medical {D}anish pretrained transformer model",
    author = "Pedersen, Jannik  and
      Laursen, Martin  and
      Vinholt, Pernille  and
      Savarimuthu, Thiusius Rajeeth",
    booktitle = "Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)",
    month = may,
    year = "2023",
    address = "T{\'o}rshavn, Faroe Islands",
    publisher = "University of Tartu Library",
    url = "https://aclanthology.org/2023.nodalida-1.31",
    pages = "301--307",
}
```
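
### Example: predicting a masked token

Since the model card declares the `fill-mask` pipeline tag, one way to try the model is through the Transformers `pipeline` helper. The sketch below is illustrative rather than an official usage recipe: the function names `predict_masked` and `format_predictions` are hypothetical, and the code assumes the pipeline returns a list of dictionaries with `token_str` and `score` keys, as the fill-mask pipeline does for a single `[MASK]` token. The input sentence is the widget text from the metadata (roughly, "White blood cells protect the body against [MASK]").

```python
def format_predictions(predictions, top_k=3):
    """Format fill-mask results as 'token (score)' strings, best first.

    `predictions` is assumed to be a list of dicts with at least the keys
    "token_str" and "score" (the shape returned for a single mask token).
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [f'{p["token_str"]} ({p["score"]:.3f})' for p in ranked[:top_k]]


def predict_masked(text, model_name="indsigt-ai/MeDa-BERT", top_k=5):
    """Return the top_k candidate fillers for the [MASK] token in `text`."""
    # Imported lazily so format_predictions can be used without transformers.
    from transformers import pipeline

    # Downloads the model on first use, then reads it from the local cache.
    fill_mask = pipeline("fill-mask", model=model_name)
    return fill_mask(text, top_k=top_k)
```

Calling `format_predictions(predict_masked("Hvide blodlegemer beskytter kroppen mod [MASK]"))` should yield the three highest-scoring candidate tokens with their probabilities. The exact tokens and scores depend on the downloaded weights, so none are listed here.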