Model Description
BioTinyBERT is the result of continually pre-training the TinyBERT model for 200k training steps on the PubMed dataset, using a total batch size of 192.
Initialisation
We initialise our model with the pre-trained checkpoint of the TinyBERT model available on the Hugging Face Hub.
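As a rough illustration, the sketch below shows what this continual masked-language-modelling pre-training could look like with the Hugging Face Trainer. The starting checkpoint (huawei-noah/TinyBERT_General_4L_312D), the local PubMed abstracts file, the learning rate, and the single-device batch layout are assumptions for the example and are not values reported in this card.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Assumed starting checkpoint: the 4-layer general-distilled TinyBERT on the Hub.
checkpoint = "huawei-noah/TinyBERT_General_4L_312D"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Placeholder corpus: a local text file of PubMed abstracts, one per line (assumption).
dataset = load_dataset("text", data_files={"train": "pubmed_abstracts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Standard masked-language-modelling collator (15% masking is an assumed default).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="bio-tinybert",
    max_steps=200_000,                # 200k training steps, as described above
    per_device_train_batch_size=192,  # total batch size of 192, single device assumed
    learning_rate=5e-5,               # assumed; not specified in this card
    save_steps=10_000,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
```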
Architecture
This model uses 4 hidden layers with a hidden dimension and embedding size of 312, resulting in a total of approximately 15M parameters.
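For reference, a minimal sketch of loading the released model and inspecting its size, assuming the Hub identifier nlpie/bio-tinybert (adjust if the repository name differs):

```python
from transformers import AutoModel, AutoTokenizer

# Assumed Hub identifier for this model.
model_id = "nlpie/bio-tinybert"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

print(model.config.num_hidden_layers)                    # number of hidden layers (4)
print(model.config.hidden_size)                          # hidden/embedding size
print(sum(p.numel() for p in model.parameters()) / 1e6)  # parameter count in millions (~15)
```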
Citation
If you use this model, please consider citing the following paper:
@article{rohanian2023effectiveness,
title={On the effectiveness of compact biomedical transformers},
author={Rohanian, Omid and Nouriborji, Mohammadmahdi and Kouchaki, Samaneh and Clifton, David A},
journal={Bioinformatics},
volume={39},
number={3},
pages={btad103},
year={2023},
publisher={Oxford University Press}
}