---
license: apache-2.0
language:
- en
tags:
- medical
---
Dataset: https://www.kaggle.com/datasets/timmayer/covid-news-articles-2020-2022
A comprehensive guide can be found here: https://medium.com/@shankar.arunp/easily-build-your-own-gpt-from-scratch-using-aws-51811b6355d3
The model is GPT-2, further pre-trained on the news articles above to incorporate COVID-19-related context.
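
Further pre-training a causal language model such as GPT-2 typically starts by packing the tokenized articles into fixed-length training blocks. The following is a minimal sketch of that packing step only, not the procedure from the guide: the function name `group_into_blocks`, the `block_size` value, and the toy token IDs are all hypothetical, and a real run would take its IDs from the GPT-2 tokenizer.

```python
from typing import Dict, List


def group_into_blocks(
    token_ids_per_article: List[List[int]],
    block_size: int,
    eos_id: int,
) -> Dict[str, List[List[int]]]:
    """Pack tokenized articles into fixed-length blocks for causal LM training.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    # Concatenate all articles into one stream, marking each article
    # boundary with an end-of-text token (GPT-2 uses ID 50256 for this).
    stream: List[int] = []
    for ids in token_ids_per_article:
        stream.extend(ids)
        stream.append(eos_id)

    # Drop the trailing remainder so that every block has the same length.
    usable = (len(stream) // block_size) * block_size
    blocks = [stream[i:i + block_size] for i in range(0, usable, block_size)]

    # For next-token prediction the labels are a copy of the inputs;
    # the one-position shift is usually applied inside the model's loss.
    return {"input_ids": blocks, "labels": [list(b) for b in blocks]}


if __name__ == "__main__":
    # Toy token IDs standing in for three tokenized news articles.
    articles = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
    batch = group_into_blocks(articles, block_size=4, eos_id=0)
    for block in batch["input_ids"]:
        print(block)
```

The resulting `input_ids` and `labels` lists are in the shape a causal language modeling trainer generally expects. Dropping the remainder is the simplest choice here; padding the last block is an alternative when the corpus is small.
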
A similar article on how to train a BERT base model from scratch using the same articles can be found here: https://medium.com/@shankar.arunp/training-bert-from-scratch-on-your-custom-domain-data-a-step-by-step-guide-with-amazon-25fcbee4316a