arnosimons committed on
Commit ad78659 · verified · 1 Parent(s): 72dba0a

Update README.md

link to Astro-HEP Corpus

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -63,7 +63,7 @@ tags:
 
  # Model Card for Astro-HEP-BERT
 
- **Astro-HEP-BERT** is a bidirectional transformer designed primarily to generate contextualized word embeddings for computational conceptual analysis in astrophysics and high-energy physics (HEP). Built upon Google's `bert-base-uncased`, the model underwent additional training for three epochs using 21.84 million paragraphs found in more than 600,000 scholarly articles sourced from arXiv, all pertaining to astrophysics and/or high-energy physics (HEP). The sole training objective was masked language modeling.
+ **Astro-HEP-BERT** is a bidirectional transformer designed primarily to generate contextualized word embeddings for computational conceptual analysis in astrophysics and high-energy physics (HEP). Built upon Google's `bert-base-uncased`, the model underwent additional training for three epochs using the <a target="_blank" rel="noopener noreferrer" href="https://huggingface.co/datasets/arnosimons/astro-hep-corpus">Astro-HEP Corpus</a>, containing 21.84 million paragraphs found in more than 600,000 scholarly articles sourced from arXiv, all pertaining to astrophysics and/or high-energy physics (HEP). The sole training objective was masked language modeling.
 
  The Astro-HEP-BERT project demonstrates the general feasibility of training a customized bidirectional transformer for computational conceptual analysis in the history, philosophy, and sociology of science as an open-source endeavor that does not require a substantial budget. Leveraging only freely available code, weights, and text inputs, the entire training process was conducted on a single MacBook Pro Laptop (M2/96GB).
 