Commit 6d3e2ce by sequoiaandrade · 1 parent: 265a3cf

Update README.md

Files changed (1)
  1. README.md +3 -5
README.md CHANGED
@@ -13,17 +13,15 @@ widget:
  - text: "UAS was climbing to 11,000 ft. msl on a reconnaissance mission when it experienced a rapid and uncommanded descent. The [MASK] took no action but monitored instruments until the aircraft regained a stable profile. "
  example_title: "Example 3"
  ---
- # Note: Model appears to not be working for MLM currently
-
  # Manager for Intelligent Knowledge Acess (MIKA)
 
  # SafeAeroBERT: A Safety-Informed Aviation-Specific Langauge Model
 
- base-bert-uncased model first further pre-trained on the set of Aviation Safety Reporting System (ASRS) documents up to November of 2022 and National Trasportation Safety Board (NTSB) accident reports up to November 2022.A total of 2,283,435 narrative sections and 1,169,118,720 tokens from over 400,000 NTSB and ASRS documents are used for model pre-training.
+ base-bert-uncased model first further pre-trained on the set of Aviation Safety Reporting System (ASRS) documents up to November of 2022 and National Trasportation Safety Board (NTSB) accident reports up to November 2022. A total of 2,283,435 narrative sections are split 90/10 for training and validation, with 1,052,207,104 tokens from over 350,000 NTSB and ASRS documents used for pre-training.
 
- The model was trained on just over two epochs using `AutoModelForMaskedLM.from_pretrained` with a `learning_rate=1e-5`, and total batch size of 384 for just over 13000 training steps.
+ The model was trained on two epochs using `AutoModelForMaskedLM.from_pretrained` with a `learning_rate=1e-5`, and total batch size of 128 for just over 32100 training steps.
 
- The model was evaluted on a downstream binary document classification task by fine-tuning the model with `AutoModelForSequenceClassification.from_pretrained`. SafeAeroBERT was compared to SciBERT and base-BERT on this task, with the following performance:
+ An earlier version of the model was evaluted on a downstream binary document classification task by fine-tuning the model with `AutoModelForSequenceClassification.from_pretrained`. SafeAeroBERT was compared to SciBERT and base-BERT on this task, with the following performance:
 
 
  |Contributing Factor | Metric |BERT | SciBERT | SafeAeroBERT|
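
The updated figures in this commit (90/10 split, 1,052,207,104 training tokens, batch size 128, two epochs, "just over 32100" steps) can be sanity-checked with a little arithmetic. The sketch below is not taken from the repository: it assumes every narrative section is padded or truncated to a fixed 512 tokens (BERT's maximum input length), which the README does not state, and `training_steps` is a hypothetical helper written for this check.

```python
import math

# Assumption (not stated in the README): each narrative section is padded or
# truncated to a fixed 512 tokens, BERT's maximum input length.
SEQ_LEN = 512


def training_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps for full passes over the data, counting the final partial batch."""
    return math.ceil(num_examples / batch_size) * epochs


# Figures quoted in the updated README text.
total_sections = 2_283_435
train_tokens = 1_052_207_104

# Number of training sections implied by the token count.
train_sections = train_tokens // SEQ_LEN
train_fraction = train_sections / total_sections

steps = training_steps(train_sections, batch_size=128, epochs=2)

print(f"training sections: {train_sections:,}")
print(f"training fraction: {train_fraction:.4f}")
print(f"steps for 2 epochs at batch size 128: {steps:,}")
```

Under that assumption the token count corresponds to roughly 90% of the 2,283,435 sections, and two epochs at batch size 128 come to a little over 32,100 steps, which agrees with the new wording. If the actual run used a different sequence length or gradient accumulation, the numbers would need to be recomputed.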