Update README.md
README.md CHANGED
@@ -66,7 +66,7 @@ As an ongoing effort, we are working on re-contextualizing abstracts for better
 - **Model type:** Language model
 - **Developed by:**
 - PIs: Jason Clark and Hannah McKelvey
- -
+ - Fellow: Haining Wang
 - [LEADING](https://cci.drexel.edu/mrc/leading/) Montana State University Library, Project "TL;DR it": Automating Article Synopses for Search Engine Optimization and Citizen Science
 - **Language(s) (NLP):** English
 - **License:** MIT

@@ -120,7 +120,7 @@ For SAS-baseline, we finetuned the Flan-T5 model with the Scientific Abstract-Significance
 ## Setup

 We finetuned the base model with a standard language modeling objective: the abstracts are the sources and the significance statements are the targets. We prepended a task-specific prefix ("summarize, simplify, and contextualize: ") to every abstract during training. Training took roughly 9 hours on two NVIDIA RTX A5000 GPUs (24 GB memory each). We saved the checkpoint with the lowest validation loss for inference. We used the AdamW optimizer and a learning rate of 3e-5 with a fully sharded data parallel strategy. The model (\~780M parameters) was trained on Nov. 20, 2022.
- Notice, the readability of the
+ Note that the readability of the significance statements is generally lower than that of the abstracts, but not by a large margin. Our upcoming SAS-full model will leverage more corpora for scientific (re)contextualization, summarization, and simplification.


 # Evaluation
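For readers who want a concrete picture of the Setup paragraph above, here is a minimal sketch of the described recipe: Flan-T5 fine-tuned on abstract-to-significance pairs with the task prefix, using AdamW at a 3e-5 learning rate. The commit does not include training code, so the base checkpoint name, data layout, batch size, and epoch count below are assumptions, and the fully sharded data parallel strategy used in the original two-GPU run is omitted.

```python
# Minimal sketch of the fine-tuning recipe described in the card (not the authors' code).
# Assumptions: google/flan-t5-large (~780M parameters) as the base checkpoint, in-memory
# placeholder pairs, batch size, and epoch count. FSDP from the original run is omitted.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

PREFIX = "summarize, simplify, and contextualize: "
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large").to(device)

# Placeholder abstract/significance pairs; in practice these come from the SAS corpus.
pairs = [
    {"abstract": "We study ...", "significance": "This work matters because ..."},
]

def collate(batch):
    # Prefix each source abstract, tokenize, and build padded label tensors.
    enc = tokenizer([PREFIX + ex["abstract"] for ex in batch],
                    padding=True, truncation=True, max_length=1024, return_tensors="pt")
    labels = tokenizer([ex["significance"] for ex in batch],
                       padding=True, truncation=True, max_length=512,
                       return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100   # ignore padding in the loss
    enc["labels"] = labels
    return enc

loader = DataLoader(pairs, batch_size=2, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)   # optimizer and LR from the card

model.train()
for epoch in range(3):                                       # epoch count is an assumption
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss                           # standard seq2seq LM objective
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```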
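The readability comparison mentioned in the added sentence can be spot-checked at inference time. The sketch below generates a significance statement with the task prefix and scores both texts with Flesch Reading Ease via the textstat package; the checkpoint path is a placeholder rather than the published model id, and the choice of readability metric is illustrative, not necessarily the one used by the authors.

```python
# Sketch only: generate with the task prefix and compare readability.
# "path/to/sas-baseline" is a placeholder checkpoint, and Flesch Reading Ease
# (via textstat) is an illustrative metric choice.
import textstat
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

PREFIX = "summarize, simplify, and contextualize: "
checkpoint = "path/to/sas-baseline"                      # placeholder path

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

abstract = "Paste a scientific abstract here."
inputs = tokenizer(PREFIX + abstract, return_tensors="pt", truncation=True, max_length=1024)
output_ids = model.generate(**inputs, max_new_tokens=256, num_beams=4)
statement = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Higher Flesch Reading Ease means easier to read.
print("abstract FRE: ", textstat.flesch_reading_ease(abstract))
print("statement FRE:", textstat.flesch_reading_ease(statement))
print(statement)
```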