Update README.md
README.md CHANGED
@@ -66,7 +66,7 @@ As an ongoing effort, we are working on re-contextualizing abstracts for better
 - **Model type:** Language model
 - **Developed by:**
 - PIs: Jason Clark and Hannah McKelvey
- -
+ - Fellow: Haining Wang
 - [LEADING](https://cci.drexel.edu/mrc/leading/) Montana State University Library, Project "TL;DR it": Automating Article Synopses for Search Engine Optimization and Citizen Science
 - **Language(s) (NLP):** English
 - **License:** MIT

@@ -120,7 +120,7 @@ For SAS-baseline, we finetuned the Flan-T5 model with the Scientific Abstract-Significance
 ## Setup

 We finetuned the base model with a standard language modeling objective: the abstracts are the sources and the significance statements are the targets. We prepended a task-specific prefix ("summarize, simplify, and contextualize: ") to every abstract during training. Training took roughly 9 hours on two NVIDIA RTX A5000 GPUs (24 GB memory each). We saved the checkpoint with the lowest validation loss for inference. We used the AdamW optimizer and a learning rate of 3e-5 with a fully sharded data parallel strategy. The model (\~780M parameters) was trained on Nov. 20, 2022.
- Notice, the readability of the
+ Note that the readability of the significance statements is generally lower than that of the abstracts, but not by a large margin. Our upcoming SAS-full model will leverage more corpora for scientific (re)contextualization, summarization, and simplification.


 # Evaluation
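For readers who want a concrete picture of the Setup paragraph above, here is a minimal sketch of the described recipe: Flan-T5 fine-tuned on abstract-to-significance pairs with the task prefix, using AdamW at a 3e-5 learning rate. The commit does not include training code, so the base checkpoint name, data layout, batch size, and epoch count below are assumptions, and the fully sharded data parallel strategy used in the original two-GPU run is omitted.

```python
# Minimal sketch of the fine-tuning recipe described in the card (not the authors' code).
# Assumptions: google/flan-t5-large (~780M parameters) as the base checkpoint, in-memory
# placeholder pairs, batch size, and epoch count. FSDP from the original run is omitted.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

PREFIX = "summarize, simplify, and contextualize: "
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large").to(device)

# Placeholder abstract/significance pairs; in practice these come from the SAS corpus.
pairs = [
    {"abstract": "We study ...", "significance": "This work matters because ..."},
]

def collate(batch):
    # Prefix each source abstract, tokenize, and build padded label tensors.
    enc = tokenizer([PREFIX + ex["abstract"] for ex in batch],
                    padding=True, truncation=True, max_length=1024, return_tensors="pt")
    labels = tokenizer([ex["significance"] for ex in batch],
                       padding=True, truncation=True, max_length=512,
                       return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100   # ignore padding in the loss
    enc["labels"] = labels
    return enc

loader = DataLoader(pairs, batch_size=2, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)   # optimizer and LR from the card

model.train()
for epoch in range(3):                                       # epoch count is an assumption
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss                           # standard seq2seq LM objective
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```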
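The readability comparison mentioned in the added sentence can be spot-checked at inference time. The sketch below generates a significance statement with the task prefix and scores both texts with Flesch Reading Ease via the textstat package; the checkpoint path is a placeholder rather than the published model id, and the choice of readability metric is illustrative, not necessarily the one used by the authors.

```python
# Sketch only: generate with the task prefix and compare readability.
# "path/to/sas-baseline" is a placeholder checkpoint, and Flesch Reading Ease
# (via textstat) is an illustrative metric choice.
import textstat
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

PREFIX = "summarize, simplify, and contextualize: "
checkpoint = "path/to/sas-baseline"                      # placeholder path

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

abstract = "Paste a scientific abstract here."
inputs = tokenizer(PREFIX + abstract, return_tensors="pt", truncation=True, max_length=1024)
output_ids = model.generate(**inputs, max_new_tokens=256, num_beams=4)
statement = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Higher Flesch Reading Ease means easier to read.
print("abstract FRE: ", textstat.flesch_reading_ease(abstract))
print("statement FRE:", textstat.flesch_reading_ease(statement))
print(statement)
```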