Update README.md
README.md
CHANGED
@@ -39,7 +39,7 @@ widget:
 Scientific Abstract Simplification-baseline *translates* hard-to-read scientific abstracts😵 into more accessible language😇. We hope it can make scientific knowledge accessible for everyone🤗.
 
 Try it now with the Hosted inference API on the right.
-You can choose an existing example or paste in any (perhaps full-of-jargon) abstract. Remember to prepend the instruction before the abstract ("summarize, simplify, and contextualize: "; notice, there is a whitespace after the colon). Local use refers to Section [Usage](
+You can choose an existing example or paste in any (perhaps full-of-jargon) abstract. Remember to prepend the instruction before the abstract ("summarize, simplify, and contextualize: "; notice, there is a whitespace after the colon). Local use refers to Section [Usage](/Usage).
 
 
 # Model Details
@@ -65,7 +65,7 @@ As an ongoing effort, we are working on re-contextualizating abstracts for bette
 - **Developed by:**
 - Mentors: Jason Clark and Hannah McKelvey
 - Fellows: Haining Wang and Deanna Zarrillo
--
+- [LEADING](https://cci.drexel.edu/mrc/leading/) Montana State University Library ("TL;DR it": Automating Article Synopses for Search Engine Optimization and Citizen Science).
 - **Language(s) (NLP):** English
 - **License:** MIT
 - **Parent Model:** [FLAN-T5-large](https://huggingface.co/google/flan-t5-large)
@@ -87,12 +87,11 @@ model = AutoModelForSeq2SeqLM.from_pretrained("haining/sas_baseline")
 
 input_text = "The COVID-19 pandemic presented enormous data challenges in the United States. Policy makers, epidemiological modelers, and health researchers all require up-to-date data on the pandemic and relevant public behavior, ideally at fine spatial and temporal resolution. The COVIDcast API is our attempt to fill this need: Operational since April 2020, it provides open access to both traditional public health surveillance signals (cases, deaths, and hospitalizations) and many auxiliary indicators of COVID-19 activity, such as signals extracted from deidentified medical claims data, massive online surveys, cell phone mobility data, and internet search trends. These are available at a fine geographic resolution (mostly at the county level) and are updated daily. The COVIDcast API also tracks all revisions to historical data, allowing modelers to account for the frequent revisions and backfill that are common for many public health data sources. All of the data are available in a common format through the API and accompanying R and Python software packages. This paper describes the data sources and signals, and provides examples demonstrating that the auxiliary signals in the COVIDcast API present information relevant to tracking COVID activity, augmenting traditional public health reporting and empowering research and decision-making."
 
-encoding = tokenizer(
-
-
-
-
-return_tensors='pt')
+encoding = tokenizer(INSTRUCTION + input_text,
+max_length=672,
+padding='max_length',
+truncation=True,
+return_tensors='pt')
 decoded_ids = model.generate(input_ids=encoding['input_ids'],
 attention_mask=encoding['attention_mask'],
 max_new_tokens=512,
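The prompt-building step touched by this change can be sketched as below. This is a minimal illustration, not the README's actual code. The instruction string is taken from the README text, but the diff does not show how `INSTRUCTION` is defined, so its definition here is an assumption. The helpers `build_prompt` and `encode` are hypothetical names introduced for this example. The tokenizer arguments mirror the added lines in the last hunk.

```python
# Instruction prefix quoted in the README; note the single space after the colon.
# The README's own definition of INSTRUCTION is not shown in the diff, so this
# assignment is an assumption.
INSTRUCTION = "summarize, simplify, and contextualize: "


def build_prompt(abstract: str) -> str:
    """Return the abstract with the instruction prepended exactly once.

    Hypothetical helper: trims surrounding whitespace and, if the text is
    already prefixed, normalizes it to one space after the colon.
    """
    text = abstract.strip()
    bare_prefix = INSTRUCTION.strip()
    if text.startswith(bare_prefix):
        text = text[len(bare_prefix):].lstrip()
    return INSTRUCTION + text


def encode(tokenizer, abstract: str, max_length: int = 672):
    """Call a Hugging Face-style tokenizer with the settings from the diff.

    `tokenizer` is any callable that accepts these keyword arguments, for
    example an object returned by transformers.AutoTokenizer.from_pretrained.
    """
    return tokenizer(
        build_prompt(abstract),
        max_length=max_length,
        padding="max_length",
        truncation=True,
        return_tensors="pt",
    )
```

With the real model, `encode(tokenizer, input_text)` would stand in for the `tokenizer(INSTRUCTION + input_text, ...)` call in the last hunk, assuming the tokenizer is loaded from the same `haining/sas_baseline` checkpoint as the model. Padding to a fixed `max_length` gives every input the same shape. For a single abstract, `padding=True` would also work and produce a shorter tensor.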