Update README.md
README.md
CHANGED
@@ -12,6 +12,8 @@ GatorTron-Base is pre-trained using a dataset consisting of:
- 2.5B words from WikiText,
- 0.5B words of de-identified clinical notes from MIMIC-III

+ The Github for GatorTron is at : https://github.com/uf-hobi-informatics-lab/GatorTron
+

<h2>De-identification</h2>

We applied a de-identification system to remove protected health information (PHI) from clinical text. We adopted the safe-harbor method to identify 18 PHI categories defined in the Health Insurance Portability and Accountability Act (HIPAA) and replaced them with dummy strings (e.g., replace people’s names into [\*\*NAME\*\*]).
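
As a rough illustration of the replacement step described above, the following Python sketch substitutes already-detected PHI spans with category placeholders such as [\*\*NAME\*\*]. It is not the GatorTron de-identification system, which this README does not show. The `PhiSpan` type, the `redact` function, the character offsets, and the `DATE` label are hypothetical names chosen for the example, and detecting the spans is assumed to happen in an upstream step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhiSpan:
    """Hypothetical record for one detected PHI mention."""
    start: int      # character offset, inclusive
    end: int        # character offset, exclusive
    category: str   # placeholder label, e.g. "NAME" (assumed naming)


def redact(text: str, spans: list[PhiSpan]) -> str:
    """Replace each span with a dummy string of the form [**CATEGORY**]."""
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        if span.start < cursor:
            # Overlap resolution is out of scope for this sketch.
            raise ValueError("overlapping PHI spans are not handled")
        pieces.append(text[cursor:span.start])   # text before the PHI mention
        pieces.append(f"[**{span.category}**]")  # dummy string for the mention
        cursor = span.end
    pieces.append(text[cursor:])                 # remaining text after the last span
    return "".join(pieces)


# Made-up example note; the offsets are written by hand for illustration.
note = "John Smith was admitted on 2020-01-05."
spans = [PhiSpan(0, 10, "NAME"), PhiSpan(27, 37, "DATE")]
print(redact(note, spans))  # prints: [**NAME**] was admitted on [**DATE**].
```

Working from character offsets keeps the replacement step independent of how the PHI mentions were found, so the same function would apply whether the spans come from rules or from a trained tagger.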