|
---
license: mit
datasets:
- bigbio/chemdner
- ncbi_disease
- jnlpba
- bigbio/n2c2_2018_track2
- bigbio/bc5cdr
language:
- en
metrics:
- precision
- recall
- f1
pipeline_tag: token-classification
tags:
- token-classification
- biology
- medical
- zero-shot
- few-shot
library_name: transformers
---
|
# Zero and few shot NER for biomedical texts |
|
|
|
## Model description |
|
The model takes two strings as input. String1 is the NER label and must be a phrase naming the entity type (e.g. 'Drug'). String2 is a short text in which String1 is searched for semantically.

The model outputs a list of zeros and ones, one per token of String2 (as produced by the transformer tokenizer), where 1 marks tokens belonging to an occurrence of the named entity.
|
|
|
## Example of usage |
|
```python
from transformers import AutoTokenizer
from transformers import BertForTokenClassification

modelname = 'ProdicusII/ZeroShotBioNER'  # model path on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(modelname)  # load the tokenizer of that model

string1 = 'Drug'
string2 = 'No recent antibiotics or other nephrotoxins, and no symptoms of UTI with benign UA.'
encodings = tokenizer(string1, string2, is_split_into_words=False,
                      padding=True, truncation=True, add_special_tokens=True,
                      return_offsets_mapping=False, max_length=512, return_tensors='pt')

model = BertForTokenClassification.from_pretrained(modelname, num_labels=2)
prediction_logits = model(**encodings).logits  # shape: (1, sequence_length, 2)
print(prediction_logits)
```
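
To turn the raw logits into the per-token 0/1 labels described above, take an argmax over the class dimension and align the result with the tokenizer's tokens. A minimal sketch of this decoding step, using a dummy logits tensor and token list in place of the real model output (the real tensor comes from `model(**encodings).logits`, and label 1 is assumed to mark entity tokens):

```python
import torch

# Dummy logits standing in for model(**encodings).logits:
# batch of 1, 6 tokens, 2 classes (0 = outside entity, 1 = entity)
logits = torch.tensor([[[2.0, -1.0],    # [CLS]
                        [1.5, -0.5],    # 'no'
                        [0.9, -0.2],    # 'recent'
                        [-0.8, 2.1],    # 'antibiotics' -> entity
                        [1.0, -1.0],    # 'or'
                        [2.2, -2.0]]])  # [SEP]

predictions = logits.argmax(dim=-1)[0]  # one 0/1 label per token
tokens = ['[CLS]', 'no', 'recent', 'antibiotics', 'or', '[SEP]']
entity_tokens = [tok for tok, p in zip(tokens, predictions.tolist()) if p == 1]
print(entity_tokens)  # ['antibiotics']
```

With real model output, the token list would come from `tokenizer.convert_ids_to_tokens(encodings['input_ids'][0])` instead of being hard-coded.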
|
|
|
## Code availability
|
|
|
The code used for training and testing the model is available at https://github.com/br-ai-ns-institute/Zero-ShotNER
|
|
|
## Citation |