language:
- en
datasets:
- pubmed
- ml4pubmed/pubmed-classification-20k
metrics:
- f1
tags:
- text-classification
- document sections
- sentence classification
- document classification
- medical
- health
- biomedical
pipeline_tag: text-classification
widget:
- text: >-
    many pathogenic processes and diseases are the result of an erroneous
    activation of the complement cascade and a number of inhibitors of
    complement have thus been examined for anti-inflammatory actions.
  example_title: background example
- text: a total of 192 mi patients and 140 control persons were included.
  example_title: methods example
- text: >-
    mi patients had 18 % higher plasma levels of map44 (iqr 11-25 %) as
    compared to the healthy control group (p < 0. 001.)
  example_title: results example
- text: >-
    the finding that a brief cb group intervention delivered by real-world
    providers significantly reduced mdd onset relative to both brochure
    control and bibliotherapy is very encouraging, although effects on
    continuous outcome measures were small or nonsignificant and approximately
    half the magnitude of those found in efficacy research, potentially
    because the present sample reported lower initial depression.
  example_title: conclusions example
- text: >-
    in order to understand and update the prevalence of myopia in taiwan, a
    nationwide survey was performed in 1995.
  example_title: objective example
license: apache-2.0
BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section
- original model file name: textclassifer_BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pubmed_20k
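- This is a fine-tuned checkpoint of microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext for document section text classification: given a piece of text (for example, a sentence from an abstract), it predicts which section of a publication the text belongs to.
- Possible document section classes are: BACKGROUND, CONCLUSIONS, METHODS, OBJECTIVE, RESULTS.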
usage in python
Install transformers as needed:
pip install -U transformers
Run the following, changing the example text to your use case:
from transformers import pipeline

model_tag = "ml4pubmed/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section"

# Load the fine-tuned checkpoint as a text-classification pipeline
classifier = pipeline(
    "text-classification",
    model=model_tag,
)

# Example sentence; replace it with your own text
prompt = """
Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
"""

# Classify the sentence; the result is a list of dicts with a label and a score
result = classifier(prompt)
print(result)
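To label a whole abstract, the sentences can be passed to the classifier as a list and then grouped by predicted section. The following is a minimal sketch, not part of the original model code: the helper name `group_by_section` and the `min_score` cut-off are hypothetical, and the helper only assumes that the classifier returns one `{"label": ..., "score": ...}` dict per input sentence, as a text-classification pipeline does when called with a list of strings.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

# A pipeline-like callable: a list of sentences in, one
# {"label": ..., "score": ...} dict per sentence out.
Classifier = Callable[[List[str]], List[dict]]


def group_by_section(
    sentences: Sequence[str],
    classify: Classifier,
    min_score: float = 0.0,
) -> Dict[str, List[str]]:
    """Group sentences under their predicted section label.

    `min_score` is a hypothetical confidence cut-off: predictions that
    score below it are dropped from the result.
    """
    predictions = classify(list(sentences))
    grouped: Dict[str, List[str]] = defaultdict(list)
    for sentence, pred in zip(sentences, predictions):
        if pred["score"] >= min_score:
            grouped[pred["label"]].append(sentence)
    return dict(grouped)
```

With the `classifier` created above, a call could look like `group_by_section(sentences, classifier, min_score=0.5)`, where `sentences` is your own list of strings. Splitting an abstract into sentences is left to the caller.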
metadata
training_metrics
val_accuracy: 0.8678670525550842
val_matthewscorrcoef: 0.8222037553787231
val_f1score: 0.866841197013855
val_cross_entropy: 0.3674609065055847
epoch: 8.0
train_accuracy_step: 0.83984375
train_matthewscorrcoef_step: 0.7790813446044922
train_f1score_step: 0.837363600730896
train_cross_entropy_step: 0.39843088388442993
train_accuracy_epoch: 0.8538406491279602
train_matthewscorrcoef_epoch: 0.8031334280967712
train_f1score_epoch: 0.8521654605865479
train_cross_entropy_epoch: 0.4116102457046509
test_accuracy: 0.8578397035598755
test_matthewscorrcoef: 0.8091378808021545
test_f1score: 0.8566917181015015
test_cross_entropy: 0.3963385224342346
date_run: Apr-22-2022_t-19
huggingface_tag: microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext
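The metrics above include accuracy and the Matthews correlation coefficient. This card does not show how they were computed during training, so the following is only a sketch of the standard multiclass definitions in plain Python, useful for checking the model on your own labeled sentences. The function names and the example labels in the usage note are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def matthews_corrcoef(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Multiclass Matthews correlation coefficient.

    Uses the standard multiclass generalization:
    (c*s - sum_k t_k*p_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2)),
    where s is the sample count, c the number of correct predictions,
    t_k the true count of class k and p_k its predicted count.
    """
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    numerator = c * s - sum(true_counts[k] * pred_counts[k] for k in labels)
    cov_true = s * s - sum(n * n for n in true_counts.values())
    cov_pred = s * s - sum(n * n for n in pred_counts.values())
    denominator = math.sqrt(cov_true * cov_pred)
    # Convention: return 0.0 when the coefficient is undefined
    return numerator / denominator if denominator else 0.0
```

For example, `matthews_corrcoef(["METHODS", "RESULTS"], ["METHODS", "RESULTS"])` evaluates a perfect prediction, and any labels that appear in only one of the two lists are handled through the zero default of `Counter`.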