metadata

language:
  - en
datasets:
  - pubmed
  - ml4pubmed/pubmed-classification-20k
metrics:
  - f1
tags:
  - text-classification
  - document sections
  - sentence classification
  - document classification
  - medical
  - health
  - biomedical
pipeline_tag: text-classification
widget:
  - text: >-
      many pathogenic processes and diseases are the result of an erroneous
      activation of the complement cascade and a number of inhibitors of
      complement have thus been examined for anti-inflammatory actions.
    example_title: background example
  - text: a total of 192 mi patients and 140 control persons were included.
    example_title: methods example
  - text: >-
      mi patients had 18 % higher plasma levels of map44 (iqr 11-25 %) as
      compared to the healthy control group (p < 0. 001.)
    example_title: results example
  - text: >-
      the finding that a brief cb group intervention delivered by real-world
      providers significantly reduced mdd onset relative to both brochure
      control and bibliotherapy is very encouraging, although effects on
      continuous outcome measures were small or nonsignificant and approximately
      half the magnitude of those found in efficacy research, potentially
      because the present sample reported lower initial depression.
    example_title: conclusions example
  - text: >-
      in order to understand and update the prevalence of myopia in taiwan, a
      nationwide survey was performed in 1995.
    example_title: objective example
license: apache-2.0

BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section

Original model file name: textclassifer_BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pubmed_20k

This is a fine-tuned checkpoint of microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext for document section text classification: given a sentence, it predicts which section of a paper the sentence belongs to.

Possible document section classes are: BACKGROUND, CONCLUSIONS, METHODS, OBJECTIVE, RESULTS.

usage in python

Install or upgrade transformers as needed:

pip install -U transformers

Run the following, changing the example text to your use case:

from transformers import pipeline

model_tag = "ml4pubmed/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section"

# load the fine-tuned checkpoint into a text-classification pipeline
classifier = pipeline("text-classification", model=model_tag)

# sentence to classify; replace with your own text
prompt = """
Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
"""

# returns a list with the predicted section label and its score
classifier(prompt)
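
To label a whole abstract rather than a single sentence, the sentences can be passed to the pipeline as a list and then grouped by predicted section. The following is a minimal sketch, not part of the original model card: the helper names group_by_section and classify_abstract are hypothetical, and it assumes the pipeline returns one {"label", "score"} dict per input sentence and that the label strings match the class names listed above.

```python
from collections import defaultdict

MODEL_TAG = "ml4pubmed/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section"


def group_by_section(sentences, predictions):
    """Group sentences under the label predicted for each one.

    `predictions` is assumed to hold one {"label": ..., "score": ...} dict
    per sentence, in the same order as `sentences`.
    """
    sections = defaultdict(list)
    for sentence, pred in zip(sentences, predictions):
        sections[pred["label"]].append(sentence)
    return dict(sections)


def classify_abstract(sentences):
    """Classify a list of sentences and group them by predicted section.

    Downloads the model on first use.
    """
    # imported lazily so group_by_section can be used without transformers
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_TAG)
    return group_by_section(sentences, classifier(sentences))
```

Calling classify_abstract with a list of sentence strings returns a dict keyed by predicted label, each value being the sentences assigned to that section in their original order.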

metadata

training_metrics

val_accuracy: 0.8678670525550842
val_matthewscorrcoef: 0.8222037553787231
val_f1score: 0.866841197013855
val_cross_entropy: 0.3674609065055847
epoch: 8.0
train_accuracy_step: 0.83984375
train_matthewscorrcoef_step: 0.7790813446044922
train_f1score_step: 0.837363600730896
train_cross_entropy_step: 0.39843088388442993
train_accuracy_epoch: 0.8538406491279602
train_matthewscorrcoef_epoch: 0.8031334280967712
train_f1score_epoch: 0.8521654605865479
train_cross_entropy_epoch: 0.4116102457046509
test_accuracy: 0.8578397035598755
test_matthewscorrcoef: 0.8091378808021545
test_f1score: 0.8566917181015015
test_cross_entropy: 0.3963385224342346
date_run: Apr-22-2022_t-19
huggingface_tag: microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext
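
As a quick sanity check on the numbers above, the validation and test metrics can be compared directly. The sketch below copies the reported values into a dict and prints the difference for each metric; the helper name val_test_gap is hypothetical. Validation accuracy exceeds test accuracy by roughly one percentage point.

```python
# Values copied from the training_metrics list above.
metrics = {
    "val_accuracy": 0.8678670525550842,
    "val_matthewscorrcoef": 0.8222037553787231,
    "val_f1score": 0.866841197013855,
    "val_cross_entropy": 0.3674609065055847,
    "test_accuracy": 0.8578397035598755,
    "test_matthewscorrcoef": 0.8091378808021545,
    "test_f1score": 0.8566917181015015,
    "test_cross_entropy": 0.3963385224342346,
}


def val_test_gap(metrics, name):
    """Return the validation value minus the test value for one metric."""
    return metrics[f"val_{name}"] - metrics[f"test_{name}"]


for name in ("accuracy", "matthewscorrcoef", "f1score", "cross_entropy"):
    print(f"{name}: val - test = {val_test_gap(metrics, name):+.4f}")
```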