metadata
tags:
  - arxiv:2408.06930
  - medical
language:
  - nl
license: gpl-3.0
model-index:
  - name: Echocardiogram_aortic_stenosis_reduced
    results:
      - task:
          type: text-classification
        dataset:
          type: test
          name: internal test set
        metrics:
          - name: Macro f1
            type: f1
            value: 0.939
            verified: false
          - name: Macro precision
            type: precision
            value: 0.925
            verified: false
          - name: Macro recall
            type: recall
            value: 0.954
            verified: false
pipeline_tag: text-classification
metrics:
  - f1
  - precision
  - recall

Description

This model is a MedRoBERTa.nl model fine-tuned on Dutch echocardiogram reports sourced from electronic health records. The publication associated with the span classification task can be found at https://arxiv.org/abs/2408.06930. The config file for training the model can be found at https://github.com/umcu/echolabeler.

Minimum working example

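from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
le_pipe = pipeline(model="UMCU/Echocardiogram_aortic_stenosis_reduced")
document = "Lorem ipsum"  # replace with a Dutch echocardiogram report
results = le_pipe(document)  # list of {"label": ..., "score": ...} dicts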
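
For a single input string, the transformers text-classification pipeline returns a list holding one dictionary with a `label` and a `score` key. The sketch below shows one way to post-process that output. The helper name `summarize_prediction`, the 0.5 threshold, and the example scores are illustrative assumptions, and the exact label strings should be checked against the model's `config.json`.

```python
from typing import Dict, List


def summarize_prediction(results: List[Dict[str, object]], threshold: float = 0.5) -> str:
    """Return the predicted label of the first document.

    Falls back to "uncertain" when the score is below `threshold`
    (the threshold value is an assumption, not part of the model card).
    """
    if not results:
        raise ValueError("empty pipeline output")
    top = results[0]
    label, score = str(top["label"]), float(top["score"])
    if score < threshold:
        return "uncertain"
    return label


# Hypothetical output in the shape produced by le_pipe(document).
example = [{"label": "Not Normal", "score": 0.97}]
print(summarize_prediction(example))
```

With a loaded pipeline this would be called as `summarize_prediction(le_pipe(document))`.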

Label Scheme

| Component | Labels |
| --- | --- |
| reduced | No label, Normal, Not Normal |

For the reduced labels, Present means that at least one of the pathologies has a positive result.

The pathologies and their annotation codes are:

| Annotation | Pathology |
| --- | --- |
| pe | Pericardial Effusion |
| wma | Wall Motion Abnormality |
| lv_dil | Left Ventricle Dilation |
| rv_dil | Right Ventricle Dilation |
| lv_syst_func | Left Ventricle Systolic Dysfunction |
| rv_syst_func | Right Ventricle Systolic Dysfunction |
| lv_dias_func | Diastolic Dysfunction |
| aortic_valve_native_stenosis | Aortic Stenosis |
| mitral_valve_native_regurgitation | Mitral valve regurgitation |
| tricuspid_valve_native_regurgitation | Tricuspid regurgitation |
| aortic_valve_native_regurgitation | Aortic Regurgitation |

Note: the annotation lv_dias_func should have been named dias_func.
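
The annotation codes above can be carried in code as a simple lookup table. The sketch below is illustrative only: the names `PATHOLOGIES`, `ALIASES`, and `pathology_name` are hypothetical and are not taken from the echolabeler repository. Following the note on `lv_dias_func`, it also accepts `dias_func` as an alias, which is an assumption about how one might handle the naming slip.

```python
# Annotation codes and pathology names copied from the list in this card.
PATHOLOGIES = {
    "pe": "Pericardial Effusion",
    "wma": "Wall Motion Abnormality",
    "lv_dil": "Left Ventricle Dilation",
    "rv_dil": "Right Ventricle Dilation",
    "lv_syst_func": "Left Ventricle Systolic Dysfunction",
    "rv_syst_func": "Right Ventricle Systolic Dysfunction",
    "lv_dias_func": "Diastolic Dysfunction",
    "aortic_valve_native_stenosis": "Aortic Stenosis",
    "mitral_valve_native_regurgitation": "Mitral valve regurgitation",
    "tricuspid_valve_native_regurgitation": "Tricuspid regurgitation",
    "aortic_valve_native_regurgitation": "Aortic Regurgitation",
}

# Assumption: treat the intended name dias_func as an alias of lv_dias_func.
ALIASES = {"dias_func": "lv_dias_func"}


def pathology_name(code: str) -> str:
    """Translate an annotation code into its pathology name.

    Raises KeyError for codes that are not in the table.
    """
    key = code.strip().lower()
    key = ALIASES.get(key, key)
    return PATHOLOGIES[key]


print(pathology_name("aortic_valve_native_stenosis"))
```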

Intended use

The model is developed for document classification of Dutch clinical echocardiogram reports. Since it is a domain-specific model trained on medical data, it is only meant to be used on medical NLP tasks for Dutch echocardiogram reports.

Data

The model was trained on approximately 4,000 manually annotated echocardiogram reports from the University Medical Centre Utrecht. The training data was anonymized before starting the training procedure.

| Feature | Description |
| --- | --- |
| Name | Echocardiogram_SpanCategorizer_aortic_stenosis |
| Version | 1.0.0 |
| transformers | >=4.40.0 |
| Default Pipeline | pipeline, text-classification |
| Components | RobertaForSequenceClassification |
| License | cc-by-sa-4.0 |
| Author | Bram van Es |
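
The `transformers >=4.40.0` requirement listed above can be checked before loading the model. The following is a minimal sketch using only the standard library; the helper names `version_tuple` and `meets_minimum` are hypothetical, and the parser only compares the leading numeric components of a version string. For full PEP 440 handling, `packaging.version` is the more robust choice.

```python
from typing import Tuple

MIN_TRANSFORMERS = "4.40.0"  # minimum version stated in this card


def version_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric components, e.g. '4.41.0.dev0' -> (4, 41, 0)."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component (dev, rc, ...)
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Return True when `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


print(meets_minimum("4.41.2"))
print(meets_minimum("4.39.3"))
```

In practice the installed version would come from `transformers.__version__`, for example `meets_minimum(transformers.__version__)`.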

Contact

If you have problems with this model, please open an issue on our GitHub repository: https://github.com/umcu/echolabeler/issues

Usage

If you use the model in your work, please cite the following reference: https://doi.org/10.48550/arXiv.2408.06930

References

Paper: Bauke Arends, Melle Vessies, Dirk van Osch, Arco Teske, Pim van der Harst, René van Es, Bram van Es (2024). Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification. arXiv:2408.06930. https://arxiv.org/abs/2408.06930