	---
	license: apache-2.0
	datasets:
	- SungJoo/KBMC
	language:
	- ko
	library_name: transformers
	tags:
	- medical
	- NER
	---


	# Model Card for medical-ner-koelectra

	## Model Summary

	This model is a fine-tuned version of [monologg/koelectra-base-v3-discriminator](https://huggingface.co/monologg/koelectra-base-v3-discriminator) for Korean medical named entity recognition (NER).

	We fine-tuned it on two datasets: the [Korean Bio-Medical Corpus (KBMC)](https://huggingface.co/datasets/SungJoo/KBMC) and the [Naver X Changwon Univ NER dataset](https://ko-nlp.github.io/Korpora/ko-docs/corpuslist/naver_changwon_ner.html).

	## Model Details

	### Model Description

	- Developed by: Sungjoo Byun (Grace Byun)
	- Language(s) (NLP): Korean
	- License: Apache 2.0
	- Finetuned from model: monologg/koelectra-base-v3-discriminator
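
A minimal usage sketch follows. It assumes the checkpoint carries a token-classification head and that its labels use a B-/I- prefix scheme, so that `aggregation_strategy="simple"` can merge word pieces into entities; neither detail is confirmed by this card. `load_ner_pipeline` and `entities_by_label` are hypothetical helper names introduced for illustration.

```python
from typing import Dict, List

MODEL_ID = "SungJoo/medical-ner-koelectra"


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads weights on first call).

    Assumption: the checkpoint exposes a token-classification head.
    """
    # Imported lazily so the pure-Python helper below works without transformers.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def entities_by_label(predictions: List[dict]) -> Dict[str, List[str]]:
    """Group aggregated pipeline predictions by their entity label."""
    grouped: Dict[str, List[str]] = {}
    for pred in predictions:
        grouped.setdefault(pred["entity_group"], []).append(pred["word"])
    return grouped
```

With network access, `entities_by_label(load_ner_pipeline()("환자는 당뇨병 진단을 받았다."))` would return the recognized spans keyed by class; the actual labels depend on the model's configuration.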


	## Training Data

	The model was trained on two datasets: the [Naver X Changwon Univ NER dataset](https://ko-nlp.github.io/Korpora/ko-docs/corpuslist/naver_changwon_ner.html) and the [Korean Bio-Medical Corpus (KBMC)](https://huggingface.co/datasets/SungJoo/KBMC).

	# Model Performance

	## Overall Metrics

	- F1 Score: 0.8886
	- Loss: 0.2949
	- Precision: 0.8844
	- Recall: 0.8928
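
As a quick consistency check, the reported F1 score can be recomputed from the reported precision and recall, assuming it is the usual harmonic mean of the two. The function name below is illustrative, and the inputs are the rounded values listed above.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the Overall Metrics list.
overall_f1 = f1_from_precision_recall(0.8844, 0.8928)
print(f"{overall_f1:.4f}")
```

The result agrees with the reported 0.8886 to four decimal places.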

	## Class-wise Performance

	| Class | Precision | Recall | F1-Score | Support |
	|-------------|-----------|--------|----------|---------|
	| AFW | 0.6676 | 0.6326 | 0.6496 | 362 |
	| ANM | 0.7476 | 0.7800 | 0.7635 | 600 |
	| Body | 0.9731 | 0.9813 | 0.9772 | 1068 |
	| CVL | 0.8492 | 0.8579 | 0.8536 | 4977 |
	| DAT | 0.9078 | 0.9286 | 0.9181 | 2130 |
	| Disease | 0.9738 | 0.9872 | 0.9805 | 2109 |
	| EVT | 0.7332 | 0.7446 | 0.7389 | 1026 |
	| FLD | 0.6138 | 0.6170 | 0.6154 | 188 |
	| LOC | 0.8721 | 0.8691 | 0.8706 | 1734 |
	| MAT | 0.5385 | 0.5000 | 0.5185 | 14 |
	| NUM | 0.9227 | 0.9305 | 0.9266 | 4660 |
	| ORG | 0.8917 | 0.8866 | 0.8892 | 3307 |
	| PER | 0.8918 | 0.9049 | 0.8983 | 3626 |
	| PLT | 0.2941 | 0.2174 | 0.2500 | 23 |
	| TIM | 0.8644 | 0.9173 | 0.8901 | 278 |
	| Treatment | 0.9468 | 0.9852 | 0.9656 | 271 |

	## Averages

	| Metric | Micro Avg | Macro Avg | Weighted Avg |
	|----------------|-----------|-----------|--------------|
	| Precision | 0.8844 | 0.7930 | 0.8841 |
	| Recall | 0.8928 | 0.7963 | 0.8928 |
	| F1-Score | 0.8886 | 0.7941 | 0.8884 |
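
The macro and weighted F1 averages can be reproduced from the class-wise table. The sketch below assumes the conventional definitions: macro is the unweighted mean over classes, and weighted is the support-weighted mean. The card does not say which evaluation tool produced the numbers, and because the inputs are rounded to four decimals, the results are approximate.

```python
# (class, F1-score, support) rows copied from the class-wise table.
ROWS = [
    ("AFW", 0.6496, 362), ("ANM", 0.7635, 600), ("Body", 0.9772, 1068),
    ("CVL", 0.8536, 4977), ("DAT", 0.9181, 2130), ("Disease", 0.9805, 2109),
    ("EVT", 0.7389, 1026), ("FLD", 0.6154, 188), ("LOC", 0.8706, 1734),
    ("MAT", 0.5185, 14), ("NUM", 0.9266, 4660), ("ORG", 0.8892, 3307),
    ("PER", 0.8983, 3626), ("PLT", 0.2500, 23), ("TIM", 0.8901, 278),
    ("Treatment", 0.9656, 271),
]


def macro_average(rows):
    """Unweighted mean of per-class scores: every class counts equally."""
    return sum(score for _, score, _ in rows) / len(rows)


def weighted_average(rows):
    """Mean of per-class scores weighted by each class's support."""
    total_support = sum(support for _, _, support in rows)
    return sum(score * support for _, score, support in rows) / total_support


macro_f1 = macro_average(ROWS)
weighted_f1 = weighted_average(ROWS)
print(f"macro F1 = {macro_f1:.4f}, weighted F1 = {weighted_f1:.4f}")
```

The gap between the two averages comes from low-support classes such as PLT (23 examples) and MAT (14 examples): they pull the macro average down but barely affect the weighted one.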


	## Citations

	Please cite our KBMC paper:

	```bibtex
	@misc{byun2024korean,
	title={Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition},
	author={Sungjoo Byun and Jiseung Hong and Sumin Park and Dongjun Jang and Jean Seo and Minseok Kim and Chaeyoung Oh and Hyopil Shin},
	year={2024},
	eprint={2403.16158},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```

	## Model Card Contact

	For any questions or issues, please contact [email protected].