---
language:
- en
license: apache-2.0
tags:
- generated_from_keras_callback
datasets:
- Babelscape/multinerd
metrics:
- seqeval
base_model: distilbert-base-uncased
pipeline_tag: token-classification
widget:
- text: After months of meticulous review and analysis, I am proud to present a study
    that explores the deep connections between Epstein-Barr virus (EBV), Long COVID
    and Myalgic Encephalomyelitis.
  example_title: Example 1
- text: The boy is, of course, Cupid. The image of a cupid riding a lion was a common
    theme in classical and Renaissance art, representing the Virgilian maxim Amor
    vincit omnia love conquers all.
  example_title: Example 2
- text: Billionaire Charlie Munger, Warren Buffet's right hand man, dies at 99.
  example_title: Example 3
model-index:
- name: i-be-snek/distilbert-base-uncased-finetuned-ner-exp_A
  results:
  - task:
      type: token-classification
      name: ner
    dataset:
      name: Babelscape/multinerd
      type: Babelscape/multinerd
      split: test
    metrics:
    - type: seqeval
      value: 0.9053582270795385
      name: precision
    - type: seqeval
      value: 0.9303178007408852
      name: recall
    - type: seqeval
      value: 0.9176683270188665
      name: f1
    - type: seqeval
      value: 0.9863554498955407
      name: accuracy
---
# i-be-snek/distilbert-base-uncased-finetuned-ner-exp_A
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased), trained on the English subset of the [Babelscape/multinerd](https://huggingface.co/datasets/Babelscape/multinerd) dataset with all of its named entity tags.
It achieves the following results on the validation set:
- Train Loss: 0.0163
- Validation Loss: 0.1024
- Train Precision: 0.8763
- Train Recall: 0.8862
- Train F1: 0.8812
- Train Accuracy: 0.9750
- Epoch: 2
## Model description
[distilbert-base-uncased-finetuned-ner-exp_A](https://huggingface.co/i-be-snek/distilbert-base-uncased-finetuned-ner-exp_A) is a Named Entity Recognition model fine-tuned from [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased).
This model is uncased, so it makes no distinction between "sarah" and "Sarah".
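
A minimal usage sketch, assuming the `transformers` library with TensorFlow installed. The helper names `build_ner_pipeline` and `format_entities` are hypothetical and only serve this illustration. Loading with `framework="tf"` is an assumption based on the model having been trained with Keras.

```python
from typing import Dict, List

MODEL_ID = "i-be-snek/distilbert-base-uncased-finetuned-ner-exp_A"


def build_ner_pipeline():
    """Create a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so format_entities() works without transformers installed.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=MODEL_ID,
        framework="tf",  # assumption: load the TensorFlow weights
        aggregation_strategy="simple",  # merge word pieces into whole entities
    )


def format_entities(entities: List[Dict]) -> List[str]:
    """Render aggregated pipeline output as 'TAG: text (score)' strings."""
    return [
        f"{ent['entity_group']}: {ent['word']} ({float(ent['score']):.2f})"
        for ent in entities
    ]
```

To try it, call `ner = build_ner_pipeline()` and then `format_entities(ner("Billionaire Charlie Munger dies at 99."))`. Because the model is uncased, the returned entity text is expected to be lowercase.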
## Training and evaluation data
This model has been evaluated on the English subset of the test set of [Babelscape/multinerd](https://huggingface.co/datasets/Babelscape/multinerd).
### Evaluation results
| metric | value |
|:----------|---------:|
| precision | 0.905358 |
| recall | 0.930318 |
| f1 | 0.917668 |
| accuracy | 0.986355 |

Per-tag results:

| metric/tag | ANIM | BIO | CEL | DIS | EVE | FOOD | INST | LOC | MEDIA | MYTH | ORG | PER | PLANT | TIME | VEHI |
|:----------|------------:|----------:|----------:|------------:|-----------:|------------:|----------:|-------------:|-----------:|----------:|------------:|-------------:|------------:|-----------:|----------:|
| precision | 0.667262 | 0.666667 | 0.508197 | 0.662324 | 0.896277 | 0.637809 | 0.642857 | 0.964137 | 0.931915 | 0.638889 | 0.941176 | 0.99033 | 0.558043 | 0.756579 | 0.735294 |
| recall | 0.698878 | 0.75 | 0.756098 | 0.803689 | 0.957386 | 0.637809 | 0.75 | 0.963656 | 0.956332 | 0.71875 | 0.962224 | 0.992023 | 0.752796 | 0.795848 | 0.78125 |
| f1 | 0.682704 | 0.705882 | 0.607843 | 0.72619 | 0.925824 | 0.637809 | 0.692308 | 0.963897 | 0.943966 | 0.676471 | 0.951584 | 0.991176 | 0.640952 | 0.775717 | 0.757576 |
| number | 3208 | 16 | 82 | 1518 | 704 | 1132 | 24 | 24048 | 916 | 64 | 6618 | 10530 | 1788 | 578 | 64 |
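
The overall precision, recall and F1 appear to be micro-averages over all tags. The sketch below checks this by rebuilding per-tag counts from the table above. It assumes that `number` is the count of gold entities for each tag and that rounding recovers the original integer counts. The function name `micro_average` is hypothetical.

```python
# (precision, recall, number) per tag, copied from the per-tag table above.
PER_TAG = {
    "ANIM": (0.667262, 0.698878, 3208),
    "BIO": (0.666667, 0.75, 16),
    "CEL": (0.508197, 0.756098, 82),
    "DIS": (0.662324, 0.803689, 1518),
    "EVE": (0.896277, 0.957386, 704),
    "FOOD": (0.637809, 0.637809, 1132),
    "INST": (0.642857, 0.75, 24),
    "LOC": (0.964137, 0.963656, 24048),
    "MEDIA": (0.931915, 0.956332, 916),
    "MYTH": (0.638889, 0.71875, 64),
    "ORG": (0.941176, 0.962224, 6618),
    "PER": (0.99033, 0.992023, 10530),
    "PLANT": (0.558043, 0.752796, 1788),
    "TIME": (0.756579, 0.795848, 578),
    "VEHI": (0.735294, 0.78125, 64),
}


def micro_average(per_tag):
    """Pool reconstructed counts over all tags and return (precision, recall, f1)."""
    true_pos = predicted = gold = 0
    for precision, recall, number in per_tag.values():
        tp = round(recall * number)        # correctly predicted entities
        true_pos += tp
        predicted += round(tp / precision)  # all entities predicted with this tag
        gold += number                      # all gold entities with this tag
    p = true_pos / predicted
    r = true_pos / gold
    return p, r, 2 * p * r / (p + r)


precision, recall, f1 = micro_average(PER_TAG)
print(f"precision={precision:.6f} recall={recall:.6f} f1={f1:.6f}")
```

Because the pooled counts are dominated by `LOC`, `PER` and `ORG`, the overall scores sit well above the scores of rare tags such as `BIO` or `CEL`.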
## Training procedure
All scripts for training can be found in this [GitHub repository](https://github.com/i-be-snek/rise-assignment-ner-finetune).
Training was stopped early based on `val_loss`.
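
Early stopping on `val_loss` can be sketched in plain Python as follows. The validation losses are those reported in the training results table of this card. The `patience` value is not stated in this card, so `patience=2` is only an assumption that happens to be consistent with three reported epochs. The function name `early_stop_epoch` is hypothetical.

```python
def early_stop_epoch(val_losses, patience):
    """Return the epoch at which training would stop, or None if it never stops.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0  # new best loss: reset the counter
        else:
            wait += 1             # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# Validation loss per epoch, from the training results table.
VAL_LOSSES = [0.0710, 0.0851, 0.1024]

print(early_stop_epoch(VAL_LOSSES, patience=2))
```

With these losses the best value is reached at epoch 0, so a patience of 2 would end training after epoch 2.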
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer:
```python
{
"name": "AdamWeightDecay",
"learning_rate": 2e-05,
"decay": 0.0,
"beta_1": 0.9,
"beta_2": 0.999,
"epsilon": 1e-07,
"amsgrad": False,
"weight_decay_rate": 0.0,
}
```
- training_precision: `float32`
### Training results
| Train Loss | Validation Loss | Train Precision | Train Recall | Train F1 | Train Accuracy | Epoch |
|:----------:|:---------------:|:---------------:|:------------:|:--------:|:--------------:|:-----:|
| 0.0709 | 0.0710 | 0.8563 | 0.8875 | 0.8716 | 0.9735 | 0 |
| 0.0295 | 0.0851 | 0.8743 | 0.8835 | 0.8789 | 0.9748 | 1 |
| 0.0163 | 0.1024 | 0.8763 | 0.8862 | 0.8812 | 0.9750 | 2 |
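
As a consistency check, the F1 column can be recomputed as the harmonic mean of precision and recall. This is a small sketch using the rounded values from the table above, so agreement is only expected to about four decimal places. The names `f1_score` and `ROWS` are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (epoch, precision, recall, reported F1) from the table above.
ROWS = [
    (0, 0.8563, 0.8875, 0.8716),
    (1, 0.8743, 0.8835, 0.8789),
    (2, 0.8763, 0.8862, 0.8812),
]

for epoch, precision, recall, reported in ROWS:
    computed = f1_score(precision, recall)
    print(f"epoch {epoch}: computed={computed:.4f} reported={reported:.4f}")
```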

Epoch 0

| Named Entity | precision | recall | f1 |
|:----------:|:---------:|:---------:|:------:|
| ANIM | 0.699150 | 0.620124 | 0.657270 |
| BIO | 0.480000 | 0.782609 | 0.595041 |
| CEL | 0.815385 | 0.876033 | 0.844622 |
| DIS | 0.628939 | 0.806709 | 0.706818 |
| EVE | 0.898876 | 0.924855 | 0.911681 |
| FOOD | 0.624774 | 0.602266 | 0.613314 |
| INST | 0.467391 | 0.741379 | 0.573333 |
| LOC | 0.967354 | 0.969634 | 0.968493 |
| MEDIA | 0.911227 | 0.939856 | 0.925320 |
| MYTH | 0.941860 | 0.771429 | 0.848168 |
| ORG | 0.924471 | 0.937629 | 0.931003 |
| PER | 0.988699 | 0.990918 | 0.989807 |
| PLANT | 0.622521 | 0.781333 | 0.692944 |
| TIME | 0.743902 | 0.738499 | 0.741191 |
| VEHI | 0.785714 | 0.791367 | 0.788530 |

Epoch 1

| Named Entity | precision | recall | f1 |
|:----------:|:---------:|:---------:|:--------:|
| ANIM | 0.701040 | 0.747340 | 0.723450 |
| BIO | 0.422222 | 0.826087 | 0.558824 |
| CEL | 0.729167 | 0.867769 | 0.792453 |
| DIS | 0.731099 | 0.749794 | 0.740328 |
| EVE | 0.864865 | 0.924855 | 0.893855 |
| FOOD | 0.652865 | 0.572632 | 0.610122 |
| INST | 0.871795 | 0.586207 | 0.701031 |
| LOC | 0.968255 | 0.966143 | 0.967198 |
| MEDIA | 0.946346 | 0.918312 | 0.932118 |
| MYTH | 0.914894 | 0.819048 | 0.864322 |
| ORG | 0.906064 | 0.943582 | 0.924442 |
| PER | 0.990389 | 0.988367 | 0.989377 |
| PLANT | 0.625889 | 0.743556 | 0.679667 |
| TIME | 0.755981 | 0.765133 | 0.760529 |
| VEHI | 0.737500 | 0.848921 | 0.789298 |

Epoch 2

| Named Entity | precision | recall | f1 |
|:----------:|:---------:|:---------:|:--------:|
| ANIM | 0.730443 | 0.687057 | 0.708086 |
| BIO | 0.330882 | 0.978261 | 0.494505 |
| CEL | 0.798561 | 0.917355 | 0.853846 |
| DIS | 0.738108 | 0.750894 | 0.744446 |
| EVE | 0.904899 | 0.907514 | 0.906205 |
| FOOD | 0.628664 | 0.623184 | 0.625912 |
| INST | 0.533333 | 0.551724 | 0.542373 |
| LOC | 0.967915 | 0.973997 | 0.970946 |
| MEDIA | 0.949627 | 0.913824 | 0.931382 |
| MYTH | 0.910000 | 0.866667 | 0.887805 |
| ORG | 0.924920 | 0.934136 | 0.929505 |
| PER | 0.989506 | 0.991020 | 0.990263 |
| PLANT | 0.637648 | 0.742222 | 0.685972 |
| TIME | 0.766355 | 0.794189 | 0.780024 |
| VEHI | 0.818182 | 0.647482 | 0.722892 |
### Framework versions
- Transformers 4.35.2
- TensorFlow 2.14.0
- Datasets 2.15.0
- Tokenizers 0.15.0