yonigo/deberta-v3-base-pii-en

Token Classification

Generated from Trainer


yonigo committed on Jul 8, 2024

Commit 92cbe2d · verified · 1 Parent(s): c415812

Update README.md

Files changed (1)

README.md +15 -59

README.md CHANGED

@@ -11,6 +11,12 @@ metrics:
 model-index:
 - name: deberta-v3-base-pii-en
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -18,68 +24,18 @@ should probably proofread and complete it, then remove this comment. -->
 # deberta-v3-base-pii-en
-This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.0767
-- Bod F1: 0.9705
-- Building F1: 0.9869
-- Cardissuer F1: 1.0
-- City F1: 0.9781
-- Country F1: 0.9773
-- Date F1: 0.9374
-- Driverlicense F1: 0.9645
-- Email F1: 0.9850
-- Geocoord F1: 0.9769
-- Givenname1 F1: 0.8810
-- Givenname2 F1: 0.7996
-- Idcard F1: 0.9443
-- Ip F1: 0.9873
-- Lastname1 F1: 0.8433
-- Lastname2 F1: 0.7641
-- Lastname3 F1: 0.7696
-- Pass F1: 0.9603
-- Passport F1: 0.9619
-- Postcode F1: 0.9820
-- Secaddress F1: 0.9791
-- Sex F1: 0.9782
-- Socialnumber F1: 0.9615
-- State F1: 0.9878
-- Street F1: 0.9815
-- Tel F1: 0.9767
-- Time F1: 0.9762
-- Title F1: 0.9668
-- Username F1: 0.9606
-- Precision: 0.9504
-- Recall: 0.9625
-- F1: 0.9564
-- Accuracy: 0.9904
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 32
-- eval_batch_size: 64
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
-- lr_scheduler_warmup_ratio: 0.2
-- lr_scheduler_warmup_steps: 3000
-- training_steps: 30000
 ### Training results
@@ -122,4 +78,4 @@ The following hyperparameters were used during training:
 - Transformers 4.41.2
 - Pytorch 2.3.1+cu121
 - Datasets 2.20.0
-- Tokenizers 0.19.1

 model-index:
 - name: deberta-v3-base-pii-en
   results: []
+pipeline_tag: token-classification
+widget:
+  - text: My name is Yoni Go and I live in Israel. My phone number is 054-1234567
+inference:
+  parameters:
+    aggregation_strategy: first
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # deberta-v3-base-pii-en
+This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on English samples from [ai4privacy/pii-masking-300k](https://huggingface.co/datasets/ai4privacy/pii-masking-300k).
+Usage:
+```python
+from transformers import pipeline
+pipe = pipeline("token-classification", model="yonigo/deberta-v3-base-pii-en", aggregation_strategy="first")
+pipe("My name is Yoni Go and I live in Israel. My phone number is 054-1234567")
+```
+training code [git](https://github.com/yonigottesman/pii-model)
 ### Training results
 - Transformers 4.41.2
 - Pytorch 2.3.1+cu121
 - Datasets 2.20.0
+- Tokenizers 0.19.1
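
The usage snippet added in this commit returns aggregated entity spans. As a minimal sketch of one way to consume that output, the helper below replaces each detected span with a placeholder. It assumes each result is a dict with `entity_group`, `start`, and `end` keys, which is the shape the `transformers` token-classification pipeline produces when an `aggregation_strategy` is set and a fast tokenizer supplies character offsets. The names `mask_pii` and `redact` are hypothetical and are not part of the model repository or the linked training code.

```python
from typing import Dict, List


def mask_pii(text: str, entities: List[Dict]) -> str:
    """Replace each detected span with a [LABEL] placeholder.

    `entities` is assumed to look like aggregated pipeline output:
    dicts carrying `entity_group`, `start`, and `end` (character offsets).
    """
    # Work right to left so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[: ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text


def redact(text: str) -> str:
    """Hypothetical wrapper: run the model from this card, then mask its spans."""
    # Imported lazily so mask_pii can be used without transformers installed.
    from transformers import pipeline

    pipe = pipeline(
        "token-classification",
        model="yonigo/deberta-v3-base-pii-en",
        aggregation_strategy="first",
    )
    return mask_pii(text, pipe(text))
```

Calling `redact("My name is Yoni Go and I live in Israel. My phone number is 054-1234567")` would download the model on first use. The exact label strings depend on the model's configuration, so the placeholders it produces are not shown here.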