Commit 92cbe2d (verified) by yonigo
1 parent: c415812

Update README.md

  1. README.md +15 -59
README.md CHANGED
@@ -11,6 +11,12 @@ metrics:
  model-index:
  - name: deberta-v3-base-pii-en
    results: []
+ pipeline_tag: token-classification
+ widget:
+ - text: My name is Yoni Go and I live in Israel. My phone number is 054-1234567
+ inference:
+   parameters:
+     aggregation_strategy: first
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -18,68 +24,18 @@ should probably proofread and complete it, then remove this comment. -->

  # deberta-v3-base-pii-en

- This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0767
- - Bod F1: 0.9705
- - Building F1: 0.9869
- - Cardissuer F1: 1.0
- - City F1: 0.9781
- - Country F1: 0.9773
- - Date F1: 0.9374
- - Driverlicense F1: 0.9645
- - Email F1: 0.9850
- - Geocoord F1: 0.9769
- - Givenname1 F1: 0.8810
- - Givenname2 F1: 0.7996
- - Idcard F1: 0.9443
- - Ip F1: 0.9873
- - Lastname1 F1: 0.8433
- - Lastname2 F1: 0.7641
- - Lastname3 F1: 0.7696
- - Pass F1: 0.9603
- - Passport F1: 0.9619
- - Postcode F1: 0.9820
- - Secaddress F1: 0.9791
- - Sex F1: 0.9782
- - Socialnumber F1: 0.9615
- - State F1: 0.9878
- - Street F1: 0.9815
- - Tel F1: 0.9767
- - Time F1: 0.9762
- - Title F1: 0.9668
- - Username F1: 0.9606
- - Precision: 0.9504
- - Recall: 0.9625
- - F1: 0.9564
- - Accuracy: 0.9904
+ This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on English samples from [ai4privacy/pii-masking-300k](https://huggingface.co/datasets/ai4privacy/pii-masking-300k).

- ## Model description
+ Usage:
+ ```python
+ from transformers import pipeline

- More information needed
+ pipe = pipeline("token-classification", model="yonigo/deberta-v3-base-pii-en", aggregation_strategy="first")
+ pipe("My name is Yoni Go and I live in Israel. My phone number is 054-1234567")
+ ```

- ## Intended uses & limitations
+ training code [git](https://github.com/yonigottesman/pii-model)

- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 32
- - eval_batch_size: 64
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.2
- - lr_scheduler_warmup_steps: 3000
- - training_steps: 30000

  ### Training results

@@ -122,4 +78,4 @@ The following hyperparameters were used during training:
  - Transformers 4.41.2
  - Pytorch 2.3.1+cu121
  - Datasets 2.20.0
- - Tokenizers 0.19.1
+ - Tokenizers 0.19.1
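
The usage snippet added to the README stops at calling the pipeline. As a follow-up, here is a minimal sketch of how its output might be turned into redacted text. It assumes the pipeline returns the usual aggregated format of the `transformers` token-classification pipeline (dicts with `entity_group`, `start`, and `end` character offsets). The helper names `redact` and `redact_with_model` are hypothetical and not part of the repository, and the label strings shown in the usage note are illustrative only, not confirmed model labels.

```python
from typing import Iterable, Mapping


def redact(text: str, entities: Iterable[Mapping]) -> str:
    """Replace each detected span with a [LABEL] placeholder.

    `entities` is assumed to look like aggregated pipeline output:
    dicts carrying "entity_group", "start" and "end" character offsets.
    """
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        label = f"[{ent['entity_group']}]"
        text = text[: ent["start"]] + label + text[ent["end"]:]
    return text


def redact_with_model(text: str, model: str = "yonigo/deberta-v3-base-pii-en") -> str:
    """Run the model named in the README and redact what it finds.

    Calling this downloads the model weights; `redact` alone does not.
    """
    from transformers import pipeline  # imported lazily, only needed here

    pipe = pipeline("token-classification", model=model, aggregation_strategy="first")
    return redact(text, pipe(text))
```

With hand-written entities such as `{"entity_group": "COUNTRY", "start": 33, "end": 39}` for the sentence `"My name is Yoni Go and I live in Israel."`, `redact` returns `"My name is Yoni Go and I live in [COUNTRY]."`. Replacing from the end of the string backwards avoids recomputing offsets after each substitution.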