---
pipeline_tag: "text-classification"

widget:
- text: "this is a lovely message"
  example_title: "Example 1"
  multi_class: false
- text: "you are an idiot and you and your family should go back to your country"
  example_title: "Example 2"
  multi_class: false

language:
- en
- nl
- fr
- pt
- it
- es
- de
- da
- pl
- af

datasets:
- jigsaw_toxicity_pred

metrics:
- f1
- accuracy
---

# citizenlab/distilbert-base-multilingual-cased-toxicity

This is a multilingual DistilBERT sequence classifier trained on the [Jigsaw Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge) dataset.

## How to use it

```python
from transformers import pipeline

model_path = "citizenlab/distilbert-base-multilingual-cased-toxicity"

# Load the fine-tuned checkpoint as a text-classification pipeline.
toxicity_classifier = pipeline("text-classification", model=model_path, tokenizer=model_path)

toxicity_classifier("this is a lovely message")
# [{'label': 'not_toxic', 'score': 0.9954179525375366}]

toxicity_classifier("you are an idiot and you and your family should go back to your country")
# [{'label': 'toxic', 'score': 0.9948776960372925}]
```
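
If you prefer to work below the `pipeline` abstraction, the same checkpoint can be loaded with the generic Auto classes. A minimal sketch, assuming the standard `transformers` and `torch` APIs; the label names come from the model's own config rather than anything hard-coded here:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_path = "citizenlab/distilbert-base-multilingual-cased-toxicity"

# Load the tokenizer and the classification head directly.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)

# Tokenize a batch of inputs and run a forward pass without gradients.
inputs = tokenizer(
    ["this is a lovely message"],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits

# Softmax over the classes, then map each predicted id back to its label name.
probs = logits.softmax(dim=-1)
for row in probs:
    label_id = int(row.argmax())
    print(model.config.id2label[label_id], float(row[label_id]))
```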

## Evaluation

### Metrics

```
Accuracy Score = 0.9425
F1 Score (Micro) = 0.9450549450549449
F1 Score (Macro) = 0.8491432341169309
```
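
For reference, scores in this form are typically computed with scikit-learn. A minimal sketch, assuming hypothetical `y_true`/`y_pred` arrays of binary labels in place of the actual held-out Jigsaw split; the printed numbers come from the toy arrays, not the results reported above:

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical gold labels and model predictions (1 = toxic, 0 = not toxic);
# the real evaluation would use the held-out Jigsaw test data.
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1]

print("Accuracy Score =", accuracy_score(y_true, y_pred))
print("F1 Score (Micro) =", f1_score(y_true, y_pred, average="micro"))
print("F1 Score (Macro) =", f1_score(y_true, y_pred, average="macro"))
```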