eliasalbouzidi committed
Commit: a3a4f0a
Parent(s): c6ac130
Update README.md

README.md CHANGED
@@ -12,7 +12,7 @@ widget:
     example_title: Nsfw
   - text: A mass shooting
     example_title: Nsfw
-base_model: distilbert-base-uncased
+base_model: distilbert-base-uncased
 license: apache-2.0
 language:
 - en
@@ -28,6 +28,24 @@ tags:
 - safety
 - innapropriate
 - distilbert
+datasets:
+- eliasalbouzidi/NSFW-Safe-Dataset
+model-index:
+- name: NSFW-Safe-Dataset
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: NSFW-Safe-Dataset
+      type: .
+    metrics:
+    - name: F1
+      type: f1
+      value: 0.974
+    - name: Accuracy
+      type: accuracy
+      value: 0.98
 ---
 
 # Model Card
@@ -53,6 +71,9 @@ The model can be used directly to classify text into one of the two classes. It
 - **Language(s) (NLP):** English
 - **License:** apache-2.0
 
+### Uses
+
+The model can be integrated into larger systems for content moderation or filtering.
 ### Training Data
 The training data for finetuning the text classification model consists of a large corpus of text labeled with one of the two classes: "safe" and "nsfw". The dataset contains a total of 190,000 examples, which are distributed as follows:
 
@@ -62,6 +83,7 @@ The training data for finetuning the text classification model consists of a lar
 
 It was assembled by scraping data from the web and utilizing existing open-source datasets. A significant portion of the dataset consists of descriptions for images and scenes. The primary objective was to prevent diffusers from generating NSFW content but it can be used for other moderation purposes.
 
+You can access the dataset : https://huggingface.co/datasets/eliasalbouzidi/NSFW-Safe-Dataset
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -120,11 +142,6 @@ We selected the checkpoint with the highest F-beta1.6 score.
 - Tokenizers 0.19.1
 
 
-## Uses
-
-The model can be integrated into larger systems for content moderation or filtering.
-
-
 ### Out-of-Scope Use
 
 It should not be used for any illegal activities.