amichelini committed on
Commit 77555a4
1 Parent(s): 06d99f8

feat: updated readme

Files changed (1)
  1. README.md +148 -3
README.md CHANGED
@@ -1,3 +1,148 @@
---
license: apache-2.0
tags:
- sentiment-analysis
- text-classification
- zero-shot-distillation
- distillation
- zero-shot-classification
- deberta-v3
model-index:
- name: distilbert-base-multilingual-cased-sentiments-student
  results: []
datasets:
- tyqiangz/multilingual-sentiments
language:
- en
- ar
- de
- es
- fr
- ja
- zh
- id
- hi
- it
- ms
- pt
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# distilbert-base-multilingual-cased-sentiments-student

> **Note**
>
> This is a fork of the `distilbert-base-multilingual-cased-sentiments-student` model. The original model card can be found [here](https://huggingface.co/lxyuan/distilbert-base-multilingual-cased-sentiments-student).
> This fork is simply a conversion of the model to the ONNX format so it can be used in JavaScript/TypeScript applications.

This model is distilled from the zero-shot classification pipeline on the Multilingual Sentiment
dataset using this [script](https://github.com/huggingface/transformers/tree/main/examples/research_projects/zero-shot-distillation).

In reality the multilingual-sentiments dataset is of course annotated, but for the sake of
this zero-shot distillation example we pretend it is unlabeled and ignore the annotations.
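
Distillation here means the student is trained to match the teacher's predicted label distribution rather than gold labels. A toy sketch of the soft-label cross-entropy objective (the probabilities below are illustrative, not taken from the actual training run):

```python
import math

# Toy soft-label cross-entropy: the student is penalized for diverging
# from the teacher's predicted distribution over the three sentiment
# classes. Numbers are illustrative only.
teacher_probs = [0.90, 0.07, 0.03]   # teacher's zero-shot distribution
student_probs = [0.80, 0.15, 0.05]   # student's current prediction

loss = -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))
print(round(loss, 4))
```

The closer the student's distribution gets to the teacher's, the smaller this loss becomes.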


- Teacher model: MoritzLaurer/mDeBERTa-v3-base-mnli-xnli
- Teacher hypothesis template: "The sentiment of this text is {}."
- Student model: distilbert-base-multilingual-cased

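During distillation, the teacher scores each training text against one NLI hypothesis per class name. A minimal sketch of how the hypothesis template expands (assuming the three class names seen in the inference examples below):

```python
# Sketch of the teacher's hypothesis expansion. The class names are an
# assumption based on the labels this model predicts.
template = "The sentiment of this text is {}."
class_names = ["positive", "neutral", "negative"]

# One NLI hypothesis per candidate label; the teacher scores entailment
# of each hypothesis against the input text.
hypotheses = [template.format(name) for name in class_names]
print(hypotheses)
```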
## Inference example

```python
from transformers import pipeline

distilled_student_sentiment_classifier = pipeline(
    model="lxyuan/distilbert-base-multilingual-cased-sentiments-student",
    return_all_scores=True
)

# english
distilled_student_sentiment_classifier("I love this movie and i would watch it again and again!")
>> [[{'label': 'positive', 'score': 0.9731044769287109},
  {'label': 'neutral', 'score': 0.016910076141357422},
  {'label': 'negative', 'score': 0.009985478594899178}]]

# malay
distilled_student_sentiment_classifier("Saya suka filem ini dan saya akan menontonnya lagi dan lagi!")
>> [[{'label': 'positive', 'score': 0.9760093688964844},
  {'label': 'neutral', 'score': 0.01804516464471817},
  {'label': 'negative', 'score': 0.005945465061813593}]]

# japanese
distilled_student_sentiment_classifier("私はこの映画が大好きで、何度も見ます！")
>> [[{'label': 'positive', 'score': 0.9342429041862488},
  {'label': 'neutral', 'score': 0.040193185210227966},
  {'label': 'negative', 'score': 0.025563929229974747}]]
```
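
Since `return_all_scores=True` returns one list of label/score dicts per input, reducing the output to a single top label is a one-liner. A small sketch using the scores from the English example above:

```python
# Reduce the pipeline's nested output (one list of label/score dicts per
# input) to the single best label. Scores copied from the English example.
result = [[{'label': 'positive', 'score': 0.9731044769287109},
           {'label': 'neutral', 'score': 0.016910076141357422},
           {'label': 'negative', 'score': 0.009985478594899178}]]

top = max(result[0], key=lambda d: d["score"])
print(top["label"])  # positive
```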

## Training procedure

Notebook link: [here](https://github.com/LxYuan0420/nlp/blob/main/notebooks/Distilling_Zero_Shot_multilingual_distilbert_sentiments_student.ipynb)

### Training hyperparameters

Results can be reproduced using the following commands:

```bash
python transformers/examples/research_projects/zero-shot-distillation/distill_classifier.py \
--data_file ./multilingual-sentiments/train_unlabeled.txt \
--class_names_file ./multilingual-sentiments/class_names.txt \
--hypothesis_template "The sentiment of this text is {}." \
--teacher_name_or_path MoritzLaurer/mDeBERTa-v3-base-mnli-xnli \
--teacher_batch_size 32 \
--student_name_or_path distilbert-base-multilingual-cased \
--output_dir ./distilbert-base-multilingual-cased-sentiments-student \
--per_device_train_batch_size 16 \
--fp16
```
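
The `--class_names_file` passed above is assumed to contain one label per line, matching the labels seen in the inference examples:

```text
positive
neutral
negative
```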

If you are training this model on Colab, make the following code changes to `distill_classifier.py` to avoid out-of-memory errors:
```python
###### modify L78 to disable fast tokenizer
default=False,

###### update dataset map part at L313
dataset = dataset.map(tokenizer, input_columns="text", fn_kwargs={"padding": "max_length", "truncation": True, "max_length": 512})

###### add following lines to L213
del model
print(f"Manually deleted Teacher model, free some memory for student model.")

###### add following lines to L337
trainer.push_to_hub()
tokenizer.push_to_hub("distilbert-base-multilingual-cased-sentiments-student")
```
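
On CUDA, `del model` alone may not return memory to the allocator. A common additional step (an assumption, not part of the original notebook) is to force garbage collection and empty the CUDA cache:

```python
import gc

# Hypothetical helper to run after `del model`: collect garbage and, if
# torch with CUDA is available, release cached CUDA blocks back to the
# driver. Safe to call on CPU-only setups.
def free_teacher_memory():
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass  # torch not installed; nothing extra to free

free_teacher_memory()
```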

### Training log
```bash

Training completed. Do not forget to share your model on huggingface.co/models =)

{'train_runtime': 2009.8864, 'train_samples_per_second': 73.0, 'train_steps_per_second': 4.563, 'train_loss': 0.6473459283913797, 'epoch': 1.0}
100%|███████████████████████████████████████| 9171/9171 [33:29<00:00, 4.56it/s]
[INFO|trainer.py:762] 2023-05-06 10:56:18,555 >> The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3129] 2023-05-06 10:56:18,557 >> ***** Running Evaluation *****
[INFO|trainer.py:3131] 2023-05-06 10:56:18,557 >> Num examples = 146721
[INFO|trainer.py:3134] 2023-05-06 10:56:18,557 >> Batch size = 128
100%|███████████████████████████████████████| 1147/1147 [08:59<00:00, 2.13it/s]
05/06/2023 11:05:18 - INFO - __main__ - Agreement of student and teacher predictions: 88.29%
[INFO|trainer.py:2868] 2023-05-06 11:05:18,251 >> Saving model checkpoint to ./distilbert-base-multilingual-cased-sentiments-student
[INFO|configuration_utils.py:457] 2023-05-06 11:05:18,251 >> Configuration saved in ./distilbert-base-multilingual-cased-sentiments-student/config.json
[INFO|modeling_utils.py:1847] 2023-05-06 11:05:18,905 >> Model weights saved in ./distilbert-base-multilingual-cased-sentiments-student/pytorch_model.bin
[INFO|tokenization_utils_base.py:2171] 2023-05-06 11:05:18,905 >> tokenizer config file saved in ./distilbert-base-multilingual-cased-sentiments-student/tokenizer_config.json
[INFO|tokenization_utils_base.py:2178] 2023-05-06 11:05:18,905 >> Special tokens file saved in ./distilbert-base-multilingual-cased-sentiments-student/special_tokens_map.json

```
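
The 88.29% figure in the log is the fraction of unlabeled examples on which the student and teacher predict the same argmax label. A toy sketch of that metric (the predictions below are made up for illustration):

```python
# Student/teacher agreement: share of examples where both models pick the
# same argmax class. Predictions are invented for this sketch.
teacher_preds = [0, 2, 1, 0, 0, 1, 2, 0]
student_preds = [0, 2, 1, 0, 1, 1, 2, 2]

agreement = sum(t == s for t, s in zip(teacher_preds, student_preds)) / len(teacher_preds)
print(f"Agreement of student and teacher predictions: {agreement:.2%}")
```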

### Framework versions

- Transformers 4.28.1
- Pytorch 2.0.0+cu118
- Datasets 2.11.0
- Tokenizers 0.13.3