lxyuan committed on
Commit
247f7bf
·
1 Parent(s): cc6981a

Update README.md

Files changed (1)
  1. README.md +227 -12
README.md CHANGED
@@ -32,21 +32,19 @@ model-index:
32
  language:
33
  - en
34
  pipeline_tag: text-classification
35
  ---
36
 
37
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
38
  should probably proofread and complete it, then remove this comment. -->
39
 
40
- # Motivation
41
 
42
  Fine-tuning on the Reuters-21578 multilabel dataset is a valuable exercise, especially as it's frequently used in take-home tests during interviews. The dataset's complexity is just right for testing multilabel classification skills within a limited timeframe, while its real-world relevance helps simulate practical challenges. Experimenting with this dataset not only helps candidates prepare for interviews but also hones various skills including preprocessing, feature extraction, and model evaluation.
43
 
44
  This model is a fine-tuned version of [distilbert-base-cased](https://huggingface.co/distilbert-base-cased) on the reuters21578 dataset.
45
- It achieves the following results on the evaluation set:
46
- - Loss: 0.0110
47
- - F1: 0.8629
48
- - Roc Auc: 0.9063
49
- - Accuracy: 0.8196
50
 
51
  ## Inference Example
52
 
@@ -83,15 +81,14 @@ news_article = (
83
  target_topics = ['crude', 'nat-gas']
84
 
85
  fn_kwargs={"padding": "max_length", "truncation": True, "max_length": 512}
86
-
87
  output = pipe(example, function_to_apply="sigmoid", **fn_kwargs)
88
 
89
  for item in output[0]:
90
      if item["score"]>=0.5:
91
          print(item["label"], item["score"])
92
- >>>
93
- crude 0.7355073690414429
94
- nat-gas 0.8600426316261292
95
 
96
  ```
97
 
@@ -105,8 +102,7 @@ for item in output[0]:
105
  | Weighted Average F1 | 0.70 | 0.84 |
106
  | Samples Average F1 | 0.75 | 0.80 |
107
 
108
-
109
- **Precision vs Recall**: Both models prioritize high precision over recall. However, the transformer model achieves a higher micro-averaged F1-score, suggesting it may be better at maintaining a balance between precision and recall.
110
 
111
  **Class Imbalance Handling**: Both models suffer from the same general issue of not performing well on minority classes, as reflected in the low macro-averaged F1-scores. However, the transformer model shows a slight improvement, albeit marginal, in macro-averaged F1-score (0.33 vs 0.29).
112
 
@@ -166,6 +162,225 @@ This notebook establishes a baseline model for text classification on the Reuter
166
  [Reuters Transformer Model](https://github.com/LxYuan0420/nlp/blob/main/notebooks/transformer_reuters.ipynb):
167
  This notebook delves into advanced text classification using a Transformer model on the Reuters-21578 dataset. It covers the implementation details, training process, and performance metrics of using Transformer-based models for this specific task.
168
 
169
  ### Training hyperparameters
170
 
171
  The following hyperparameters were used during training:
 
32
  language:
33
  - en
34
  pipeline_tag: text-classification
35
+ widget:
36
+ - text: "JAPAN TO REVISE LONG-TERM ENERGY DEMAND DOWNWARDS The Ministry of International Trade and Industry (MITI) will revise its long-term energy supply/demand outlook by August to meet a forecast downtrend in Japanese energy demand, ministry officials said. MITI is expected to lower the projection for primary energy supplies in the year 2000 to 550 mln kilolitres (kl) from 600 mln, they said. The decision follows the emergence of structural changes in Japanese industry following the rise in the value of the yen and a decline in domestic electric power demand. MITI is planning to work out a revised energy supply/demand outlook through deliberations of committee meetings of the Agency of Natural Resources and Energy, the officials said. They said MITI will also review the breakdown of energy supply sources, including oil, nuclear, coal and natural gas. Nuclear energy provided the bulk of Japan's electric power in the fiscal year ended March 31, supplying an estimated 27 pct on a kilowatt/hour basis, followed by oil (23 pct) and liquefied natural gas (21 pct), they noted. REUTER"
37
+ example_title: "Example-1"
38
  ---
39
 
40
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
41
  should probably proofread and complete it, then remove this comment. -->
42
 
43
+ ## Motivation
44
 
45
  Fine-tuning on the Reuters-21578 multilabel dataset is a valuable exercise, especially as it's frequently used in take-home tests during interviews. The dataset's complexity is just right for testing multilabel classification skills within a limited timeframe, while its real-world relevance helps simulate practical challenges. Experimenting with this dataset not only helps candidates prepare for interviews but also hones various skills including preprocessing, feature extraction, and model evaluation.
46
 
47
  This model is a fine-tuned version of [distilbert-base-cased](https://huggingface.co/distilbert-base-cased) on the reuters21578 dataset.
48
 
49
  ## Inference Example
50
 
 
81
  target_topics = ['crude', 'nat-gas']
82
 
83
  fn_kwargs={"padding": "max_length", "truncation": True, "max_length": 512}
 
84
  output = pipe(example, function_to_apply="sigmoid", **fn_kwargs)
85
 
86
  for item in output[0]:
87
      if item["score"]>=0.5:
88
          print(item["label"], item["score"])
89
+
90
+ >>> crude 0.7355073690414429
91
+ nat-gas 0.8600426316261292
92
 
93
  ```
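
The filtering step at the end of the inference example can be sketched on its own, without loading the model. This is a minimal sketch: the `select_labels` helper and the `earn` entry are hypothetical, and `sample` only imitates the list of `{"label", "score"}` dicts that the pipeline call above returns in `output[0]`.

```python
def select_labels(scores, threshold=0.5):
    """Return (label, score) pairs at or above the threshold, highest score first."""
    kept = [(item["label"], item["score"]) for item in scores if item["score"] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Stand-in for output[0]; the first two scores are the ones printed in the example above.
sample = [
    {"label": "crude", "score": 0.7355073690414429},
    {"label": "nat-gas", "score": 0.8600426316261292},
    {"label": "earn", "score": 0.01},  # hypothetical low-scoring label
]

for label, score in select_labels(sample):
    print(label, score)
```

Raising `threshold` keeps fewer labels, which is one simple way to favour precision over recall.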
94
 
 
102
  | Weighted Average F1 | 0.70 | 0.84 |
103
  | Samples Average F1 | 0.75 | 0.80 |
104
 
105
+ **Precision vs Recall**: Both models prioritize high precision over recall. In our client-facing news classification model, precision takes precedence over recall. This is because the repercussions of false positives are more severe and harder to justify to clients compared to false negatives. When the model incorrectly tags a news item with a topic, it's challenging to explain this error. On the other hand, if the model misses a topic, it's easier to defend by stating that the topic wasn't sufficiently emphasized in the news article.
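
To illustrate the trade-off described above, the sketch below computes precision and recall for a single topic at two score thresholds. The labels and scores are made up for illustration and do not come from the Reuters evaluation; `precision_recall` is a hypothetical helper.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall for one label at a given score threshold."""
    y_pred = [s >= threshold for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up sigmoid scores for one topic across six articles.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.95, 0.80, 0.55, 0.60, 0.30, 0.10]

for threshold in (0.5, 0.7):
    p, r = precision_recall(y_true, scores, threshold)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f}")
```

On this toy data the higher threshold removes the one false positive at the cost of one missed true label, which is the behaviour a precision-first deployment accepts.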
 
106
 
107
  **Class Imbalance Handling**: Both models suffer from the same general issue of not performing well on minority classes, as reflected in the low macro-averaged F1-scores. However, the transformer model shows a slight improvement, albeit marginal, in macro-averaged F1-score (0.33 vs 0.29).
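
The gap between micro- and macro-averaged F1 follows from how each is computed: micro pools counts across classes, so frequent classes dominate, while macro averages per-class scores, so every rare class weighs as much as a frequent one. A minimal sketch with hypothetical per-class counts (not taken from the reports in this card):

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical (tp, fp, fn) counts: one frequent class and two rare ones.
counts = {
    "frequent": (95, 5, 5),
    "rare_a": (1, 0, 4),
    "rare_b": (0, 0, 5),
}

# Macro: average the per-class F1 scores.
macro_f1 = sum(f1(*c) for c in counts.values()) / len(counts)

# Micro: pool the counts first, then compute a single F1.
tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
micro_f1 = f1(tp, fp, fn)

print(f"macro F1 = {macro_f1:.2f}, micro F1 = {micro_f1:.2f}")
```

Here the two rare classes pull the macro average far below the micro average even though most individual predictions are correct.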
108
 
 
162
  [Reuters Transformer Model](https://github.com/LxYuan0420/nlp/blob/main/notebooks/transformer_reuters.ipynb):
163
  This notebook delves into advanced text classification using a Transformer model on the Reuters-21578 dataset. It covers the implementation details, training process, and performance metrics of using Transformer-based models for this specific task.
164
 
165
+ ## Evaluation results
166
+ <details>
167
+ <summary>Transformer Model Evaluation Result</summary>
168
+
169
+ Classification Report:
170
+ precision recall f1-score support
171
+
172
+ acq 0.97 0.93 0.95 719
173
+ alum 1.00 0.70 0.82 23
174
+ austdlr 0.00 0.00 0.00 0
175
+ barley 1.00 0.50 0.67 12
176
+ bop 0.79 0.50 0.61 30
177
+ can 0.00 0.00 0.00 0
178
+ carcass 0.67 0.67 0.67 18
179
+ cocoa 1.00 1.00 1.00 18
180
+ coconut 0.00 0.00 0.00 2
181
+ coconut-oil 0.00 0.00 0.00 2
182
+ coffee 0.86 0.89 0.87 27
183
+ copper 1.00 0.78 0.88 18
184
+ copra-cake 0.00 0.00 0.00 1
185
+ corn 0.84 0.87 0.86 55
186
+ cornglutenfeed 0.00 0.00 0.00 0
187
+ cotton 0.92 0.67 0.77 18
188
+ cpi 0.86 0.43 0.57 28
189
+ cpu 0.00 0.00 0.00 1
190
+ crude 0.87 0.93 0.90 189
191
+ dfl 0.00 0.00 0.00 1
192
+ dlr 0.72 0.64 0.67 44
193
+ dmk 0.00 0.00 0.00 4
194
+ earn 0.98 0.99 0.98 1087
195
+ fishmeal 0.00 0.00 0.00 0
196
+ fuel 0.00 0.00 0.00 10
197
+ gas 0.80 0.71 0.75 17
198
+ gnp 0.79 0.66 0.72 35
199
+ gold 0.95 0.67 0.78 30
200
+ grain 0.94 0.92 0.93 146
201
+ groundnut 0.00 0.00 0.00 4
202
+ heat 0.00 0.00 0.00 5
203
+ hog 1.00 0.33 0.50 6
204
+ housing 0.00 0.00 0.00 4
205
+ income 0.00 0.00 0.00 7
206
+ instal-debt 0.00 0.00 0.00 1
207
+ interest 0.89 0.67 0.77 131
208
+ inventories 0.00 0.00 0.00 0
209
+ ipi 1.00 0.58 0.74 12
210
+ iron-steel 0.90 0.64 0.75 14
211
+ jet 0.00 0.00 0.00 1
212
+ jobs 0.92 0.57 0.71 21
213
+ l-cattle 0.00 0.00 0.00 2
214
+ lead 0.00 0.00 0.00 14
215
+ lei 0.00 0.00 0.00 3
216
+ linseed 0.00 0.00 0.00 0
217
+ livestock 0.63 0.79 0.70 24
218
+ lumber 0.00 0.00 0.00 6
219
+ meal-feed 0.00 0.00 0.00 17
220
+ money-fx 0.78 0.81 0.80 177
221
+ money-supply 0.80 0.71 0.75 34
222
+ naphtha 0.00 0.00 0.00 4
223
+ nat-gas 0.82 0.60 0.69 30
224
+ nickel 0.00 0.00 0.00 1
225
+ nzdlr 0.00 0.00 0.00 2
226
+ oat 0.00 0.00 0.00 4
227
+ oilseed 0.64 0.61 0.63 44
228
+ orange 1.00 0.36 0.53 11
229
+ palladium 0.00 0.00 0.00 1
230
+ palm-oil 1.00 0.56 0.71 9
231
+ palmkernel 0.00 0.00 0.00 1
232
+ pet-chem 0.00 0.00 0.00 12
233
+ platinum 0.00 0.00 0.00 7
234
+ plywood 0.00 0.00 0.00 0
235
+ pork-belly 0.00 0.00 0.00 0
236
+ potato 0.00 0.00 0.00 3
237
+ propane 0.00 0.00 0.00 3
238
+ rand 0.00 0.00 0.00 1
239
+ rape-oil 0.00 0.00 0.00 1
240
+ rapeseed 0.00 0.00 0.00 8
241
+ reserves 0.83 0.56 0.67 18
242
+ retail 0.00 0.00 0.00 2
243
+ rice 1.00 0.57 0.72 23
244
+ rubber 0.82 0.75 0.78 12
245
+ saudriyal 0.00 0.00 0.00 0
246
+ ship 0.95 0.81 0.87 89
247
+ silver 1.00 0.12 0.22 8
248
+ sorghum 1.00 0.12 0.22 8
249
+ soy-meal 0.00 0.00 0.00 12
250
+ soy-oil 0.00 0.00 0.00 8
251
+ soybean 0.72 0.56 0.63 32
252
+ stg 0.00 0.00 0.00 0
253
+ strategic-metal 0.00 0.00 0.00 11
254
+ sugar 1.00 0.80 0.89 35
255
+ sun-oil 0.00 0.00 0.00 0
256
+ sunseed 0.00 0.00 0.00 5
257
+ tapioca 0.00 0.00 0.00 0
258
+ tea 0.00 0.00 0.00 3
259
+ tin 1.00 0.42 0.59 12
260
+ trade 0.78 0.79 0.79 116
261
+ veg-oil 0.91 0.59 0.71 34
262
+ wheat 0.83 0.83 0.83 69
263
+ wool 0.00 0.00 0.00 0
264
+ wpi 0.00 0.00 0.00 10
265
+ yen 0.57 0.29 0.38 14
266
+ zinc 1.00 0.69 0.82 13
267
+
268
+ micro avg 0.92 0.81 0.86 3694
269
+ macro avg 0.41 0.30 0.33 3694
270
+ weighted avg 0.87 0.81 0.84 3694
271
+ samples avg 0.81 0.80 0.80 3694
272
+
273
+ </details>
274
+
275
+
276
+ <details>
277
+ <summary>Scikit-learn Baseline Model Evaluation Result</summary>
278
+ Classification Report:
279
+ precision recall f1-score support
280
+
281
+ acq 0.98 0.87 0.92 719
282
+ alum 1.00 0.00 0.00 23
283
+ austdlr 1.00 1.00 1.00 0
284
+ barley 1.00 0.00 0.00 12
285
+ bop 1.00 0.30 0.46 30
286
+ can 1.00 1.00 1.00 0
287
+ carcass 1.00 0.06 0.11 18
288
+ cocoa 1.00 0.61 0.76 18
289
+ coconut 1.00 0.00 0.00 2
290
+ coconut-oil 1.00 0.00 0.00 2
291
+ coffee 0.94 0.59 0.73 27
292
+ copper 1.00 0.22 0.36 18
293
+ copra-cake 1.00 0.00 0.00 1
294
+ corn 0.97 0.51 0.67 55
295
+ cornglutenfeed 1.00 1.00 1.00 0
296
+ cotton 1.00 0.06 0.11 18
297
+ cpi 1.00 0.14 0.25 28
298
+ cpu 1.00 0.00 0.00 1
299
+ crude 0.94 0.69 0.80 189
300
+ dfl 1.00 0.00 0.00 1
301
+ dlr 0.86 0.43 0.58 44
302
+ dmk 1.00 0.00 0.00 4
303
+ earn 0.99 0.97 0.98 1087
304
+ fishmeal 1.00 1.00 1.00 0
305
+ fuel 1.00 0.00 0.00 10
306
+ gas 1.00 0.00 0.00 17
307
+ gnp 1.00 0.31 0.48 35
308
+ gold 0.83 0.17 0.28 30
309
+ grain 1.00 0.65 0.79 146
310
+ groundnut 1.00 0.00 0.00 4
311
+ heat 1.00 0.00 0.00 5
312
+ hog 1.00 0.00 0.00 6
313
+ housing 1.00 0.00 0.00 4
314
+ income 1.00 0.00 0.00 7
315
+ instal-debt 1.00 0.00 0.00 1
316
+ interest 0.88 0.40 0.55 131
317
+ inventories 1.00 1.00 1.00 0
318
+ ipi 1.00 0.00 0.00 12
319
+ iron-steel 1.00 0.00 0.00 14
320
+ jet 1.00 0.00 0.00 1
321
+ jobs 1.00 0.14 0.25 21
322
+ l-cattle 1.00 0.00 0.00 2
323
+ lead 1.00 0.00 0.00 14
324
+ lei 1.00 0.00 0.00 3
325
+ linseed 1.00 1.00 1.00 0
326
+ livestock 0.67 0.08 0.15 24
327
+ lumber 1.00 0.00 0.00 6
328
+ meal-feed 1.00 0.00 0.00 17
329
+ money-fx 0.80 0.50 0.62 177
330
+ money-supply 0.88 0.41 0.56 34
331
+ naphtha 1.00 0.00 0.00 4
332
+ nat-gas 1.00 0.27 0.42 30
333
+ nickel 1.00 0.00 0.00 1
334
+ nzdlr 1.00 0.00 0.00 2
335
+ oat 1.00 0.00 0.00 4
336
+ oilseed 0.62 0.11 0.19 44
337
+ orange 1.00 0.00 0.00 11
338
+ palladium 1.00 0.00 0.00 1
339
+ palm-oil 1.00 0.22 0.36 9
340
+ palmkernel 1.00 0.00 0.00 1
341
+ pet-chem 1.00 0.00 0.00 12
342
+ platinum 1.00 0.00 0.00 7
343
+ plywood 1.00 1.00 1.00 0
344
+ pork-belly 1.00 1.00 1.00 0
345
+ potato 1.00 0.00 0.00 3
346
+ propane 1.00 0.00 0.00 3
347
+ rand 1.00 0.00 0.00 1
348
+ rape-oil 1.00 0.00 0.00 1
349
+ rapeseed 1.00 0.00 0.00 8
350
+ reserves 1.00 0.00 0.00 18
351
+ retail 1.00 0.00 0.00 2
352
+ rice 1.00 0.00 0.00 23
353
+ rubber 1.00 0.17 0.29 12
354
+ saudriyal 1.00 1.00 1.00 0
355
+ ship 0.92 0.26 0.40 89
356
+ silver 1.00 0.00 0.00 8
357
+ sorghum 1.00 0.00 0.00 8
358
+ soy-meal 1.00 0.00 0.00 12
359
+ soy-oil 1.00 0.00 0.00 8
360
+ soybean 1.00 0.16 0.27 32
361
+ stg 1.00 1.00 1.00 0
362
+ strategic-metal 1.00 0.00 0.00 11
363
+ sugar 1.00 0.60 0.75 35
364
+ sun-oil 1.00 1.00 1.00 0
365
+ sunseed 1.00 0.00 0.00 5
366
+ tapioca 1.00 1.00 1.00 0
367
+ tea 1.00 0.00 0.00 3
368
+ tin 1.00 0.00 0.00 12
369
+ trade 0.92 0.61 0.74 116
370
+ veg-oil 1.00 0.12 0.21 34
371
+ wheat 0.97 0.55 0.70 69
372
+ wool 1.00 1.00 1.00 0
373
+ wpi 1.00 0.00 0.00 10
374
+ yen 1.00 0.00 0.00 14
375
+ zinc 1.00 0.00 0.00 13
376
+
377
+ micro avg 0.97 0.64 0.77 3694
378
+ macro avg 0.98 0.25 0.29 3694
379
+ weighted avg 0.96 0.64 0.70 3694
380
+ samples avg 0.98 0.74 0.75 3694
381
+ </details>
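
One detail worth noting when comparing the two reports: classes the baseline never predicts show precision 1.00 (and classes with zero support show 1.00 throughout), whereas the transformer report shows 0.00 in the same situations. That pattern is consistent with scikit-learn's `zero_division` argument being set differently in the two notebooks, although the setting itself is not shown here, so this is an assumption. If so, the macro-averaged precision figures (0.98 vs 0.41) are not directly comparable. A minimal sketch with toy labels:

```python
from sklearn.metrics import precision_score

# Toy binary column for one class: three positives exist,
# but the model never predicts the class.
y_true = [1, 1, 1, 0, 0]
y_pred = [0, 0, 0, 0, 0]

# With no predicted positives, precision is 0/0; zero_division sets the reported value.
p_as_zero = precision_score(y_true, y_pred, zero_division=0)
p_as_one = precision_score(y_true, y_pred, zero_division=1)

print(p_as_zero, p_as_one)
```

Recomputing both reports with the same `zero_division` value would make the macro-averaged precision columns comparable.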
382
+
383
+
384
  ### Training hyperparameters
385
 
386
  The following hyperparameters were used during training: