---
language: en
widget:
- text: "It has been determined that the amount of greenhouse gases have decreased by almost half because of the prevalence in the utilization of nuclear power."
---
### Welcome to RoBERTArg!
**Model description**
This model was trained on ~25k heterogeneous, manually annotated sentences ([Stab et al. 2018](https://www.aclweb.org/anthology/D18-1402/)) on controversial topics to classify text into one of two labels: **NON-ARGUMENT** (0) and **ARGUMENT** (1).
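
To illustrate the label mapping, here is a minimal sketch of turning a pair of class logits into a label and a probability. It assumes the classifier returns one logit per class in the order given above (0 = NON-ARGUMENT, 1 = ARGUMENT); the `decode` helper and the example logits are hypothetical and not part of the model's own code.

```python
import math

# Label ids as stated in the model description.
ID2LABEL = {0: "NON-ARGUMENT", 1: "ARGUMENT"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for one pair of class logits."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[idx], probs[idx]


# Hypothetical logits for a single sentence (not real model output).
label, prob = decode([-1.2, 2.3])
print(label, round(prob, 3))
```

In practice the logits would come from the fine-tuned sequence classification head; only the final decoding step is shown here.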
**Dataset**
In the dataset (Stab et al. 2018), a sentence is labeled **ARGUMENT** (\~11k) if it includes a relevant reason for supporting or opposing the topic, and **NON-ARGUMENT** (\~14k) if it does not include such a reason. The authors focus on controversial topics, i.e., topics that include "an obvious polarity to the possible outcomes", and compile a final set of eight controversial topics: _abortion, school uniforms, death penalty, marijuana legalization, nuclear energy, cloning, gun control, and minimum wage_.
| TOPIC | ARGUMENT | NON-ARGUMENT |
|----|----|----|
| abortion | 2,213 | 2,427 |
| school uniforms | 325 | 1,734 |
| death penalty | 325 | 2,083 |
| marijuana legalization | 325 | 1,262 |
| nuclear energy | 325 | 2,118 |
| cloning | 325 | 1,494 |
| gun control | 325 | 1,889 |
| minimum wage | 325 | 1,346 |
**Model training**
**RoBERTArg** was fine-tuned from the pre-trained RoBERTa (base) model from Hugging Face, using the Hugging Face `Trainer` with the following hyperparameters:
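```
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",  # placeholder path, not part of the original configuration
    num_train_epochs=2,
    learning_rate=2.3102e-06,
    seed=8,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
)
```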
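
As a rough sanity check on the training budget, the sketch below estimates how many optimizer steps these settings imply. The dataset size (~25k sentences) comes from the model description; the 80% training share, a single device, and no gradient accumulation are assumptions, and `optimizer_steps` is a hypothetical helper rather than anything from the training script.

```python
import math


def optimizer_steps(n_train, batch_size, epochs, n_devices=1):
    """Optimizer steps for a run, assuming no gradient accumulation
    and that the last partial batch of each epoch is kept."""
    steps_per_epoch = math.ceil(n_train / (batch_size * n_devices))
    return steps_per_epoch * epochs


# Assumptions (not stated in the card): ~25k sentences in total,
# 80% of them used for training, and a single device.
n_train = 25_000 * 80 // 100
total_steps = optimizer_steps(n_train, batch_size=64, epochs=2)
print(total_steps)
```

Under these assumptions the run is short (a few hundred updates), which is worth keeping in mind when reading the small learning rate.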
**Evaluation**
The model was evaluated on an evaluation set (20% of the data):
| Model | Accuracy | F1 (ARG) | Recall (ARG) | Recall (NON) | Precision (ARG) | Precision (NON) |
|----|----|----|----|----|----|----|
| RoBERTArg | 0.8193 | 0.8021 | 0.8463 | 0.7986 | 0.7623 | 0.8719 |
The **confusion matrix** on the same evaluation set (rows are actual labels, columns are predicted labels):
| | Predicted NON-ARGUMENT | Predicted ARGUMENT |
|----|----|----|
| Actual NON-ARGUMENT | 2213 | 558 |
| Actual ARGUMENT | 325 | 1790 |
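
The reported metrics can be recomputed from the four counts in the confusion matrix. The sketch below assumes the counts are read with rows as actual labels and columns as predicted labels, in label order NON-ARGUMENT (0), ARGUMENT (1); under that reading the results agree with the evaluation table to within a rounding difference in the fourth decimal place.

```python
# Counts from the confusion matrix. Assumed orientation: rows = actual,
# columns = predicted, label order NON-ARGUMENT (0), ARGUMENT (1).
tn, fp = 2213, 558   # actual NON-ARGUMENT: predicted NON / predicted ARG
fn, tp = 325, 1790   # actual ARGUMENT:     predicted NON / predicted ARG

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall_arg = tp / (tp + fn)
precision_arg = tp / (tp + fp)
recall_non = tn / (tn + fp)
precision_non = tn / (tn + fn)
# F1 for the ARGUMENT class (harmonic mean of its precision and recall).
f1_arg = 2 * precision_arg * recall_arg / (precision_arg + recall_arg)

metrics = {
    "accuracy": accuracy,
    "f1_arg": f1_arg,
    "recall_arg": recall_arg,
    "recall_non": recall_non,
    "precision_arg": precision_arg,
    "precision_non": precision_non,
}
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The evaluation set has 2,771 NON-ARGUMENT and 2,115 ARGUMENT sentences under this reading, so the higher recall on ARGUMENT comes at the cost of lower precision for that class.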
**Intended Uses & Potential Limitations**
The model is only a starting point for exploring the field of argument mining. An argument is a complex structure with multiple dependencies, so the model may perform worse on topics and text types that were not included in the training set.
Enjoy and stay tuned!
Twitter: [@chklamm](http://twitter.com/chklamm)