---
language:
- en
metrics:
- f1
pipeline_tag: text-classification
tags:
- classification
- framing
- MediaFrames
- argument classification
- multilabel
- RoBERTa-base
---
|
|
|
# Model for predicting MediaFrames on arguments
|
|
|
A model for predicting a subset of MediaFrames given an argument (the argument does not have to be structured into a premise/conclusion pair or any other scheme). For an overview of the generic frame classes, have a look at [The Media Frames Corpus: Annotations of Frames Across Issues](https://aclanthology.org/P15-2072/).
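A minimal usage sketch (assuming the model is loaded through the `transformers` Auto classes and that predicted frames are obtained by applying a per-class sigmoid with a 0.5 threshold; the repository ID below is a placeholder):

````python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder: replace with the actual repository ID of this model
MODEL_ID = "<this-model-repo-id>"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

argument = "Legalizing marijuana would free up police resources for violent crime."
inputs = tokenizer(argument, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label decoding: sigmoid per frame class, 0.5 threshold (assumed)
probs = torch.sigmoid(logits)[0]
predicted_frames = [
    model.config.id2label[i] for i, p in enumerate(probs.tolist()) if p > 0.5
]
print(predicted_frames)
````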
|
|
|
The model was fine-tuned on the data provided by [this paper](https://aclanthology.org/P15-2072/). To be precise, we did the following:
|
|
|
> To apply these frames to arguments from DDO, we fine-tune a range of classifiers on a comprehensive training dataset of more than 10,000 newspaper articles that discuss immigration, same-sex marriage, and marijuana, containing 146,001 text spans, each labeled with a single MediaFrame class per annotator. To apply this dataset to our argumentative domain, we broaden the annotated spans to sentence level (see [here](https://www.degruyter.com/document/doi/10.1515/itit-2020-0054/html)). Since an argument can address more than a single frame, we design the argument-frame classification task as a multi-label problem by combining all annotations for a sentence into a frame target set. In addition, to broaden the target frame sets, we create new instances by merging two instances, combining their textual representation and unifying their target frame sets.
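The exact preprocessing code is not shown here, but a rough sketch of the two steps described above (collecting all annotations of a sentence into one frame target set, then merging pairs of instances) could look as follows; the record layout, the example frame names, and the random pairing strategy are assumptions:

````python
import random

# Hypothetical per-annotator annotations after broadening spans to sentence level
annotations = [
    {"sentence": "Immigrants contribute to the economy.", "frame": "Economic"},
    {"sentence": "Immigrants contribute to the economy.", "frame": "Capacity and Resources"},
    {"sentence": "The ban violates basic rights.", "frame": "Legality, Constitutionality and Jurisprudence"},
    {"sentence": "Voters will punish the party for this.", "frame": "Political"},
]

# Step 1: combine all annotations of a sentence into a multi-label frame target set
frame_sets = {}
for record in annotations:
    frame_sets.setdefault(record["sentence"], set()).add(record["frame"])
instances = [{"text": text, "frames": frames} for text, frames in frame_sets.items()]

# Step 2: create additional instances by merging two instances,
# concatenating their texts and unifying their frame target sets
random.seed(0)
shuffled = random.sample(instances, len(instances))
merged = [
    {"text": a["text"] + " " + b["text"], "frames": a["frames"] | b["frames"]}
    for a, b in zip(shuffled[0::2], shuffled[1::2])
]

dataset = instances + merged
````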
|
|
|
## Training arguments used for fine-tuning
|
|
|
````txt
per_device_train_batch_size=16,
per_device_eval_batch_size=64,
group_by_length=False,
evaluation_strategy="epoch",
num_train_epochs=5,
save_strategy="epoch",
load_best_model_at_end=True,
save_total_limit=3,
metric_for_best_model="eval_macro avg -> f1-score",
greater_is_better=True,
learning_rate=5e-5,
warmup_ratio=0.1
````
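For context, a sketch of how these arguments might be plugged into a Hugging Face `Trainer` for the multi-label setup; the output directory, the number of frame classes, the sigmoid threshold, the `compute_metrics` flattening, and the tokenized datasets are assumptions, not the exact training code:

````python
import numpy as np
from sklearn.metrics import classification_report
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

NUM_FRAMES = 15  # assumed number of MediaFrame classes

# Multi-label head: the Trainer then uses a BCE-with-logits loss per frame class
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=NUM_FRAMES,
    problem_type="multi_label_classification",
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = (1.0 / (1.0 + np.exp(-logits))) > 0.5  # sigmoid + 0.5 threshold (assumed)
    report = classification_report(
        labels.astype(int), preds.astype(int), output_dict=True, zero_division=0
    )
    # Flatten nested keys, e.g. "macro avg -> f1-score", so that
    # metric_for_best_model="eval_macro avg -> f1-score" can refer to them
    return {
        f"{avg} -> {name}": value
        for avg, scores in report.items()
        if isinstance(scores, dict)
        for name, value in scores.items()
    }

training_args = TrainingArguments(
    output_dir="mediaframes-roberta",  # assumed output path
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    group_by_length=False,
    evaluation_strategy="epoch",
    num_train_epochs=5,
    save_strategy="epoch",
    load_best_model_at_end=True,
    save_total_limit=3,
    metric_for_best_model="eval_macro avg -> f1-score",
    greater_is_better=True,
    learning_rate=5e-5,
    warmup_ratio=0.1,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # tokenized multi-label datasets assumed to exist
    eval_dataset=dev_dataset,
    compute_metrics=compute_metrics,
)
trainer.train()
````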
|
|
|
## Performance
|
|
|
On the test split of this composed dataset, we measure the following scores:
|
|
|
````txt
"test_macro avg -> f1-score": 0.7323500703250138,
"test_macro avg -> precision": 0.7240108073952866,
"test_macro avg -> recall": 0.7413112856192988,
"test_macro avg -> support": 27705,
"test_micro avg -> f1-score": 0.7956475205137353,
"test_micro avg -> precision": 0.7865279492153059,
"test_micro avg -> recall": 0.804981050351922,
"test_micro avg -> support": 27705,
````