pheinisch
/

MediaFrame-Roberta-recall

@@ -13,12 +13,33 @@ tags:
 - RoBERTa-base
 ---
 A model for predicting a subset of MediaFrames for a given argument (the argument does not need to be structured into premise/conclusion or any other scheme). For the generic frame classes, see [The Media Frames Corpus: Annotations of Frames Across Issues](https://aclanthology.org/P15-2072/).
 This model was fine-tuned on the data provided by [this paper](https://aclanthology.org/P15-2072/). To be precise, we did the following:
 > To apply these frames to arguments from DDO, we fine-tune a range of classifiers on a comprehensive training dataset of more than 10,000 newspaper articles that discuss immigration, same-sex marriage, and marijuana, containing 146,001 labeled text spans labeled with a single MediaFrame-class per annotator. To apply this dataset to our argumentative domain, we broaden the annotated spans to sentence level (see [here](https://www.degruyter.com/document/doi/10.1515/itit-2020-0054/html)). Since an argument can address more than a single frame, we design the argument-frame classification task as a multi-label problem by combining all annotations for a sentence into a frame target set. In addition, to broaden the target frame sets, we create new instances merging two instances by combining their textual representation and unifying their target frame set.
 On the test split of this composed dataset, we measure the following performance:
 ````txt

 - RoBERTa-base
 ---
+# Model for predicting MediaFrames on arguments
 A model for predicting a subset of MediaFrames for a given argument (the argument does not need to be structured into premise/conclusion or any other scheme). For the generic frame classes, see [The Media Frames Corpus: Annotations of Frames Across Issues](https://aclanthology.org/P15-2072/).
 This model was fine-tuned on the data provided by [this paper](https://aclanthology.org/P15-2072/). To be precise, we did the following:
 > To apply these frames to arguments from DDO, we fine-tune a range of classifiers on a comprehensive training dataset of more than 10,000 newspaper articles that discuss immigration, same-sex marriage, and marijuana, containing 146,001 labeled text spans labeled with a single MediaFrame-class per annotator. To apply this dataset to our argumentative domain, we broaden the annotated spans to sentence level (see [here](https://www.degruyter.com/document/doi/10.1515/itit-2020-0054/html)). Since an argument can address more than a single frame, we design the argument-frame classification task as a multi-label problem by combining all annotations for a sentence into a frame target set. In addition, to broaden the target frame sets, we create new instances merging two instances by combining their textual representation and unifying their target frame set.
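The dataset construction quoted above can be sketched roughly as follows. This is a minimal illustration, not the authors' preprocessing code: the function names, the toy sentences, and the choice to join merged texts with a single space are all assumptions, and the frame labels are used for illustration only.

```python
from collections import defaultdict


def build_frame_targets(annotations):
    """Combine all (sentence, frame) annotations into one frame set per sentence."""
    targets = defaultdict(set)
    for sentence, frame in annotations:
        targets[sentence].add(frame)
    return dict(targets)


def merge_instances(first, second):
    """Merge two (text, frame_set) instances.

    Assumption: texts are joined with a space; the source only says the
    textual representations are combined and the frame sets are unified.
    """
    text_a, frames_a = first
    text_b, frames_b = second
    return text_a + " " + text_b, frames_a | frames_b


# Hypothetical toy annotations: one frame label per annotator and sentence.
annotations = [
    ("Legalization would raise tax revenue.", "Economic"),
    ("Legalization would raise tax revenue.", "Policy Prescription and Evaluation"),
    ("The ban violates personal freedom.", "Fairness and Equality"),
]

targets = build_frame_targets(annotations)
instances = list(targets.items())
merged_text, merged_frames = merge_instances(instances[0], instances[1])
```

Here the first sentence ends up with a two-element target set, and the merged instance carries the union of all three frames, which is what makes the task multi-label.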
+## Training arguments used for fine-tuning
+````txt
+per_device_train_batch_size=16,
+per_device_eval_batch_size=64,
+group_by_length=False,
+evaluation_strategy="epoch",
+num_train_epochs=5,
+save_strategy="epoch",
+load_best_model_at_end=True,
+save_total_limit=3,
+metric_for_best_model="eval_macro avg -> f1-score",
+greater_is_better=True,
+learning_rate=5e-5,
+warmup_ratio=0.1
+````
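The metric name `eval_macro avg -> f1-score` suggests that a nested metrics report (for example, the dictionary form of a scikit-learn classification report) was flattened with ` -> ` as the separator before the trainer added its `eval_` prefix. That reading is an assumption, not something the model card states. A minimal sketch of such a flattening step, with a hypothetical helper name and placeholder values that are not measured results:

```python
def flatten_report(report, sep=" -> "):
    """Flatten a nested metrics dict into a single level with joined keys."""
    flat = {}
    for key, value in report.items():
        if isinstance(value, dict):
            # Recurse and prepend the parent key to every nested key.
            for sub_key, sub_value in flatten_report(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


# Placeholder report shaped like a nested classification report (values are made up).
report = {
    "macro avg": {"precision": 0.5, "recall": 0.5, "f1-score": 0.5, "support": 4},
    "accuracy": 0.5,
}

flat = flatten_report(report)
# Evaluation metrics are reported with an "eval_" prefix.
metric_key = "eval_" + "macro avg -> f1-score"
```

With `greater_is_better=True` and `load_best_model_at_end=True`, the checkpoint with the highest value under that key would be the one restored after training.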
+## Performance
 On the test split of this composed dataset, we measure the following performance:
 ````txt