import streamlit as st

title = "Hate Speech in ACM"
description = "The history and development of hate speech detection as a modeling task"
date = "2022-01-26"
thumbnail = "images/prohibited.png"
__ACM_SECTION = """ | |
Content moderation is a collection of interventions used by online platforms to partially obscure | |
or remove entirely from user-facing view content that is objectionable based on the company's values | |
or community guidelines, which vary from platform to platform. | |
[Sarah T. Roberts (2014)](https://yalebooks.yale.edu/book/9780300261479/behind-the-screen/) describes | |
content moderation as "the organized practice of screening user-generated content (UGC) | |
posted to Internet sites, social media, and other online outlets" (p. 12). | |
[Tarleton Gillespie (2021)](https://yalebooks.yale.edu/book/9780300261431/custodians-internet/) writes | |
that platforms moderate content "both to protect one user from another, | |
or one group from its antagonists, and to remove the offensive, vile, or illegal.'' | |
While there are a variety of approaches to this problem, in this tool, we focus on automated content moderation, | |
which is the application of algorithms to the classification of problematic content. | |
Content that is subject to moderation can be user-directed (e.g. targeted harassment of a particular user | |
in comments or direct messages) or posted to a personal account (e.g. user-created posts that contain hateful | |
remarks against a particular social group). | |
""" | |
__CURRENT_APPROACHES = """
Automated content moderation has relied both on analysis of the media itself (e.g. using methods from natural
language processing and computer vision) and on user dynamics (e.g. whether the user sending the content
to another user shares followers with the recipient, or whether the user posting the content is a relatively new account).
Often, the ACM pipeline is fed by user-reported content. Within the realm of text-based ACM, approaches vary
from wordlist-based filters to data-driven machine learning models. Common datasets used for training and
evaluating hate speech detectors can be found at [https://hatespeechdata.com/](https://hatespeechdata.com/).
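
As a rough illustration of the gap between these two families of text-based approaches, the sketch below
contrasts a minimal wordlist filter with a model-based classifier. The blocklist contents, the score threshold,
and the `unitary/toxic-bert` checkpoint are illustrative assumptions rather than recommendations:

```python
from transformers import pipeline

# Wordlist approach: flag a post if it contains any blocked term, regardless of context.
# The terms here are placeholders; production blocklists are much larger and curated.
BLOCKLIST = {"slur1", "slur2"}

def wordlist_flag(text: str) -> bool:
    return any(token in BLOCKLIST for token in text.lower().split())

# Data-driven approach: a trained classifier scores the whole utterance.
# "unitary/toxic-bert" is one publicly available toxicity checkpoint; any model served
# through the text-classification pipeline could be substituted.
toxicity = pipeline("text-classification", model="unitary/toxic-bert")

def model_flag(text: str, threshold: float = 0.5) -> bool:
    prediction = toxicity(text)[0]
    return prediction["label"] == "toxic" and prediction["score"] >= threshold
```

The wordlist variant is transparent and cheap but blind to context; the model variant generalizes beyond an
explicit list but inherits whatever worldview its training data encodes, which is the failure mode discussed
in the next section.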
""" | |
__CURRENT_CHALLENGES = """
Combating hateful content on the Internet continues to be a challenge. A 2021 survey of respondents
in the United States, conducted by the Anti-Defamation League, found an increase in online hate & harassment
directed at LGBTQ+, Asian American, Jewish, and African American individuals.
### Technical challenges for data-driven systems
With respect to models that are based on training data, datasets encode worldviews, and so a common challenge
lies in having insufficient data or data that only reflects a limited worldview. For example,
[a recent study](https://link.springer.com/article/10.1007/s12119-020-09790-w) found that Tweets posted by
drag queens were more often rated by an automated system as toxic than Tweets posted by white supremacists.
This may be due, in part, to the labeling schemes and choices made for the data used to train the model,
as well as to the particular company policies that are invoked when making these labeling choices.
If annotation guidelines treat certain surface forms, such as reclaimed slurs or profanity, as toxic regardless
of who is speaking and in what context, then in-group uses of that language are labeled as hateful, and models
trained on those labels reproduce that judgment at scale.
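Annotation aggregation choices can compound this. As a minimal, hypothetical sketch (the labels and vote
counts below are invented for illustration), simple majority voting keeps only the most common annotator
label, so the judgment of annotators familiar with a community's in-group language can be discarded entirely
before training even begins:

```python
from collections import Counter

# Hypothetical annotations for one post that uses reclaimed in-group language.
# 1 = "toxic", 0 = "not toxic"; three out-group annotators, two in-group annotators.
annotations = [1, 1, 1, 0, 0]

# Majority vote keeps only the most common label; the in-group reading is dropped.
gold_label = Counter(annotations).most_common(1)[0][0]
print(gold_label)  # 1 -> the post enters the training set labeled "toxic"
```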
### Context matters for content moderation
*Counterspeech* is "any direct response to hateful or harmful speech which seeks to undermine it"
(from the [Dangerous Speech Project](https://dangerousspeech.org/counterspeech/)). Counterspeech has been shown
to be an important community self-moderation tool for reducing instances of hate speech (see
[Hangartner et al. 2021](https://www.pnas.org/doi/10.1073/pnas.2116310118)), but it is often
incorrectly categorized as hate speech by automatic systems because it directly references or quotes
the original hate speech. Such system behavior silences those who are trying to push back against
hateful and toxic speech and, if the flagged content is hidden automatically, prevents others from seeing the
counterspeech.
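To make this lexical-overlap failure mode concrete, the toy example below (the blocklist term and both
messages are invented) shows how a match-based filter flags counterspeech that quotes the original post just
as readily as the post itself:

```python
import re

# Placeholder dehumanizing term; real systems track many such expressions.
BLOCKLIST = {"vermin"}

def wordlist_flag(text: str) -> bool:
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(token in BLOCKLIST for token in tokens)

hate = "Group X are vermin."
counter = 'Calling Group X "vermin" is dehumanizing and wrong.'

print(wordlist_flag(hate))     # True
print(wordlist_flag(counter))  # True -- the counterspeech is flagged as well
```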
See [van Aken et al. 2018](https://aclanthology.org/W18-5105.pdf) for a detailed list of examples that
automatic systems frequently misclassify.
""" | |
__SELF_EXAMPLES = """
- [**(FB)(TOU)** - *Facebook Community Standards*](https://transparency.fb.com/policies/community-standards/)
- [**(FB)(Blog)** - *What is Hate Speech? (2017)*](https://about.fb.com/news/2017/06/hard-questions-hate-speech/)
- [**(NYT)(Blog)** - *New York Times on their partnership with Jigsaw*](https://open.nytimes.com/to-apply-machine-learning-responsibly-we-use-it-in-moderation-d001f49e0644)
- [**(NYT)(FAQ)** - *New York Times on their moderation policy*](https://help.nytimes.com/hc/en-us/articles/115014792387-Comments)
- [**(Reddit)(TOU)** - *Reddit General Content Policies*](https://www.redditinc.com/policies/content-policy)
- [**(Reddit)(Blog)** - *AutoMod - helps scale moderation without ML*](https://mods.reddithelp.com/hc/en-us/articles/360008425592-Moderation-Tools-overview)
- [**(Google)(Blog)** - *Google Search Results Moderation*](https://blog.google/products/search/when-and-why-we-remove-content-google-search-results/)
- [**(Google)(Blog)** - *Jigsaw Case Studies*](https://www.perspectiveapi.com/case-studies/)
- [**(YouTube)(TOU)** - *YouTube Community Guidelines*](https://www.youtube.com/howyoutubeworks/policies/community-guidelines/)
"""
__CRITIC_EXAMPLES = """
- [Social Media and Extremism - Questions about January 6th 2021](https://thehill.com/policy/technology/589651-jan-6-panel-subpoenas-facebook-twitter-reddit-and-alphabet/)
- [Over-Moderation of LGBTQ content on YouTube](https://www.gaystarnews.com/article/youtube-lgbti-content/)
- [Disparate Impacts of Moderation](https://www.aclu.org/news/free-speech/time-and-again-social-media-giants-get-content-moderation-wrong-silencing-speech-about-al-aqsa-mosque-is-just-the-latest-example/)
- [Calls for Transparency](https://santaclaraprinciples.org/)
- [Income Loss from Failures of Moderation](https://foundation.mozilla.org/de/blog/facebook-delivers-a-serious-blow-to-tunisias-music-scene/)
- [Fighting Hate Speech, Silencing Drag Queens?](https://link.springer.com/article/10.1007/s12119-020-09790-w)
- [Reddit Self-Reflection on Lack of Content Policy](https://www.reddit.com/r/announcements/comments/gxas21/upcoming_changes_to_our_content_policy_our_board/)
"""

def run_article():
    """Render the ACM article: definition, current approaches, challenges, and reading lists."""
    st.markdown("## Automatic Content Moderation (ACM)")
    with st.expander("ACM definition", expanded=False):
        st.markdown(__ACM_SECTION, unsafe_allow_html=True)
    st.markdown("## Current approaches to ACM")
    with st.expander("Current Approaches"):
        st.markdown(__CURRENT_APPROACHES, unsafe_allow_html=True)
    st.markdown("## Current challenges in ACM")
    with st.expander("Current Challenges"):
        st.markdown(__CURRENT_CHALLENGES, unsafe_allow_html=True)
    st.markdown("## Examples of ACM in Use: in the Press and in their own Words")
    # Two columns: the platforms' own policies and blog posts on the left, critical writings on the right.
    col1, col2 = st.columns([4, 5])
    with col1.expander("In their own Words"):
        st.markdown(__SELF_EXAMPLES, unsafe_allow_html=True)
    with col2.expander("Critical Writings"):
        st.markdown(__CRITIC_EXAMPLES, unsafe_allow_html=True)