It may be time to rethink our approach to moderation

Sep 30, 2018

Wil Wheaton recently left Twitter and gave Mastodon a try, but it didn’t last very long. The “long story short” version is that as soon as he joined an instance and started posting (or “tooting,” as the Mastodon kids call it), people started spamming the “report this post” function until the instance’s admin had to remove him in an attempt to stop the flood of report spam. There’s an interesting blog post by Nolan Lawson that looks at this as an attack or harassment vector, and it got me thinking about some changes to community-level moderation tools that might help stem this sort of abuse.

One thing that makes this such a problem is that, as an attack, it doesn’t require coordination amongst the “attackers.” Anyone with a problem with somebody posting their thoughts and opinions online can flag that person’s posts and contribute to those posts being less visible, or to penalties against that person’s account. The more such reports that come in, the more pressure there is on a moderator to find something wrong with the post in question; after all, the default assumption is likely that multiple good actors can’t all be wrong. The problem is that online discourse (some may argue all discourse, but I’m focusing on online) has degraded so much that you can’t assume all actors are “good.” Some are malicious (reporting posts to troll someone they don’t like), and others are petty (reporting any post they don’t agree with, regardless of whether or not it violates the rules). Moderation tools are likely going to need to evolve to address this problem.

My first instinct was that it’s too easy to report posts on social applications, and my suggestion was going to be making users state which rule or policy the post breaks, but they already have to do that. This leads me to believe there aren’t strong enough penalties for abusing the reporting system. Just like people who repeatedly post content against the rules should face increasing penalties up to and including permabans, so should people who abuse reporting on posts.

For a system that punishes abusing the reporting system to work, there are a few things that need to be in place first. The most important thing is that the app’s rules need to be clear and unambiguous. There should be no question whether or not a post violates a rule. This makes it easier to determine when someone is making a report in bad faith, and reduces the role of individual moderator judgement in deciding what violates the rules and what doesn’t (which is what drives claims of bias in social apps).

Once you have clear, unambiguous rules that make it easy to tell what content isn’t allowed, the next step needs to be penalties for people who repeatedly flag acceptable posts. Given that there should be very little confusion about what is and isn’t against the rules, only a small percentage of posts should be “borderline” enough that a report is rejected but isn’t blatant enough to assume that whoever flagged the post was acting maliciously. For everything else, reporting more than a certain number of acceptable posts should instead trigger penalties for the reporters. At that point we can safely assume they’re causing more problems for both the moderators and the community than they’re worth.

Ideally, these penalties would be automatically applied (making a moderator manually apply them is a terrible user experience). Moderators also need a way to flag each report as being on a post that was against the rules, on a post that was borderline against the rules (meaning the report should be considered a legitimate, good-faith claim that the post in question might have been against the rules, even if that wasn’t the moderator’s final decision), or on a valid post (the reporting user should have known the post wasn’t against the rules and is behaving badly). This enables moderation tools to act against people who abuse the reporting system while giving moderators an option to avoid punishing users they think are trying to follow the app’s rules.
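To make that a little more concrete, here’s a rough sketch in Python of how the three outcomes and the automatic penalties could fit together. Everything in it is my own invention for illustration: the names (Verdict, ReporterLedger, PENALTY_LADDER), the penalty labels, and especially the thresholds, which a real community would need to tune. It isn’t based on how Mastodon or any other app actually works.

```python
from collections import Counter
from enum import Enum


class Verdict(Enum):
    """The three outcomes a moderator can assign to a report."""
    VIOLATION = "violation"    # the post broke a rule
    BORDERLINE = "borderline"  # good-faith report, but the post stands
    VALID_POST = "valid_post"  # post was clearly fine; counts against the reporter


# Hypothetical escalation ladder: (strike count, penalty), checked highest first.
PENALTY_LADDER = [(10, "permaban"), (5, "reporting_suspended"), (3, "warning")]


class ReporterLedger:
    """Tracks how many reports on clearly valid posts each user has filed."""

    def __init__(self):
        self.strikes = Counter()

    def record(self, reporter_id, verdict):
        """Store a moderator's verdict and return the reporter's current penalty."""
        # Only reports on clearly valid posts count as strikes; borderline
        # and upheld reports never penalize the reporter.
        if verdict is Verdict.VALID_POST:
            self.strikes[reporter_id] += 1
        return self.penalty_for(reporter_id)

    def penalty_for(self, reporter_id):
        """Return the harshest penalty the reporter has earned, or None."""
        count = self.strikes[reporter_id]
        for threshold, penalty in PENALTY_LADDER:
            if count >= threshold:
                return penalty
        return None
```

The moderator only ever picks one of the three verdicts; the ledger decides when a penalty kicks in, so nobody has to apply one by hand. Borderline reports cost the reporter nothing, which is the escape hatch for users who are honestly trying to follow the rules.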

Another thing needed to help deal with report abuse is moderation tools that group reports made for the same reason on the same post (this may be a thing now; I’m not a moderator on a social app, so I don’t know). These “attacks” rely on basically flooding moderators with the same thing. Grouping what are effectively duplicate reports so that a moderator can act on all of them at once not only lets moderators work through their report queues efficiently, it also reduces the impact of bad actors flooding the system with bogus reports, since everything after the first report is aggregated with it.
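The grouping itself doesn’t need to be complicated. Here’s a minimal sketch in Python, assuming each report records who filed it, which post it targets, and which rule it claims was broken. The Report fields and the group_reports function are hypothetical names I made up for this example, not part of any real moderation tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    reporter_id: str
    post_id: str
    rule: str  # the rule the reporter claims the post broke


def group_reports(reports):
    """Collapse reports into one queue item per (post, rule) pair.

    Returns a dict mapping (post_id, rule) to the list of reporter ids,
    so a single moderator decision can be applied to every reporter in it.
    """
    groups = defaultdict(list)
    for report in reports:
        groups[(report.post_id, report.rule)].append(report.reporter_id)
    return dict(groups)


# Three reports on the same post for the same reason become one queue item.
queue = group_reports([
    Report("alice", "post-1", "spam"),
    Report("bob", "post-1", "spam"),
    Report("carol", "post-1", "spam"),
    Report("dave", "post-2", "harassment"),
])
```

However many people pile onto a post, the moderator sees one item per post and reason, and one decision covers everyone who filed it.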

Post reporting features were originally built with the assumption that the bad actors on an app were the ones posting, and that the only people with a reason to use those features were legitimate users trying to clean up the site. Now that people have realized they can use the report feature to effectively ban users who haven’t done anything wrong, it’s time to stop assuming that everyone who reports a post is pure and altruistic, and to give moderators the ability to punish users who abuse the report process. Otherwise, we can expect to see more people banned as moderators have no choice but to cave to “report attacks” in order to preserve their sanity and stop the abuse from reporters.