X marks either the bike shed, or the event-sourced Wikipedia

General Commentary, Shenanigans

Apr 30 2024

Chelsea Troy wrote a fantastic article last September titled “What do we do with the Twitter-shaped hole in the internet?” While I don’t subscribe to the premise that Twitter is “crashing,” there were about 2 articles’ worth of good reading in 1 URL, so it bears a lot of consideration. (It’s worth noting that I have a much smaller follower count, get less activity on my tweets, and generally tweet less original stuff vs. just retweeting, so my mileage clearly varies.) Troy does a good job of discussing the things Twitter does well and the areas where it’s historically been weak from a fundamental “this is how the app was intended to run” perspective, and she offers good comparisons and contrasts with other communications apps and their designed limitations. She follows up with a great post walking through a potential design for a hypothetical new social application that would be very appealing. But the part that really sticks out as particularly interesting is the discussion, in the course of designing that hypothetical application, of how to identify and promote quality.

The key concept around Troy’s application design is a setup where people create an account and subscribe to or follow 1 or more topics. Her next step is developing ways to identify and designate certain accounts as being reputable accounts to follow for high quality, accurate content. Her comments on promoting quality content are built around what she terms “the hijackability of peer recommendation systems” (as inspired by the Boaty McBoatface naming poll in the UK).

The goal is to have a set of accounts in each given topic that are considered reputable and thus get boosted within that topic. There are some benefits to this setup – first and foremost it prioritizes recommending people who know and understand the topic over people with the most entertaining opinions on the topic. It also avoids automatically recommending them for other topics by proxy because they’re considered knowledgeable in 1 domain. In fact, it actually lets people follow accounts for some content but not other content (i.e. I want to follow hypothetical person {X} for their opinions on software engineering, but I don’t really care about their fascination with the PGA – in other words, consumer-side content filtering).
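To make the consumer-side filtering idea concrete, here is a minimal sketch in Python. It assumes follows are stored as (account, topic) pairs rather than as bare accounts. The `Post` type, the `build_feed` function, and the sample data are all hypothetical names of mine, not anything from Troy's design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str
    topic: str
    text: str


def build_feed(posts: list[Post], follows: set[tuple[str, str]]) -> list[Post]:
    """Keep only posts whose (author, topic) pair the reader follows.

    Following an author on one topic says nothing about their other topics.
    """
    return [p for p in posts if (p.author, p.topic) in follows]


# Hypothetical sample data: person X posts on two topics.
posts = [
    Post("X", "software engineering", "Thoughts on code review"),
    Post("X", "PGA", "Thoughts on the weekend leaderboard"),
]

# The reader follows X for software engineering only.
follows = {("X", "software engineering")}
feed = build_feed(posts, follows)
```

With this data, the feed holds X's software engineering post and skips the golf post, even though both come from the same account.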

My concern with this is that there isn’t a good way to determine the level of expertise needed to decide whether to recommend someone to the userbase at large at the application level. However, there are highly visible proxies that are dangerously tempting to use, such as degrees, accreditations, and job history (what Troy refers to early on in her article as “credibility-by-pedigree”). The first (and biggest) problem with these proxies is that they effectively serve as a means of gatekeeping the next generation of people like Troy who are incredibly smart but don’t have a resume with a “fancy” school or employer on it when they’re starting to build an online reputation. The other concern is that these proxies are being overweighted – perceptions of major societal institutions have been trending down, which means the signal being proposed to indicate reputability is increasingly considered noise by the very people Troy is proposing boosting this content to.

Troy does claim that she wants a system that avoids using these proxies, but doesn’t really define how to do that. To be fair, I don’t have a better idea either, because the things that most accurately determine reputability (are they generally right?) lie outside these applications and their purview. Troy’s thoughts do ultimately lead to having everyone provide signals about who they consider reliable, with “reputable” users having more weight, and eventually triggering a manual review by subject matter experts when an account’s reputability weight crosses a certain threshold. The manual review is intended to address potential system gaming, but again, it can just as easily turn into gatekeeping someone out of “reputable” status because they’re not “1 of us.”
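For anyone who wants to see the moving parts, here is a rough Python sketch of that flow as I read it: everyone can endorse an account within a topic, endorsements from already-reputable accounts count for more, and crossing a threshold queues the account for manual review. The weights, the threshold, and every name in it are assumptions of mine for illustration; the article doesn't specify any of them.

```python
from dataclasses import dataclass, field

# Hypothetical numbers; the article does not give weights or a threshold.
REPUTABLE_WEIGHT = 3.0   # endorsement from an already-reputable account
DEFAULT_WEIGHT = 1.0     # endorsement from anyone else
REVIEW_THRESHOLD = 10.0  # score at which a manual review is queued


@dataclass
class TopicReputation:
    topic: str
    reputable: set[str] = field(default_factory=set)
    # endorsed account -> set of accounts that endorsed it
    endorsements: dict[str, set[str]] = field(default_factory=dict)
    review_queue: list[str] = field(default_factory=list)

    def score(self, account: str) -> float:
        """Sum endorsement weights, counting each endorser once."""
        endorsers = self.endorsements.get(account, set())
        return sum(
            REPUTABLE_WEIGHT if e in self.reputable else DEFAULT_WEIGHT
            for e in endorsers
        )

    def endorse(self, endorser: str, account: str) -> None:
        """Record an endorsement and queue a review once the threshold is met."""
        if endorser == account:
            return  # ignore self-endorsement
        self.endorsements.setdefault(account, set()).add(endorser)
        if (
            account not in self.reputable
            and account not in self.review_queue
            and self.score(account) >= REVIEW_THRESHOLD
        ):
            self.review_queue.append(account)

    def complete_review(self, account: str, approved: bool) -> None:
        """Apply the human reviewers' decision; the code never promotes on its own."""
        if account in self.review_queue:
            self.review_queue.remove(account)
        if approved:
            self.reputable.add(account)
```

Note that the only place an account becomes “reputable” is `complete_review`, which is a human decision. The code can count endorsements, but it can't tell whether the reviewers are catching system gaming or just gatekeeping.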

The problem ultimately comes down to the fact that for this to work, we have to consider people who come to wildly different conclusions from us, or focus on different sets of data or even have different priorities from us as “reputable” – and we just don’t like to do that. Sad as it is, we as a society are not good at looking at something and acknowledging “I don’t agree with the analysis, but it’s logically coherent and consistent, and the author isn’t stupid.” It’s safe to assume that when people start talking about marking accounts on a site as “reputable,” what they’re talking about is flagging accounts as “I like these people (or at least they tend to think like me on the things I have opinions on), so you should listen to them too.” From the article:

That system should explicitly include alternatives to legitimacy-by-proxy, should address matters of diversity and inclusion, and should express a position on both-sides-ism so committees don’t end up forced, like journalists, to boost both the factually correct take and the yahoo one that too many people are talking about. If I find out onea’y’all felt the need to mark Fox News reputable, for example, this was done wrong.
What do we do with the Twitter-shaped hole in the internet?

When discussing the fact that there are multiple positions on most topics, Troy reflexively groups them into “factually correct” or “yahoo.” While I’m not saying that there aren’t crazy takes on any topic, the dismissive grouping into either “correct” or “crazy” makes it incredibly hard to believe that an application with this system in place would ensure that the “reputable” accounts are offering a complete analysis or discussion on any given topic.

Reputability is a proxy for trust and ultimately assignments of reputability are exercises in mandating trust, which is something people just don’t go along with. As Troy mentions when discussing letting all accounts indicate reputability, “…when someone included me in their Follow Friday list, the follow bump it provided to my account more or less tracked the amount of trust that that person had built among their followers.” (bolding mine, italics hers) Trust is built, and can’t be assigned. If people already trust you, then you can use that trust to recommend an account, and people can decide if that warrants looking over the content and giving it a try.

That said, the downside of applications that allow everybody to join and post on every topic without any commitment on the part of the poster is that they turn every problem into a bike shed problem. Now that we have everyone commenting on everything, there is a very real question of how to get the most out of this. The default reaction is generally similar to Troy’s: designate experts as being more “reputable” on a topic and prioritize what they say. The problem comes up when the experts disagree. Remember the Great Barrington Declaration back during Covid? That was written by experts, who were disagreeing with the official policies put in place by experts. As it turned out, some experts were apparently more equal than others, which brings us to the fundamental problem with the designated reputable system – official expert status has more to do with political influence than actual experience, practice, and being right.

Disagreements among people studying a topic aren’t new or even uncommon. They (along with the experiments and studies to test arguments) are how we eventually get to truth. Historically, we’ve observed the process, noted the arguments, and then we all (eventually) gravitated towards the correct answer. Tossing in opinions from literally everyone else (even though we’re not stakeholders) presents either the potential for greatness, or a lot of noise clouding the signal (probably both to be honest).

On the one hand, more people offering unsolicited takes gets you a lot of crap (we’ve all seen it). Pick any topic, and there’s a long list of people who have opinions, most of whom aren’t offering anything of value to the discussion. On the other hand, this is how the wisdom of the crowds starts. Sure, there’s a lot of junk to wade through, but the whole point of modern, user feed-driven sites is that they’re built to do that filtering for users. So long as there’s a good signal that some content is useful, the good stuff will find its way to users.

So is Twitter a proof that the infinite monkey theorem looks good on paper but is computationally infeasible? Maybe. Is Twitter a human-powered all-knowing collective intelligence, capable of working out the solution to any problem while showing all the output iterations on how the algorithm got us there? Maybe. Twitter, like all software, is fundamentally a reflection of the people using it and what they’ve decided they want to get out of it. Maybe we’ll take advantage of this “work in public” to find new people and organizations that can build more trust among the whole population. But that won’t come from manual designation of some set of anointed few. Reputability is all about trust, and trust is an inherently individual thing. Instead of trying to assign it, or even assuming it’s going to self-assign, maybe we should just assume that we’re all unreliable to some extent, and that the best way to get reliable information to the world is for us to have to show the world our work, learn how to explain it, and not tell them they have to trust whoever happens to be liked by the people in charge at the moment.

Even if you disagree with someone’s conclusions, that doesn’t mean their concerns aren’t valid or that they should be ignored. Idiots saying stupid stuff is a potential problem because when it’s a topic you don’t know anything about, stupid stuff and insight can look a lot alike. And that’s not counting the “for the lulz” crowd that will gladly jump in and try to hijack the conversation in the name of nonsense. On the other hand, that contrarian making posts that you think are stupid may actually have a point based on some piece of insight and experience that you haven’t considered. Software development started a movement of putting more perspectives together on teams, to great success (when we do it for real instead of just paying lip service to it). Why can’t public discourse have its own cross-functional team?

Troy’s breakdown of a public posting application into accounts posting into topics (and feeds based on those topics) is an improvement that’s been needed for years. It also naturally makes reputability easier to determine by creating a clearer link between an individual and a topic. This creates a mental association between a person’s comments and a topic, which avoids the issue of automatically assuming someone’s a reliable source for something because they’re reliable on another topic. The only real problem I see comes from trying to manually confer reputability, although Troy is very correct that just going on pure popularity is better for comedy than insight. We need voices that we can agree know what they’re talking about, but we also need to be OK with the fact that there are also going to be public disagreements on just about everything. What’s important is that they’re had, and evaluated, in public.