Investigating YouTube’s Ideological “Rabbit Hole”

Hello, friends,

A few weeks after Donald Trump’s surprise victory in the 2016 U.S. presidential election, a former Google engineer, Guillaume Chaslot, published an analysis claiming that YouTube’s algorithm had pushed pro-Trump videos more than pro-Clinton videos during the election season.

“On the eve of the US Presidential Election, we gathered recommendation data on the two main candidates, and found that more than 80% of recommended videos were favorable to Trump, whether the initial query was ‘Trump’ or ‘Clinton,’ ” Chaslot wrote in a Medium post. “A large proportion of these recommendations were divisive and fake news.”

That was the beginning (as far as I can tell) of years’ worth of analyses showing that YouTube’s recommendation system pushed conservative content. Chaslot built a database, AlgoTransparency, to identify fake news (such as claims that the earth is flat) promoted by YouTube’s algorithm. In March 2018, New York Times opinion writer Zeynep Tufekci declared that the video site was promoting right-wing content in an article titled “YouTube, the Great Radicalizer.” In June 2019, New York Times reporter Kevin Roose chronicled how YouTube videos transformed a young man, Caleb Cain, in a front-page investigation titled “The Making of a YouTube Radical.”

Researchers weighed in as well. Becca Lewis, a Ph.D. candidate at Stanford University and former researcher at Data & Society who studies YouTube, described what she called the “Alternative Influence Network” of pundits, scholars, and celebrities who use YouTube to promote right-wing positions ranging from libertarianism to White nationalism. Lewis argued that the algorithm was part, but not all, of how YouTube helped this content reach new audiences.

In 2019, YouTube announced it would “begin reducing recommendations of borderline content” such as “phony miracle cure for a serious illness, claiming the earth is flat, or making blatantly false claims about historic events like 9/11.” However, the company said those videos would still be available on the platform.

In 2020, researchers at NYU set out to find out if YouTube’s algorithm was still pushing conservative content. Two years later, we have their results. The study’s authors—Megan Brown, James Bisbee, Angela Lai, Richard Bonneau, Jonathan Nagler, and Joshua A. Tucker—found that for most users the algorithm did not lead to extremist rabbit holes, but that it did push users into “increasingly narrow ideological ranges of content in what we might call evidence of a (very) mild ideological echo chamber.”

When I asked YouTube for comment on the NYU study, spokesperson Elena Hernandez sent me a link to a page describing how YouTube’s recommendation system works and did not comment on the study.

I asked researcher Megan Brown to explain the findings and what they mean for our understanding of YouTube’s content recommendation systems. Brown is a senior research engineer and research scientist at NYU’s Center for Social Media and Politics. She studies cross-platform media manipulation, political bias in algorithmic systems, and the effect of platform governance and moderation policies on the spread of political content.

Our conversation, edited for brevity and clarity, is below.

Angwin: For years there’s been this idea that the internet has ideological rabbit holes. If I understand your research correctly, it shows that rabbit holes are not as prevalent online as we thought. Can you explain your findings?

Brown: In our study we decided that a rabbit hole constituted content beyond a particular threshold, which was based on the 5th and 95th percentiles in the ideological distribution of videos people were recommended: the 5 percent most-left-leaning videos that people saw and the 5 percent most-right-leaning videos that people saw.
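To make the percentile idea concrete, here is a minimal sketch of how such cutoffs could be computed. It is not the study's code: the integer ideology scale (negative for left, positive for right), the sample scores, and the function names are all assumptions made for illustration, and it covers only the threshold, not the narrowing over time that Brown also describes.

```python
from statistics import quantiles


def extreme_thresholds(scores):
    """Return the (5th, 95th) percentile cut points of a list of scores."""
    # n=20 splits the data into 5 percent steps, giving 19 cut points.
    cuts = quantiles(scores, n=20, method="inclusive")
    return cuts[0], cuts[-1]


def is_extreme(score, low, high):
    """True if a score falls outside the middle 90 percent of recommendations."""
    return score < low or score > high


# Hypothetical ideology scores for recommended videos (not real data).
scores = list(range(-10, 11))
low, high = extreme_thresholds(scores)
flagged = [s for s in scores if is_extreme(s, low, high)]
print(low, high, flagged)
```

With these made-up scores, only the single most-left and single most-right videos land outside the cutoffs.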

In terms of the rabbit hole work, as you said, in the popular media there is a lot of speculation about recommendation algorithms, especially YouTube’s, given that it’s the largest recommendation algorithm. We found that the recommendation algorithm is not driving most users into rabbit holes. We did find that a few users in our sample, about 3 percent of the users in our study, did go down what we classified as a rabbit hole, meaning the ideology of their recommendations became more extreme and more narrow over the course of the task that we assigned them to do during the survey. But this idea that everyone is falling into a rabbit hole, and that if you’re on YouTube you will at some point be radicalized into the far right, we didn’t find evidence for that.

There hasn’t been a ton of evidence in this area, in part because it’s really difficult to get recommendation data. YouTube doesn’t make recommendation data publicly available to anyone, let alone researchers. A lot of the work that’s been done on YouTube has been using bots to audit the YouTube algorithm. One of the challenges with this method is that the recommendations that you get from a bot don’t have a user history, they don’t have account history, they’re not actually engaging with the video. Therefore, you can say something about the default settings of the YouTube recommendation algorithm, but that’s not actually how people are encountering recommendations in their day-to-day life.

Angwin: You overcame the challenge of analyzing YouTube recommendations by using a panel. How did that work?

Brown: We recruited a panel of around 530 people on Facebook using ads. Using a browser extension, we collected the recommendations that users saw during the survey. This is a convenience sample, so we’re not doing population weighting. Convenience samples online tend to lean toward younger, more-educated, more-tech-savvy Democrats. That’s definitely where our sample leans, but about a third of our respondents were Republicans, so we have a pretty sizable amount to analyze the difference between conservatives and liberals.

One of the challenges with doing this type of work is that we do have to recruit from people online and get them to install a plugin, and that does bias our sample in particular ways. However, this describes what was happening to a sample of real users in 2020 when we did this study. Another issue is that we think YouTube is updating their algorithm pretty frequently. Our study looks at what was happening in 2020. If we look at the same data now, we might find the same thing, we might find something different. Our study provides a really good framework for doing this type of work, but there are more studies to be done to figure out how specific our findings were to the sample that we recruited and the time period that we were running it in.

Angwin: You also looked at echo chambers on YouTube. What did you discover?

Brown: We defined echo chambers as instances when people were getting recommendations that were ideologically biased but centered around their own ideology. So liberals getting recommended more liberal content and conservatives getting more conservative content, with not a lot of overlap between liberals and conservatives. We found that there’s a very mild but statistically significant difference between what conservatives are recommended and what liberals are recommended. However, when we look at the magnitude of the difference, it’s really small. We looked at graphs where we’re plotting these distributions, and there’s quite a bit of overlap in the ideology of the recommendations. In doing this, we did find that conservatives got slightly more conservative content and liberals got slightly more liberal content.
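One way to picture a difference that is real but small is to compare two heavily overlapping sets of scores. The sketch below is an illustration only: the numbers are invented, the ideology scale is hypothetical, and the two measures (a standardized mean difference and a simple range overlap) are stand-ins chosen here, not the statistics used in the NYU study.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(a, b):
    """Difference in means divided by the pooled standard deviation (Cohen's d)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var)


def range_overlap(a, b):
    """Share of the combined score range that both groups cover."""
    shared = min(max(a), max(b)) - max(min(a), min(b))
    total = max(max(a), max(b)) - min(min(a), min(b))
    return max(0.0, shared) / total


# Hypothetical ideology scores of recommendations (negative = left, positive = right).
liberal_recs = [-6, -4, -2, 0, 2, 4, 6]
conservative_recs = [-5, -3, -1, 1, 3, 5, 7]

d = standardized_mean_difference(liberal_recs, conservative_recs)
overlap = range_overlap(liberal_recs, conservative_recs)
print(round(d, 2), round(overlap, 2))
```

In this toy example the conservative group sits slightly to the right on average, yet the two groups share most of the same range, which is the general shape Brown describes.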

Angwin: Did YouTube’s algorithm lean ideologically more one way or another?

Brown: We found that even though there was slight evidence for echo chambers and there wasn’t a lot of evidence for rabbit holes, by and large people’s recommendations over the course of the survey tasks they completed started leaning slightly conservative. We placed it around The Wall Street Journal level of conservative in terms of the left-to-right spectrum. That was a finding we were pretty surprised by: that people were, on average, regardless of their own ideology, more likely to go toward conservative content as the recommendation tasks continued. I would definitely consider this ideological bias, but I also wouldn’t consider The Wall Street Journal to be particularly extreme.

Angwin: Do you believe the engagement-based recommendation algorithm is itself a flawed idea?

Brown: In a lot of contexts it is very benign. If you really like Beyoncé videos, it’s pretty benign to give you more Beyoncé videos. If you’re in an echo chamber of jazz music, I’m not that concerned. I think in the democracy, elections, news, public opinion arena, I don’t know that we have conclusive evidence suggesting that there should be a tweak in engagement-based recommendations for this type of content specifically, or there should be a tweak in what content gets down-weighted or up-weighted in recommendations entirely.

I think one thing that gets lost in many conversations related to engagement-based algorithms being good or bad is that we don’t often talk about “compared to what.” Every content recommendation system follows some type of algorithm, whether it’s recommending stuff based on engagement (like YouTube’s recommendation system or Facebook’s news feed), based on relevance to a set of keywords (like Google search), or ordered based on the time that each piece of content was created (e.g., reverse-chronological timelines). Each of these algorithms has a different set of tradeoffs: Reverse-chronological algorithms are more transparent to users and creators but easier to manipulate; engagement-based algorithms may amplify harmful content that people are more likely to engage with.
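The “compared to what” point can be seen in a toy example: the same three items come out in a different order under each of the three approaches. Everything in this sketch is hypothetical (the `Item` fields, the sample titles, and the scoring rules), and real systems at YouTube, Facebook, or Google are far more complex than a single sort key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    created: int      # hypothetical timestamp; larger means newer
    engagement: int   # hypothetical count of likes, comments, and views


def by_engagement(items):
    """Most-engaged-with items first."""
    return sorted(items, key=lambda item: item.engagement, reverse=True)


def by_recency(items):
    """Reverse-chronological: newest items first."""
    return sorted(items, key=lambda item: item.created, reverse=True)


def by_relevance(items, keywords):
    """Items whose titles match the most keywords first."""
    def matches(item):
        return len(set(item.title.split()) & set(keywords))
    return sorted(items, key=matches, reverse=True)


items = [
    Item("jazz piano basics", created=1, engagement=50),
    Item("election results explained", created=2, engagement=900),
    Item("jazz election night concert", created=3, engagement=10),
]

for ranking in (by_engagement(items), by_recency(items),
                by_relevance(items, ["jazz", "piano"])):
    print([item.title for item in ranking])
```

Each ordering puts a different item on top, which is why judging one approach in isolation says little about whether the alternatives would do better.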

Before we do away with engagement-based recommendation systems because they amplify harmful content, we need to know if the alternatives also amplify harmful content, or maybe they amplify different harmful content. Obviously we should be using the recommendation system that leads to the least amount of harm, but it isn’t clear from the research yet what that system is.

Angwin: There was a 2019 article in The New York Times about rabbit holes, which described how one man was radicalized through YouTube. Should we understand your study to suggest that this was an outlier?

Brown: As I mentioned earlier, we did find that 3 percent of people fell into what we classified as rabbit holes. That 3 percent of people are outliers, but 3 percent of all the people on YouTube, which is used by 81 percent of Americans, is a really big number of people. In terms of whether it’s an outlier, I would say yes, but it is a very concerning outlier.
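For a rough sense of scale, the two percentages can be multiplied out. This is back-of-the-envelope arithmetic, not a finding: the population figure is a round placeholder assumed here (it is not from the interview), and a rate from a convenience sample does not carry over cleanly to the whole country.

```python
def people_affected(population, usage_pct, rabbit_hole_pct):
    """Apply two whole-number percentages to a population, using integer math."""
    return population * usage_pct * rabbit_hole_pct // 10_000


# ASSUMPTION: a round placeholder for the U.S. population, for illustration only.
ASSUMED_POPULATION = 330_000_000

estimate = people_affected(ASSUMED_POPULATION, usage_pct=81, rabbit_hole_pct=3)
print(f"{estimate:,}")
```

Under that assumption the result is on the order of 8 million people, which is the sense in which a 3 percent outlier can still be “a really big number.”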

Other studies have found similar results: that overall consumption of harmful content, hate speech, and alt-right content is concentrated around a really small number of really dedicated people. There was one study on Twitter that showed that exposure to fake news was concentrated among a really small number of people who engaged almost exclusively with fake news. This is a subset of research that’s becoming increasingly important.

As always, thanks for reading.

Best,
Julia Angwin
The Markup

(Additional Hello World research by Eve Zelickson.)