Hello, friends,
The Markup is committed to both protecting the privacy of our readers and to publishing hard-hitting investigative journalism. But it would be naive to ignore the fact that privacy and journalism can sometimes be in conflict.
When I first started investigating the commercial surveillance industry (aka the online advertising industry that relies on stalking people to monitor their behavior), I remember one of my newsroom colleagues taking me aside and saying, “You know that stronger privacy rules don’t help journalists, right?”
Indeed, many of today’s privacy debates trace their lineage to a famous 1890 Harvard Law Review article decrying how “[i]nstantaneous photographs and newspaper enterprise have invaded the sacred precincts of private and domestic life.”
In that law review essay, “The Right to Privacy,” Boston lawyers Samuel D. Warren and future Supreme Court justice Louis D. Brandeis argued that because “the press is overstepping in every direction the obvious bounds of propriety and of decency,” courts should acknowledge a “general right of the individual to be let alone.”
Their proposal to criminalize invasions of privacy never went anywhere, but the tension Warren and Brandeis called out is real. A journalist’s job is to bear witness on behalf of a public that would not otherwise be able to see what the journalist sees. And that can mean that journalists go to great lengths to see things that others are trying to keep private.
Journalists routinely show up at people’s homes or call people’s relatives when they are trying to get someone to talk to them. They publish public records that contain information about individuals or their correspondence. They take photos of people at the worst moments of their lives.
Of course, ethically, journalists should only do these things when they serve the public interest. There are photos that the public needs to see to understand the wars they are funding or the suffering that government policies are promoting. And there are photos that nobody really needs to see.
At The Markup, we don’t do a lot of the traditionally invasive types of journalism. We take very few candid photos. We chase very few politicians down hallways. We rarely stake out people at home.
But we do collect a lot of data that can be personally revealing. We scrape publicly available internet sites. We peek inside website plumbing to expose privacy violations. We receive and publish leaked documents. We built a tool that collects data from people’s Facebook accounts. We have just launched a study with Mozilla that will, with users’ permission, collect info from sites that users visit on the web.
And in every case, we work hard to protect the privacy of people whose personal data is not relevant to our investigation while exposing the data that the public needs to see.
For our Citizen Browser project, we built a tool that collects data from participants’ Facebook news feeds in order to reveal how Facebook’s algorithms decide to amplify content. But because this data is so sensitive, we designed a method of redacting all the personal information—ranging from users’ names to photos to chats and comments from their friends—before we can even see the data. We delete the raw data soon after it is anonymized.
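The redaction step described above can be illustrated with a small sketch. This is not The Markup's actual pipeline or schema — the field names and allowlist here are hypothetical — but it shows the general "allowlist" pattern: keep only the fields an analysis needs and drop everything personal before anyone looks at the data.

```python
# Allowlist-based redaction sketch (hypothetical field names, not the
# real Citizen Browser schema): keep only non-personal fields.

ALLOWED_FIELDS = {"post_type", "is_sponsored", "source_page", "timestamp"}

def redact(feed_item: dict) -> dict:
    """Drop names, photos, chats, comments; keep only allowlisted fields."""
    return {k: v for k, v in feed_item.items() if k in ALLOWED_FIELDS}

raw = {
    "post_type": "link",
    "is_sponsored": True,
    "source_page": "Example News",
    "timestamp": 1641988800,
    "user_name": "Jane Doe",           # personal: removed
    "friend_comments": ["nice post"],  # personal: removed
}
print(redact(raw))
```

An allowlist is safer than a blocklist for this purpose: a new personal field added to the data later is dropped by default rather than leaked by default.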
When we build tools for the public, we also try to protect the privacy of users of those tools. We built our Blacklight forensic privacy scanner, for instance, so that all the user has to do is submit a URL to our tool, and then we run privacy tests on the website from a browser running in the cloud. That means that the website never knows which user was seeking to test its privacy features.
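One kind of check a scanner like Blacklight runs is listing the third-party domains a page asks the browser to contact. The real tool drives a full headless browser in the cloud; the stdlib-only sketch below is a deliberate simplification that just parses static HTML for `src`/`href` URLs pointing off the first-party domain.

```python
# Toy third-party-resource finder (a simplification of what a forensic
# privacy scanner does with a real headless browser).

from html.parser import HTMLParser
from urllib.parse import urlparse

class ThirdPartyFinder(HTMLParser):
    def __init__(self, first_party: str):
        super().__init__()
        self.first_party = first_party
        self.third_parties = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value and value.startswith("http"):
                host = urlparse(value).netloc
                # Treat exact matches and subdomains as first-party.
                if host != self.first_party and not host.endswith("." + self.first_party):
                    self.third_parties.add(host)

html = ('<img src="https://tracker.example.net/p.gif">'
        '<script src="https://example.com/a.js"></script>')
finder = ThirdPartyFinder("example.com")
finder.feed(html)
print(sorted(finder.third_parties))  # ['tracker.example.net']
```

Running the scan from a cloud browser, as the newsletter notes, adds a further layer: the scanned site sees the scanner's IP address, not the curious user's.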
For our “Facebook Pixel Hunt” project with Mozilla, which we announced last week, we made sure to build privacy in from the beginning. The Pixel Hunt is an effort to collect data about how Facebook tracks users across the internet using a pixel, an invisible tag embedded on websites. To see this pixel “in the wild,” we are recruiting Firefox users to install a tool that will monitor their web browsing and collect data about the behavior of the pixel.
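Seeing the pixel "in the wild" amounts to recognizing its beacon requests among a user's web traffic. The Facebook pixel reports to the well-known `facebook.com/tr` endpoint; the sketch below assumes a tool that can observe outgoing request URLs (as a browser extension can) and simply classifies each one. Everything beyond the endpoint itself is illustrative.

```python
# Sketch of pixel-traffic detection, assuming access to outgoing request
# URLs. The facebook.com/tr path is the pixel's beacon endpoint; the
# surrounding code is a hypothetical illustration.

from urllib.parse import urlparse

def is_facebook_pixel(url: str) -> bool:
    parsed = urlparse(url)
    return parsed.netloc in ("facebook.com", "www.facebook.com") and parsed.path == "/tr"

requests_seen = [
    "https://www.facebook.com/tr?id=123&ev=PageView",
    "https://example.org/style.css",
]
hits = [u for u in requests_seen if is_facebook_pixel(u)]
print(len(hits))  # 1
```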
Of course, web browsing data from individuals is highly sensitive. So it will be stored on Mozilla’s secure servers, and we will only be able to access it in an anonymized and aggregated format to ensure that personal identifiers can’t be extracted from it.
This framework is designed to let us view just the data that we need about the behavior of the Facebook pixel and its prevalence across the web while avoiding looking at how individuals themselves are browsing the web. This is the principle, which we share with Mozilla, of “data minimization”—collecting only the bare information needed.
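The aggregation half of data minimization can be sketched in a few lines: collapse per-user sightings of the pixel into site-level counts, so that no individual's browsing survives into the dataset researchers see. The record fields below are hypothetical, not the study's actual schema.

```python
# Data-minimization sketch (hypothetical fields): aggregate pixel
# sightings per site, deliberately discarding user identifiers.

from collections import Counter

sightings = [
    {"user_id": "u1", "site": "news.example.com"},
    {"user_id": "u2", "site": "news.example.com"},
    {"user_id": "u1", "site": "shop.example.org"},
]

per_site = Counter(s["site"] for s in sightings)  # user_id never leaves this step
print(dict(per_site))  # {'news.example.com': 2, 'shop.example.org': 1}
```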
Unfortunately, there are no easy answers to exactly how much data is too much and how much is just right. Each time we do an investigation, we weigh the options and come up with a custom solution for that project.
It is tricky work, and we likely won’t always get it right, but we are committed to doing large-scale automated data collection to expose otherwise unknowable systems and to protecting the private data of those who help us do so.
As always, thanks for reading.
Best,
Julia Angwin
Editor-in-Chief
The Markup
Additional Hello World research by Eve Zelickson.