Hello, friends,
The United States is one of the few countries that does not have a federal baseline privacy law laying out minimum standards for data use. Instead, it has tailored laws that are supposed to protect data in different sectors—including health, children’s, and student data.
But despite the existence of a law—the Family Educational Rights and Privacy Act—that is specifically designed to protect the privacy of student educational records, loopholes in the law still allow data to be exploited. The Markup reporter Todd Feathers has uncovered a booming business in monetizing student data gathered by classroom software.
In two articles published this week as part of our Machine Learning series, Todd identified a private equity firm, Vista Equity Partners, that has been buying up educational software companies that have collectively amassed a trove of data about children all the way from their first school days through college.
Vista Equity Partners, which declined to comment for Todd’s story, has acquired controlling ownership stakes in EAB, which provides college counseling and recruitment products to thousands of schools, and PowerSchool, which provides software for K-12 schools and says it holds data on more than 45 million children.
Some of this data is used to create risk-assessment scores that claim to predict students’ future success. Todd filed public records requests for schools across the nation, and using those documents, he was able to discover that PowerSchool’s algorithm, in at least one district, considered a student who was eligible for free or reduced lunch to be at a higher risk of dropping out.
Experts told us that using a proxy for wealth as a predictor for success is unfair because students can’t change that status and could be steered into less challenging opportunities as a result.
“I think that having [free and reduced lunch status] as a predictor in the model is indefensible in 2021,” said Ryan Baker, the director of the University of Pennsylvania’s Center for Learning Analytics. PowerSchool defended the use of the factor as a way to help educators provide additional services to students who are at risk.
Todd also found public records showing how student data is used by colleges to target potential applicants through PowerSchool’s Naviance software using controversial criteria such as the race of the applicant. For example, Todd uncovered a 2015 contract between Naviance and the University of Kansas revealing that the school paid for a year-long advertising campaign targeting only White students in three states.
The University of Kansas did not respond to requests for comment. PowerSchool’s chief privacy officer Darron Flagg said Naviance has since stopped colleges from using targeting “criteria that excludes under-represented groups.” He also said that PowerSchool complies with the student privacy law and “does not sell student or school data.”
But, as we have written at The Markup many times, not selling data does not mean not profiting from that data. To understand the perils of the booming educational data market, I spoke this week with Roxana Marachi, a professor of education at San José State University, who researches school violence prevention, high-stakes testing, privatization, and the technologization of teaching and learning. Marachi served as education chair of the CA/HI State NAACP from 2019 to 2021 and has been active in local, state, and national efforts to strengthen and protect public education. Her views do not necessarily reflect the policy or position of her employer.
Her written responses to my questions are below, edited for brevity.
Angwin: You have written that ed tech companies are engaged in a “structural hijacking of education.” What do you mean by this?
Marachi: There has been a slow and steady capture of our educational systems by ed tech firms over the past two decades. The companies have attempted to replace many different practices that we have in education. So, initially, it might have been with curriculum, say a reading or math program, but it has grown over the years into wider attempts to extract social, emotional, behavioral, health, and assessment data from students.
What I find troubling is that there hasn’t been more scrutiny of many of the ed tech companies and their data practices. What we have right now can be called “pinky promise” privacy policies that are not going to protect us. We’re getting into dangerous areas where many of the tech firms are being afforded increased access to the merging of different kinds of data and are actively engaged in the use of “predictive analytics” to try to gauge children’s futures.
Angwin: Can you talk more about the harmful consequences this type of data exploitation could have?
Marachi: Yes, researchers at the Data Justice Lab at Cardiff University have documented numerous data harms with the emergence of big data systems and related analytics—some of these include targeting based on vulnerability (algorithmic profiling), misuse of personal information, discrimination, data breaches, political manipulation and social harms, and data and system errors.
As an example in education, several data platforms market their products as providing “early warning systems” to support students in need, yet these same systems can also set students up for hyper-surveillance and racial profiling.
One of the catalysts of my inquiry into data harms happened a few years ago when I was using my university’s learning management system. When reviewing my roster, I hovered the cursor over the name of one of my doctoral students and saw that the platform had marked her with one out of three stars, in effect labeling her as being in the “lowest third” of students in the course in terms of engagement. This was both puzzling and disturbing, as it was such a false depiction—she was consistently highly engaged and active both in class and in correspondence. But the platform’s use of page views as a metric for engagement made her appear otherwise.
Many tech platforms don’t allow instructors or students to delete such labels or to untether at all from algorithms set to compare students with these rank-based metrics. We need to consider what consequences will result when digital labels follow students throughout their educational paths, what longitudinal data capture will mean for the next generation, and how best to systemically prevent emerging, invisible data harms.
One of the key principles of data privacy is the “right to be forgotten”—for data to be able to be deleted. Among the most troubling of emerging technologies I’ve seen in education are blockchain digital ID systems that do not allow for data on an individual’s digital ledger to ever be deleted.
Angwin: There is a law that is supposed to protect student privacy, the Family Educational Rights and Privacy Act (FERPA). Is it providing any protection?
Marachi: FERPA is intended to protect student data, but unfortunately it’s toothless. While schools that refuse to address FERPA violations may have federal funding withheld by the Department of Education, in practice this has never happened.
One of the ways that companies can bypass FERPA is to have educational institutions designate them as educational employees or partners. That way they have full access to the data in the name of supporting student success.
The other problem is that with tech platforms as the current backbone of the education system, in order for students to participate in formal education, they are in effect required to relinquish many aspects of their privacy rights. The current situation appears designed to allow ed tech programs to be in “technical compliance” with FERPA by effectively bypassing its intended protections and allowing vast access to student data.
Angwin: What do you think should be done to mitigate existing risks?
Marachi: There needs to be greater awareness that these data vulnerabilities exist, and we should work collectively to prevent data harms. What might this look like? Algorithmic audits and stronger legislative protections. Beyond these strategies, we also need greater scrutiny of the programs that come knocking on education’s door. One of the challenges is that many of these companies have excellent marketing teams that pitch their products with promises to close achievement gaps, support students’ mental health, improve school climate, strengthen social and emotional learning, support workforce readiness, and more. They’ll use the language of equity, access, and student success—issues that, as educational leaders, we care about.
Many of these pitches in the end turn out to be what I call equity doublespeak, or the Theranos-ing of education, meaning there’s a lot of hype without the corresponding delivery on promises. The Hechinger Report has documented numerous examples of high-profile ed tech programs making dubious claims about the efficacy of their products in the K-12 system. We need to engage in ongoing and independent audits of the efficacy, data privacy, and analytic practices of these programs to better serve the students in our care.
Angwin: You’ve argued that, at the very least, companies implementing new technologies should follow IRB guidelines for working with human subjects. Could you expand on that?
Marachi: Yes, Institutional Review Boards (IRBs) review research to ensure ethical protections of human subjects. Academic researchers are required to provide participants with full informed consent about the risks and benefits of research they’d be involved in and to offer the opportunity to opt out at any time without negative consequences. Corporate researchers, it appears, are allowed free rein to conduct behavioral research without any formal disclosure to students or guardians of the potential risks or harms of their interventions, what data they may be collecting, or how they would be using students’ data. We know of numerous risks and harms documented with the use of online remote proctoring systems, virtual reality, facial recognition, and other emerging technologies, but rarely if ever do we see disclosure of these risks in the implementation of these systems.
If corporate researchers in ed tech firms were to be contractually required by partnering public institutions to adhere to basic ethical protections of the human participants involved in their research, it would be a step in the right direction toward data justice.
P.S. Last week I mistakenly said that Gawker was forced to shut down after a defamation case, but it was actually an invasion of privacy case that brought them down (although they were also sued for defamation). Apologies!
As always, thanks for reading.
Best,
Julia Angwin
Editor-in-Chief
The Markup
Additional Hello World research by Eve Zelickson.