Data Is Plural is a weekly newsletter of useful/curious datasets. This edition, dated Feb. 22, 2023, has been republished with permission of the author.
Facilities handling hazardous chemicals. The U.S. Environmental Protection Agency’s Risk Management Program rule requires facilities that handle “extremely hazardous substances” to tell the government, at least every five years, about those substances, their safety plans, their recent accident history, and more. Through a FOIA request to the EPA, the Data Liberation Project (full disclosure: I run this project) obtained a copy of the agency’s database of these filings (minus some parts the government deems nondisclosable), containing submissions by 21,000-plus facilities from early 1999 to February 2022. You can now access that data, in various formats, along with documentation guiding you through it.
Animal Welfare Act inspections. The USDA’s Animal and Plant Health Inspection Service checks whether animal dealers, exhibitors, research facilities, and transporters are complying with the care standards set by the Animal Welfare Act. The agency provides public access to the inspection reports but no bulk data on them. So, in a collaboration between Big Local News and the Data Liberation Project (same disclosure as above), Ben Welsh and I wrote code to fetch the more than 80,000 (and counting) inspections going back to 2014, parse their PDFs, and make the data more accessible. The information includes each inspection’s date, type, licensee, violation counts, species inspected, and more.
Daily European gas imports. Researchers at Bruegel are tracking daily and weekly natural gas imports to Europe, using data from the transparency portal of the European Network of Transmission System Operators for Gas. Alongside the imports, which they’re aggregating by source (e.g., Russia, Norway, Algeria) and by route (e.g., Nord Stream, TurkStream), the researchers are also tracking gas storage levels, using data from Gas Infrastructure Europe (DIP 2022.01.26). Previously: Eurostat’s data on annual European energy imports and exports (DIP 2022.03.16).
Unclaimed estates. The U.K. government’s Bona Vacantia division publishes a dataset of unclaimed estates—inheritances that nobody has claimed yet. The entries indicate the deceased person’s name, aliases, date/place of birth and death, marital status, and more. Related: California provides a dataset of unclaimed property, such as “lost or forgotten” bank accounts, insurance benefits, and stock holdings.
Data journalists, surveyed. The European Journalism Centre’s DataJournalism.com has published a dataset of 1,800-plus anonymized responses to its second annual State of Data Journalism Survey, including 50-plus entries each from the U.S., U.K., Italy, Germany, Spain, India, and Nigeria, plus double-digit counts from dozens of other countries. The questions touch on demographics, employment, training, skills, the COVID-19 pandemic, and more. [h/t Simona Bisiani]
Notice: Unlike most of our content, this edition of Data Is Plural by Jeremy Singer-Vine is not available for republication under a Creative Commons license.