Sherry Glied, New York University
Data and the Quest for Available Access
December 15, 2014 09:00 AM
Data is the bedrock of our work as scholars and teachers of public policy and management, as we study and educate with the goal of improving the quality of policy and management. Our approaches vary, but whatever our discipline and methodology, we learn by observation.
Much of our research has always involved data we collect ourselves, through surveys, case studies, and interviews. Increasingly, however, improvements in computing power, which enable the rapid analysis of much larger data sets and permit linkages across them, together with refinements in the statistical methods that can be applied to data, have allowed us to extract much more knowledge from administrative and routinely collected data. The secrets of large data can be unlocked (if not fully, at least more than before) using quasi-experimental methodologies like regression discontinuity designs, as in Brian Jacob and Lars Lefgren’s analysis of the effects of remedial education on outcomes using routinely collected data from the Chicago public schools. Natural experiments can be exploited for research purposes by making use of appropriate administrative data, as Kate Baicker, Amy Finkelstein, and their colleagues have done in studying Oregon’s expansion of Medicaid by lottery. Effective use of administrative data can also enable researchers to conduct randomized controlled trials at very low cost, as in Roland Fryer’s study of teacher incentives in New York City and Adam Steventon, Martin Bardsley, John Billings, and colleagues’ study of the effects of telehealth on the use of health services in the British National Health Service. While most analyses using these methods have examined questions of policy, they can also be applied to management issues, as Jeffrey Smith and James West show in their analysis of how changes in military retirement benefits affect decisions to remain in the military. Improved computing power that facilitates techniques such as Bayesian latent variable modeling can also help unlock the secrets of administrative data, as Anthony Bertelli and Christian Grose show in their recent studies of Cabinet secretary ideology.
The increasing opportunities to learn from data about questions that are critical to policymakers make data access ever more valuable. The Obama Administration has embraced the idea of open government and taken important steps to make survey data more accessible, but significant barriers to data access remain. In particular, most efforts to release data have focused on simplifying access to existing surveys. The data that have become most valuable, however, are administrative records from government programs, including data on the internal workings of the government.
Policymakers are understandably reluctant to make administrative data available. Government officials’ first priority is to administer programs effectively and efficiently for the benefit of their participants. While learning from programs can improve effectiveness in the long term, in the short run, making data available, along with readable dictionaries and responses to researcher queries, can be a costly and time-consuming distraction. Program participants and government employees alike have legitimate expectations of confidentiality and, for policymakers, the short-run costs associated with possible breaches of privacy often outweigh the long-run benefits associated with possible new policy and management insights. Mechanisms for limiting such breaches do exist: de-identifying data, establishing restricted data enclaves, creating synthetic data, and using agency personnel to conduct analyses. All of these mechanisms, however, impose new costs in time and money.
Concerns about cost, distraction, and the effort of protecting confidentiality are real and legitimate. Yet we cannot let these considerations significantly impede data access; to do so would mean forgoing the possibility of making great advances in the programs we run and the ways that we run them. Administrative data stewards in government can learn from, and collaborate with, survey and statistical branches of government that have successfully made secure data available for research. For example, the Agency for Healthcare Research and Quality has long operated a researcher data center that permits scholars to access otherwise confidential data elements in its surveys. The Internal Revenue Service, whose data are among the most closely guarded of all federal administrative data, has established procedures to allow access to many highly restricted data sets through secure research data centers; the IRS has also established procedures that enable researchers to collaborate with IRS employees in analyses of the most confidential data.
Making data more readily available, so that researchers can better evaluate and assess programs, will require governments to spend time and money. This endeavor may garner a sympathetic ear among policymakers themselves but is very unlikely to command broad and favorable public interest or attention. That makes it especially important that we come together as APPAM members to support this effort. An excellent first step is to consider APPAM’s resolution for data accessibility and transparency.
Sherry Glied is Dean and Professor of Public Service at New York University's Robert F. Wagner Graduate School of Public Service. Her principal areas of research are in health policy reform and mental health care policy.