Presentation: Data Science and Our Environment
Speaker: Francesca Dominici, Ph.D., Professor of Biostatistics, Co-Director of the Data Science Initiative, Harvard T.H. Chan School of Public Health
Abstract: What if I told you I had evidence of a serious threat to American national security: a terrorist attack in which a jumbo jet will be hijacked and crashed every 12 days? Thousands will continue to die unless we act now. This is the question before us today, but the threat doesn’t come from terrorists. The threat comes from climate change and air pollution.
We have developed an artificial neural network model that uses on-the-ground air-monitoring data and satellite-based measurements to estimate daily pollution levels across the continental U.S., breaking the country up into 1-square-kilometer zones. We have paired that information with health data from Medicare claims records covering the last 12 years and 97% of the population aged 65 or older. We have developed statistical methods and computationally efficient algorithms for the analysis of over 460 million health records.
Our research shows that short- and long-term exposure to air pollution is killing thousands of senior citizens each year. This data science platform is telling us that federal limits on the nation’s most widespread air pollutants are not stringent enough.
This type of data marks a new era for the role of data science in public health, and it brings new methodological challenges. For example, with enormous amounts of data, the threat of unmeasured confounding bias is amplified, and causality is even harder to assess with observational studies. These and other challenges will be discussed.
Press coverage links
Di Q, Wang Y, Zanobetti A, Wang Y, Koutrakis P, Dominici F, Schwartz J. (2017). Air Pollution and Mortality in the Medicare Population. New England Journal of Medicine, 376:2513-2522, June 29, 2017, DOI: 10.1056/NEJMoa1702747