A first-generation college student from the Colombian coffee belt, Mauricio Sadinle deviated from the family business to pursue higher education. With help from an unlikely pen pal, he used statistics to quantify the toll of Colombia’s war with rebels. Now, he uses statistics to improve the quality of data and to unlock data's full potential.
What do you remember about growing up around coffee farms?
Running on top of the coffee beans. We lived in an area called “the coffee growing axis,” or Eje Cafetero in Spanish. In coffee production, after you pick the beans, you sun dry them by spreading them out over large areas. The parchment protects the beans, so my parents would let me run over them.
How did you see your future?
I never thought I would get so far in higher education. My father didn’t finish elementary school and my mother didn’t finish high school. My father grew up on a coffee farm and later had a variety of blue-collar jobs. He worked hard and, over time, acquired his own farm. The natural path for me was to stay by their side. But while I was taking the national test after high school, the person watching over the room asked about my plans. He said, “Why don’t you apply for this university?” It was the National University of Colombia in Bogotá, and it was the hardest university in the country to get into. Studying there was not something I had ever envisioned, but I applied and I got in.
How did you discover your passion for statistics?
I was originally studying economics. I took some math and statistics classes, and realized that I wanted to do something more quantitative. In statistics, you use data and math to solve real-world problems. I really liked that.
How did you start to apply your studies?
My first job was with a group called the Conflict Analysis Research Center. It specializes in the study of armed violence and conflict analysis. We focused on the Colombian conflict, which began in the 1960s, and its impact on populations. It was a messy fight between the Colombian government, left-wing guerrillas such as the Revolutionary Armed Forces of Colombia, or FARC, and ultra-right-wing paramilitary groups. Because of the conflict, many families were forced to leave their homes.
What were you trying to find out?
We wanted to know how many people were displaced over time, where they were displaced from, and where they were resettling. There wasn’t a consolidated data source that would give us the correct answers. A government agency and the Catholic Church helped displaced families and, in the process, kept separate records on them. Unfortunately, there was no unique identifier, such as a social security number, for matching a person across these data sources. This meant we had to figure out how to combine the records and then produce estimates of the number of displaced people.
How did this experience affect your career?
I had to read a lot of literature to understand the complexities of these problems. As a result, I stumbled upon research areas that would eventually become my focus as a graduate student, a postdoc and now as faculty. To combine the data sources, I had to learn about record-linkage techniques. To estimate the number of displaced people, I drew upon work in population-size estimation.
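As a rough illustration of population-size estimation from two linked lists, the sketch below uses the classic Lincoln-Petersen two-list estimator. This is a textbook technique chosen for the example, not necessarily what was used in the Colombian study. It assumes the two lists are independent, the population is closed and the linkage is error-free; the function name and the counts are hypothetical.

```python
def lincoln_petersen(n_a: int, n_b: int, n_both: int) -> float:
    """Estimate total population size from two overlapping lists.

    n_a    -- number of people on list A
    n_b    -- number of people on list B
    n_both -- number of people found on both lists (via record linkage)

    Estimator: N_hat = n_a * n_b / n_both.
    """
    if n_both <= 0:
        raise ValueError("the lists must overlap to produce an estimate")
    if n_both > min(n_a, n_b):
        raise ValueError("overlap cannot exceed the size of either list")
    return n_a * n_b / n_both


# Made-up counts: 200 people on one list, 150 on the other, 60 on both.
estimate = lincoln_petersen(200, 150, 60)  # 200 * 150 / 60 = 500.0
```

The intuition is that if 60 of the 150 people on the second list also appear on the first, the first list is catching roughly 40 percent of the population, so its 200 people imply about 500 in total, including those on neither list.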
While I was researching, one name kept popping up: Stephen Fienberg (1942-2016). I didn’t know at the time that he was a famous statistician. I first contacted him about technical issues I was facing and my desire to solve them in a principled way. We emailed back and forth for the next two years. After my undergrad, I applied to Carnegie Mellon and eventually got to work with him in the areas of research that had connected us in the first place.
Tell us about the work you did with Dr. Fienberg.
Steve, as we called him, had a grant from the National Science Foundation’s Census Research Network. Its goal was to help improve the quality of the U.S. Census in the long term. We developed novel techniques for combining data sources through record linkage. For example, these methods could be used to combine the main census with other administrative sources or surveys as a way to correct for undercoverage (when some members of the population are inadequately represented in the sample).
Why did you decide to come to the UW?
[Photo: Mauricio's father working on the coffee farm.]
I really liked the people in the department and I was inspired by the work they were doing. I found the interplay between high-quality methodology work and high-impact applied work fascinating. I had not seen it in other places. SPH seemed like a great place to continue my methodological work, while engaging in important applications.
Also, my wife, Kristine Rominski, wanted to move to Seattle. She’s a classically trained flute player, but she also likes to play choro music (pronounced SHOH-roh). It’s an old style of instrumental Brazilian music and there’s a healthy group of people here who play it.
What research are you working on now?
I continue to work on methodological aspects of record linkage and analysis with missing data, the latter being an area of research I picked up during my postdoc with Jerry Reiter at Duke. Being on the faculty at SPH allows me to work on these topics with smart students and other faculty. For example, I’m interacting with PhD student Tigran Avoundjian, who wants to create a surveillance system for people with HIV to ensure they don’t fall out of care. He needs to combine different data sources in real time.
I’m beginning to collaborate with Donna Spiegelman, an epidemiologist at Harvard, who was involved with an intervention in Tanzania to encourage pregnant women to get checkups at the hospital. Unfortunately, their records are on paper, so there is a high chance of errors, and there is no unique identifier. If we want to know whether the intervention had an effect on the number of visits, we need to identify how many visit records correspond to the same woman, and that’s record linkage.
I’m also connected with a nonprofit in San Francisco called the Human Rights Data Analysis Group. They are world leaders in applying quantitative research to human rights issues. I’m exploring collaborations with them to estimate mortality levels in different civil wars, such as in Syria. We’re planning to bring the director, Patrick Ball, to visit SPH in the fall.
What else are you doing?
I’ll be teaching a class on missing data methods next year and I plan to develop a course on data combination techniques in the near future. I also have a number of students taking independent studies with me. This quarter, I’m planning the departmental seminar, where I hope to expose folks to areas of statistics that I care about. Last quarter, I taught Computational Tools for Biostatistics, which is advanced programming for our PhD and master’s students.
What do you find most exciting about biostatistics?
I am new to the world of biostatistics and to public health, but one thing is clear: My colleagues develop high-quality statistical methodologies that are rooted in important real-world applications. This is inspiring and I want to do the same with my career.
- Genentech Distinguished Assistant Professorship, Biostatistics, UW
- Postdoctoral Associate, Statistical Science, Duke University
- PhD, Statistics, Carnegie Mellon University, 2015
- MS, Statistics, Carnegie Mellon University, 2011
- BS, Statistics, National University of Colombia, 2009