Kristian Lum is the Lead Statistician at the Human Rights Data Analysis Group (HRDAG), where she leads the HRDAG project on criminal justice in the United States. Previously, Kristian worked as a research assistant professor in the Virginia Bioinformatics Institute at Virginia Tech and as a data scientist at DataPad, a small technology start-up. 

Kristian’s research focuses primarily on the use of machine learning in the criminal justice system; she has concretely demonstrated the potential for machine learning-based predictive policing models to reinforce and, in some cases, amplify historical racial biases in law enforcement. She has also applied a diverse set of methodologies to better understand the criminal justice system: causal inference methods to explore the causal impact of setting bail on the likelihood of pleading or being found guilty, and agent-based models derived from epidemiology to study the disease-like spread of incarceration through a social influence network. Additionally, Kristian’s work encompasses developing new statistical methods that explicitly incorporate fairness considerations and advancing HRDAG’s core statistical methodology: record-linkage and capture-recapture methods for estimating the number of undocumented conflict casualties.

She is the primary author of dga, an open-source R package for population estimation via capture-recapture.
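
For readers unfamiliar with this style of estimation, here is a minimal sketch of a three-list capture-recapture analysis with dga. The function and object names (bma.cr, graphs3, plotPosteriorN) follow the package's documented interface, and the overlap counts below are illustrative only, not real data.

```r
library(dga)

data(graphs3)  # decomposable graphical models for three lists

# Counts for the 2^3 list-overlap patterns, filled column-major so the
# first cell (individuals seen by no list) is the unobserved cell,
# which must be set to 0. Numbers are illustrative.
Y <- array(c(0, 60, 49, 4, 247, 112, 142, 12), dim = c(2, 2, 2))

Nmissing <- 1:300     # candidate counts of unrecorded individuals
delta    <- 1 / 2^3   # Dirichlet hyperparameter on cell probabilities

# Posterior weights over models and values of Nmissing
weights <- bma.cr(Y, Nmissing, delta, graphs3)

# Posterior over total population size (observed + missing)
plotPosteriorN(weights, Nmissing + sum(Y))
```

The model averaging is what makes the approach robust: rather than committing to one assumed dependence structure among the lists, the posterior is averaged over all decomposable graphical models.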

Kristian received an MS and PhD from the Department of Statistical Science at Duke University and a BA in Mathematics and Statistics from Rice University.

Talk: "Bias In, Bias Out"

Abstract: Predictive models are increasingly used in the criminal justice system to try to predict who will commit crime in the future and where that crime will occur. But what happens when these models are trained on biased, or non-representative, data? In this talk, I will introduce a previously published model used for location-based predictive policing. Using a case study from Oakland, CA, I will demonstrate how predictive policing would not only perpetuate the biases already encoded in the police data but, under some circumstances, could actually amplify those biases.
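
As a rough intuition for the amplification effect, consider the toy simulation below (this is not the model discussed in the talk; all numbers are illustrative). Two areas have identical true crime rates, but the historical record over-represents one of them; patrols are dispatched to whichever area has the larger recorded count, and crime is only recorded where patrols are present.

```r
set.seed(1)

true_rate <- c(0.5, 0.5)  # identical true daily crime rates
recorded  <- c(10, 5)     # biased historical record favors area 1

for (day in 1:200) {
  site  <- which.max(recorded)   # patrol the apparent "hot spot"
  crime <- rpois(2, true_rate)   # crime occurs in both areas...
  recorded[site] <- recorded[site] + crime[site]  # ...but only the
                                  # patrolled area generates records
}

recorded / sum(recorded)  # area 1's share of records grows far past
                          # its true 50% share of crime
```

Because new records accrue only where patrols already are, the initial disparity in the data feeds back into deployment decisions and compounds over time, even though the underlying crime rates never differ.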