Accounting for Racial Bias When Applying Predictive Analytics to Child Welfare

Research on implicit racial bias teaches us that individual-level, subjective decision-making practices are likely to generate biased outcomes. In fact, humans’ capacity to make objective, logical decisions is severely limited by a variety of systematic decision-making errors and external distractions.

Kelly Capatosto from the Kirwan Institute for the Study of Race and Ethnicity

As a way to interrupt the influence that individual biases have on decisions in the child welfare field, many have looked to the use of big data tools, such as predictive analytics. Recently, these tools have received tremendous support for offering a more data-driven approach to risk evaluation in child welfare cases, and the potential to save valuable time and financial resources.

For these reasons, predictive analytics may seem like a panacea for child welfare decision-making.

However, as researchers and advocates, we must retain a healthy skepticism of predictive analytics use in risk evaluation and address the ethical concerns they may pose. To illustrate, one of the most significant concerns presented by predictive analytics is the potential for these tools to conflate race with risk and perpetuate racial discrimination.

With an emphasis on cognitive and systemic racial bias, a recent report by the Kirwan Institute for the Study of Race and Ethnicity at Ohio State University briefly reviews some of the potential pitfalls that can occur if we are not cautious in how predictive models are designed and used in child welfare decisions. Among other concerns, the report highlights the possibility that predictive models can inherit the biases of human designers, reproduce existing social inequities, and blur the line between correlation and causation for users.

When considering how personal values or biases can be internalized in these predictive models, we must acknowledge that the very nature of how we define and measure risk in a child welfare setting is contingent on our personal values.

All too often, people in our profession see headlines demonstrating the hazards of neglect or even death if we fail to correctly identify when a true risk is present (i.e., a false negative). Thus, one inclination is to ascribe the most danger to the potential for a false negative when deploying predictive analytic tools.

However, the trauma associated with placing a child in the foster care system when no true risk is present (i.e., a false positive) should not be overlooked, and the risk of abuse and even death remains in foster care as well.

My intent behind these examples is not to sway readers on whether false positives or false negatives present the bigger threat, but to demonstrate the wide range of important values and goals that humans impose on how predictive models are designed and used.
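
To make the point about values concrete, here is a minimal sketch in Python. It assumes a tool that produces a risk score between 0 and 1 and flags a case when the score meets a threshold. The scores, outcomes, cost weights, and function names are all invented for illustration and do not come from any real system or from the report.

```python
# Hypothetical (risk_score, true_risk_present) pairs, invented for illustration.
CASES = [
    (0.95, True), (0.80, True), (0.65, False), (0.60, True),
    (0.45, False), (0.40, True), (0.30, False), (0.20, False),
    (0.15, False), (0.05, False),
]

# Candidate cut-offs a designer might consider (also hypothetical).
THRESHOLDS = [0.1, 0.3, 0.5, 0.7, 0.9]


def count_errors(cases, threshold):
    """Return (false_positives, false_negatives) when flagging scores >= threshold."""
    false_pos = sum(1 for score, risk in cases if score >= threshold and not risk)
    false_neg = sum(1 for score, risk in cases if score < threshold and risk)
    return false_pos, false_neg


def best_threshold(cases, cost_fp, cost_fn, thresholds):
    """Pick the threshold with the lowest total cost under the chosen weights."""
    def total_cost(threshold):
        fp, fn = count_errors(cases, threshold)
        return cost_fp * fp + cost_fn * fn
    return min(thresholds, key=total_cost)


# Weighting missed risk more heavily pushes the cut-off down (more families flagged).
low_cutoff = best_threshold(CASES, cost_fp=1, cost_fn=5, thresholds=THRESHOLDS)
# Weighting wrongful removal more heavily pushes the cut-off up (fewer families flagged).
high_cutoff = best_threshold(CASES, cost_fp=5, cost_fn=1, thresholds=THRESHOLDS)
```

The data never change between the two calls; only the weights do. Under these assumptions, the "right" threshold is a product of which error the designer has decided matters more, not something the data settle on their own.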

Another common concern with predictive risk evaluation is the possibility of incorporating longstanding patterns of social inequity when seemingly objective data acts as a proxy for racial identity. For example, neighborhood-specific data such as zip code is deeply connected to historic practices of racial discrimination. Even today, our neighborhoods remain segregated and often reflect the legally endorsed practices of racial exclusion that occurred for decades before the Fair Housing Act was passed in 1968. In the same fashion, predictive models used in child welfare settings may over-represent those who have suffered from past marginalization as having heightened risk.
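
The proxy problem can be sketched in a few lines of Python. The example assumes a scoring rule that looks only at zip code and never sees a family's racial group. The zip labels, group labels, scores, and threshold are hypothetical placeholders, not real data.

```python
from collections import defaultdict

# Invented records: (zip_code, group). The group label is never passed to the scoring rule.
RECORDS = [
    ("zip_1", "A"), ("zip_1", "A"), ("zip_1", "A"), ("zip_1", "B"),
    ("zip_2", "B"), ("zip_2", "B"), ("zip_2", "B"), ("zip_2", "A"),
]

# Hypothetical scoring rule that uses zip code alone.
ZIP_RISK = {"zip_1": 0.7, "zip_2": 0.2}


def flag_rate_by_group(records, zip_risk, threshold=0.5):
    """Share of each group flagged as high risk by a zip-code-only rule."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for zip_code, group in records:
        total[group] += 1
        if zip_risk[zip_code] >= threshold:
            flagged[group] += 1
    return {group: flagged[group] / total[group] for group in total}


rates = flag_rate_by_group(RECORDS, ZIP_RISK)
```

In this toy setup, three of the four group A families are flagged against one of the four group B families, even though group membership was never an input. Residential segregation is enough to let zip code carry that information into the score.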

Finally, risk prediction models that rely on correlations to estimate the likelihood of a future event may miss crucial information compared to studies that can make causal claims. For example, prior child protective services (CPS) intervention is often cited as a main predictor of recurrent maltreatment. Yet, despite its strong correlation with later child maltreatment, prior intervention is not the underlying cause. In this case, the non-causal nature of the relationship between CPS intervention and maltreatment may seem like common sense. Yet other risk factors identified by these models, such as a missing parental figure or living in a low-opportunity neighborhood, are often implicitly perceived as causal, which can affect treatment recommendations.
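
One way a strong correlation can arise without causation is through a shared underlying factor. The Python sketch below assumes a single hidden stressor that makes both prior CPS contact and later maltreatment more likely. The counts are invented so that the arithmetic is easy to follow; they are not estimates of anything real.

```python
# Invented counts, for illustration only.
# Key: (hidden_stressor, prior_cps_contact) -> (families, later_maltreatment_cases)
COUNTS = {
    (True, True): (80, 40),
    (True, False): (20, 10),
    (False, True): (20, 2),
    (False, False): (80, 8),
}


def rate(counts, prior_contact, stressor=None):
    """Maltreatment rate for a prior-contact group, optionally within one stressor level."""
    families = events = 0
    for (has_stressor, had_contact), (n, k) in counts.items():
        if had_contact == prior_contact and (stressor is None or has_stressor == stressor):
            families += n
            events += k
    return events / families


# Pooled view: prior contact looks like a strong predictor.
pooled_with_contact = rate(COUNTS, prior_contact=True)
pooled_without_contact = rate(COUNTS, prior_contact=False)

# Stratified view: within each stressor level, prior contact makes no difference.
with_stressor = (rate(COUNTS, True, stressor=True), rate(COUNTS, False, stressor=True))
without_stressor = (rate(COUNTS, True, stressor=False), rate(COUNTS, False, stressor=False))
```

Pooled together, families with prior contact show a rate of 42 percent against 18 percent for those without. Once families are compared within the same stressor level, the rates are identical. A model trained on the pooled data would still rank prior contact as a top predictor, which is accurate as a forecast but misleading as a guide to what intervention would help.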

Ethically, this false perception of causality becomes even more precarious given that some cities have considered using racial information to identify children’s potential needs as early as infancy.

Despite my cautionary tone, the fact remains that these tools are not inherently good or bad. However, we must stay vigilant in monitoring how predictive risk models influence outcomes for our most vulnerable youth and demand that they be transparent and responsive to feedback if inequities do emerge.

On June 6 at 3:00 p.m. EST/12:00 p.m. PST, the Kirwan Institute and Fostering Media Connections, the parent organization of this publication, will host a free webinar exploring my findings and offer those interested a chance to weigh in. If you are interested in joining us, you can register for the Accounting For Implicit Bias In Predictive Analytics webinar here.


Kelly Capatosto is a research associate at the Kirwan Institute for the Study of Race and Ethnicity at Ohio State University.
