The Fairness Paradox: Why an AI Can Be "Fair" to Everyone and No One at the Same Time
Standard practice in AI fairness auditing centers on single-axis analysis: Does a system treat men and women equitably? Does it perform consistently across racial groups? This approach feels logical and has become a core part of the AI ethics toolkit.
But what if a system could pass both of these tests—be certified as "fair" across genders and "fair" across races—and yet still be profoundly discriminatory? This isn't a hypothetical flaw; it's a critical blind spot in how we evaluate AI. Biases can hide in plain sight, becoming visible only when we examine the overlaps between identities.
This article explores the most surprising and impactful truths about AI bias that are only revealed when we look at the intersections of identity. We will unpack how systems can appear fair on the surface while perpetuating deep-seated injustices against specific subgroups, and what that means for building truly equitable technology.
- An AI Can Be "Fair" on Race and Gender Separately, But Deeply Unfair When Combined
The most critical concept to understand is that fairness doesn't simply add up. Biases hide in the intersections between attributes, and an AI model can appear to treat individual groups equitably while new and amplified biases emerge where those groups overlap.
Consider a machine learning model built for credit card approvals. To demonstrate this vulnerability, researchers trained such a model on a dataset in which attributes like income were synthetically made conditional on race and gender. After training, the development team put the model through standard fairness checks, and the initial results were encouraging. When tested for bias between men and women, the model showed no indication of unfairness. When tested for racial bias across five categories under the company's chosen "group benefit" metric (one that asks whether each group benefits at a rate similar to its qualifications), it again received a passing grade.
With both single-axis tests passed, the team was confident in their model's fairness. But a more granular analysis revealed a disturbing picture: at the intersection of race and gender, the model failed catastrophically across all metrics, performing very poorly for certain subgroups, such as Pacific Islander men, while performing much better for others, like Caucasian men. Conventional assessments that check only one attribute at a time would have missed this disparate treatment entirely.
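The failure mode described above can be made concrete with a toy calculation. The subgroup approval rates below are invented for illustration (the article does not publish the model's actual numbers); they are deliberately chosen so that every single-axis audit passes while the intersectional audit fails badly:

```python
# Sketch (not the researchers' actual pipeline): hypothetical approval
# rates per (gender, race) subgroup, constructed so that the marginal
# rate for every gender and every race works out to exactly 0.50.
subgroup_rate = {
    ("men",   "group_a"): 0.80, ("women", "group_a"): 0.20,
    ("men",   "group_b"): 0.20, ("women", "group_b"): 0.80,
}

def marginal(axis_value, axis_index):
    """Average approval rate across all subgroups sharing one attribute."""
    rates = [r for key, r in subgroup_rate.items() if key[axis_index] == axis_value]
    return sum(rates) / len(rates)

# Single-axis audits: every marginal rate is 0.50, so both tests "pass".
for gender in ("men", "women"):
    assert abs(marginal(gender, 0) - 0.50) < 1e-9
for race in ("group_a", "group_b"):
    assert abs(marginal(race, 1) - 0.50) < 1e-9

# Intersectional audit: the gap between the best- and worst-served
# subgroups is 0.60 -- invisible to either single-axis check.
gap = max(subgroup_rate.values()) - min(subgroup_rate.values())
print(f"single-axis gaps: 0.00, intersectional gap: {gap:.2f}")
```

The point of the sketch is structural, not numerical: averaging over one attribute at a time can cancel out disparities that are extreme at the intersections.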
- The Concept of "Intersectional Bias" Comes from Law, Not Computer Science
The idea that overlapping identities create unique experiences of discrimination is not a discovery from the world of AI. The term "intersectionality" was coined in 1989 by legal scholar Kimberlé Crenshaw to describe a phenomenon she observed in the legal system.
Her core insight was that Black women face a unique form of discrimination that is distinct from both the racism that Black men experience and the sexism that white women experience. It is not an additive effect but a compound one. Crenshaw's foundational insight, as summarized in later analyses, was that:
“…discrimination against black women cannot be explained as a simple combination of misogyny and racism, but as something more complicated.”
This was vividly illustrated in the 1976 legal case of DeGraffenreid v. General Motors. A group of Black women sued the company for employment discrimination. The court dismissed their case by looking at race and sex as separate issues. The court reasoned there was no race discrimination because General Motors hired Black people (men), and there was no sex discrimination because the company hired women (white women). By refusing to see Black women as a distinct class, the court made the specific discrimination they faced legally invisible.
Decades after a court demonstrated this critical blind spot in legal reasoning, we are now systematically programming the exact same failure of imagination into our automated decision-making systems, codifying old injustices in new lines of code.
- A Highly "Accurate" AI Can Be a Perfect Engine for Perpetuating Historical Injustice
In machine learning, high accuracy is often treated as the ultimate goal. An accurate model, we assume, is a good model. But this assumption is dangerous when the data used to train the model is a reflection of a historically unjust world.
If an AI is trained on historical data that reflects societal biases, a highly "accurate" model is one that has learned to replicate those biases perfectly. Consider a classifier designed to predict students' academic performance based on historical data. Such a model is not learning objective performance, but rather the outcomes of a system where, as one analysis puts it, historical injustices have already stacked the deck:
“…that racism (and classism, homophobia, etc.) has made people physically, mentally, and spiritually ill and dampened their chance at a fair shot at higher education (and at life and living).”
A model that is highly accurate at predicting outcomes based on a biased past is not a fair decision-making tool. When we ask an AI to be "accurate" in predicting the results of an unjust history, we are asking it to be a perfect engine for perpetuating that injustice.
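A minimal simulation can make this argument concrete. The "historical" labeling rule below is entirely hypothetical; it stands in for whatever biased process generated the training labels. A model that learns that process perfectly scores 100% on accuracy while reproducing the disparity exactly:

```python
# Illustrative sketch (not from the article's study): when training labels
# encode a biased historical process, a model that reproduces that process
# is maximally "accurate" -- and maximally faithful to the bias.
import random

random.seed(0)

def historical_label(ability, group):
    # Hypothetical biased history: members of group "b" needed far higher
    # ability than members of group "a" to receive a positive outcome.
    threshold = 0.4 if group == "a" else 0.8
    return int(ability > threshold)

# Generate "historical" training data: (ability, group) pairs plus labels.
data = [(random.random(), random.choice("ab")) for _ in range(10_000)]
labels = [historical_label(ability, group) for ability, group in data]

# A "perfect" model: it has learned the historical rule, bias included.
predictions = [historical_label(ability, group) for ability, group in data]

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
rate = {g: sum(p for p, (_, grp) in zip(predictions, data) if grp == g)
           / sum(1 for _, grp in data if grp == g) for g in "ab"}
print(f"accuracy: {accuracy:.0%}")  # 100% -- yet the outcome gap persists
print(f"positive rate, group a: {rate['a']:.2f}, group b: {rate['b']:.2f}")
```

Under this toy setup, optimizing accuracy and replicating the historical disparity are the same objective; no amount of additional accuracy fixes the unfairness.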
- Even Huge Datasets Can Hide Critical Biases of Omission
True fairness depends not just on the quality of the data but on its completeness. An AI model can only be as fair as the data it is trained on, and if that data is not representative of the entire population, the model's blind spots will inevitably fall on those who are left out.
The US Department of Veterans Affairs (VA) provides a powerful case study. The VA, which runs one of the largest integrated healthcare systems in the country, maintains a massive database covering roughly 9 million veterans. Yet this is only about half of the estimated 18 million veterans in the United States.
This creates a significant risk. AI applications trained on this vast but incomplete dataset may not produce reliable or effective recommendations for all veterans. Vulnerable populations, such as homeless veterans, are particularly at risk of being excluded from the VA system and, therefore, from the data used to train these models. The very people who may need the most support are the ones the AI is least likely to "see." This is not merely a data gap; it is an intersectional failure, where the very life circumstances that make these veterans most vulnerable also render them invisible to the systems designed to help them.
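One practical response is a simple coverage audit before any training begins: compare who appears in the dataset against who exists in the target population, overall and per subgroup. The overall figures below are the approximate public numbers from the VA example above; the per-subgroup counts are hypothetical placeholders, not real statistics:

```python
# Sketch of a pre-training coverage audit. Overall figures are the
# approximate numbers cited above; subgroup counts are hypothetical.
va_dataset_size = 9_000_000         # veterans in the VA database (approx.)
us_veteran_population = 18_000_000  # estimated US veterans (approx.)

coverage = va_dataset_size / us_veteran_population
print(f"population coverage: {coverage:.0%}")  # ~50%: half are invisible

# The same ratio computed per subgroup flags groups whose coverage falls
# far below the overall rate. (in_dataset, in_population) pairs below
# are invented purely to illustrate the shape of the check.
subgroup_counts = {
    "housed":   (8_900_000, 17_000_000),
    "homeless": (100_000, 1_000_000),
}
for name, (in_dataset, in_population) in subgroup_counts.items():
    print(f"{name}: {in_dataset / in_population:.0%} covered")
```

An audit of this shape does not fix the omission, but it surfaces it early, turning an invisible data gap into an explicit, reviewable number per subgroup.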
Conclusion: Beyond the Checklist—Asking the Right Questions
Achieving true fairness in AI requires us to move beyond simplistic, single-axis checklists and embrace the complexity of human identity and experience. It means recognizing that an AI can be technically "fair" by conventional metrics while inflicting real, yet easily overlooked, harm on specific communities.
This shift demands that we look for hidden harms at the intersections, that we question the completeness and history of the data we use, and that we rethink our definition of what makes a "good" model. It's not enough to ask if our AI is fair on average; we must ask who, specifically, it might be failing.
As we delegate more decisions to algorithms, are we building systems that see people in their full complexity, or are we designing a future where only the simplest identities are treated fairly?