6 Surprising Truths About Algorithmic Fairness That Everyone Should Know
Introduction: The Myth of the Objective Machine
We often hope that algorithms can make decisions free from messy human biases. The dream is a machine that sees only the facts, delivering outcomes that are purely objective and fair. But in reality, Artificial Intelligence often inherits, automates, and even amplifies our society's deepest biases.
The world of algorithmic fairness is far more complex and counter-intuitive than most people realize. It's a field filled with difficult trade-offs, philosophical debates, and surprising truths that challenge our assumptions about technology and justice. This article is a journey to uncover six of those truths that everyone interacting with our increasingly automated world should understand.
- Truth #1: The "Objective" Algorithm Is a Complete Myth
The idea of a single, purely objective algorithm for any given problem is a complete misconception. Algorithms are not discovered like laws of nature; they are built. They are "malleable and contingent on the choices made by their creators."
The model-building process is a complex series of subjective human judgments. At every stage, designers must make consequential choices that shape the final result. These decisions include:
- Defining the "target variable": What observable metric best approximates the quality you want to predict? For example, what makes a "good employee"—years on the job, or gross sales? There is no single correct answer.
- Choosing data sources: Should you use a massive dataset with fewer data points per person, or a smaller one with more granular information but fewer examples from under-represented groups?
- Handling imperfect data: What should be done about missing information or extreme outliers in a dataset? Should those records be removed, or should values be estimated?
- Selecting the algorithm type: Should the model be a logistic regression, a random forest, or a neural network? Each uses different technical strategies for optimizing predictions.
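The stakes of the "target variable" choice can be made concrete with a toy sketch (the names and numbers below are invented for illustration): two equally defensible definitions of a "good employee", tenure versus gross sales, rank the same applicant pool in opposite orders.

```python
# Toy sketch: the same applicant pool ranked under two equally
# defensible target variables. Neither ranking is the "objective" one.
applicants = [
    # (name, years_on_job, gross_sales)
    ("A", 12, 150_000),
    ("B", 3, 400_000),
    ("C", 7, 250_000),
]

# Target variable 1: a "good employee" stays a long time.
by_tenure = sorted(applicants, key=lambda a: a[1], reverse=True)
# Target variable 2: a "good employee" sells a lot.
by_sales = sorted(applicants, key=lambda a: a[2], reverse=True)

print([a[0] for a in by_tenure])  # ['A', 'C', 'B']
print([a[0] for a in by_sales])   # ['B', 'C', 'A']
```

The two rankings are exact reversals of each other, yet both follow from a reasonable-sounding definition of quality, which is the sense in which no single "correct" model exists before these choices are made.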
The key implication of this process is profound: because "no single, definitive model" exists prior to these choices, there is no clear baseline to compare against. This means individuals cannot claim they are entitled to a specific outcome from any one particular version of a model, especially one that encodes an unfair advantage.
- Truth #2: You Can't Have It All: "Group Fairness" and "Individual Fairness" Are Often at War
In the world of AI ethics, "fairness" is not a single concept. There are multiple, often conflicting, definitions. Two of the most fundamental are group fairness and individual fairness, and they are frequently at odds.
- Group Fairness: This concept, also known as "statistical parity," aims for equal outcomes or statistics across different demographic groups. For example, it might require a hiring algorithm to select a proportional number of applicants from different racial groups.
- Individual Fairness: This concept aims to treat similar individuals similarly, based on their relevant attributes, regardless of which group they belong to.
These two goals, both desirable on their own, represent a fundamental, unavoidable paradox. To achieve fairness for the group, you must violate fairness for the individual in certain cases. An algorithm enforcing statistical parity across groups will inevitably have to treat two individuals with nearly identical qualifications differently simply because they belong to different demographic groups. This action, taken to achieve group fairness, directly violates the core principle of individual fairness. There is no mathematical workaround for this collision.
As the research literature's canonical formulation of individual fairness puts it: "...similar individuals are treated similarly."
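A minimal sketch of this collision, using invented candidates and scores: a selection rule that enforces equal per-group selection counts (a crude form of statistical parity) ends up treating two individuals with identical scores differently.

```python
# Toy sketch: enforcing equal selection counts per group forces
# different outcomes for two near-identical individuals.
candidates = [
    # (id, group, score)
    ("p1", "A", 0.90),
    ("p2", "A", 0.89),
    ("p3", "A", 0.88),
    ("p4", "B", 0.89),  # same score as p2, different group
]

def select_with_parity(cands, k_per_group):
    """Pick the top k_per_group scorers within each group."""
    selected = set()
    for group in {g for _, g, _ in cands}:
        members = sorted((c for c in cands if c[1] == group),
                         key=lambda c: c[2], reverse=True)
        selected.update(c[0] for c in members[:k_per_group])
    return selected

chosen = select_with_parity(candidates, k_per_group=1)
# p1 and p4 are selected; p2 is rejected despite having the same
# score as p4 -- group fairness satisfied, individual fairness violated.
print(sorted(chosen))  # ['p1', 'p4']
```

Any rule that guarantees the group-level statistic must sometimes break the similar-people-treated-similarly principle, which is the collision the text describes.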
- Truth #3: "Colorblind" AI Is a Fallacy That Can Make Bias Worse
A common first instinct for achieving fairness is to simply hide sensitive attributes like race or gender from an algorithm. This "fairness through blindness" approach, however, is deeply flawed and can even make bias worse.
The primary reason this fails is due to "redundant encodings." Other data points in a dataset often serve as powerful proxies for the sensitive attribute that was removed. For example, a person's zip code can be highly correlated with their race, allowing the algorithm to perpetuate racial bias without ever "seeing" race directly.
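The proxy effect can be sketched with toy data (the zip codes, labels, and correlation below are invented for illustration): a rule that sees only zip codes still recovers the removed sensitive attribute most of the time.

```python
# Toy sketch of a "redundant encoding": the model never sees the
# sensitive attribute, but a proxy (zip code) largely reveals it.
records = [
    # (zip_code, race) -- invented data where zip correlates with race
    ("10001", "X"), ("10001", "X"), ("10001", "X"), ("10001", "Y"),
    ("20002", "Y"), ("20002", "Y"), ("20002", "Y"), ("20002", "X"),
]

# A "colorblind" rule learned from zip code alone.
def infer_from_zip(zip_code):
    return "X" if zip_code == "10001" else "Y"

correct = sum(infer_from_zip(z) == r for z, r in records)
print(correct / len(records))  # 0.75: the attribute is still mostly visible
```

Removing the sensitive column did not remove the information; any model with access to the proxy can reconstruct and act on it.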
This leads to a counter-intuitive but critical insight: to prevent discrimination, an algorithm may actually need to be aware of the sensitive attribute. Ignoring group membership can lead to a "self-fulfilling prophecy." For instance, imagine an organization seeks talented employees, but its model is trained on data from a majority group. The model might learn to define "talent" using signals common in that group's culture (e.g., specific extracurriculars or communication styles). By applying this narrow definition, it could overlook highly talented candidates from a minority group while selecting less-talented ones who happen to exhibit the majority-group signals. The resulting false "bad track record" for the minority group then appears to justify future discrimination.
The failure of the "colorblind" approach reveals the central tension in algorithmic ethics: sometimes, to treat groups fairly, the algorithm must first be aware of the very groups we seek to protect, a direct challenge to the simplistic notion of treating every individual identically.
- Truth #4: A Group of Untrained Humans Can Outperform a Major Recidivism Algorithm
COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) is a well-known risk assessment tool used by U.S. courts to predict the likelihood of a defendant re-offending. Despite its widespread use, its performance reveals a startling truth about the limits and dangers of automated prediction.
The algorithm's predictive power is shockingly weak. A study found that the COMPAS algorithm has an accuracy of just 65%. This is only marginally better than the guesses of untrained individuals with little or no criminal justice expertise (63% accuracy) and is actually less accurate than the collective judgment of small groups of those same individuals working together (67% accuracy).
Not only is this widely-used tool barely better than a coin toss, it is also systematically biased. A landmark 2016 investigation by ProPublica found that the algorithm consistently makes critical errors that harm Black defendants at a much higher rate than white defendants. Specifically, "blacks are almost twice as likely as whites to be labeled a higher risk but not actually re-offend," while white defendants are more likely to be labeled lower-risk and go on to commit other crimes.
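The disparity ProPublica describes is about the false positive rate: the share of people who did not re-offend but were nonetheless labeled high risk. A toy sketch with made-up labels (not the actual COMPAS data) shows how the metric is computed per group:

```python
# Sketch of the metric behind the ProPublica finding. The labels are
# invented; only the disparity pattern is meant to be illustrative.
def false_positive_rate(preds, actual):
    """Share of true negatives (did not re-offend, actual == 0)
    that were labeled high risk (pred == 1)."""
    preds_on_negatives = [p for p, a in zip(preds, actual) if a == 0]
    return sum(preds_on_negatives) / len(preds_on_negatives)

# 1 = labeled/was high risk (re-offended), 0 = not.
fpr_group_a = false_positive_rate(preds=[1, 1, 1, 0], actual=[0, 0, 1, 0])
fpr_group_b = false_positive_rate(preds=[1, 0, 0, 0], actual=[0, 0, 1, 0])

# Group A's non-re-offenders are flagged twice as often as group B's,
# mirroring the "almost twice as likely" pattern described above.
print(fpr_group_a, fpr_group_b)
```

A key point is that two groups can have similar overall accuracy while their false positive rates diverge sharply, which is why accuracy alone masked this bias.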
- Truth #5: A Common "Fairness" Rule of Thumb Is an "Indefensible" Mistake
The "Four-Fifths Rule" (or "80% Rule") is a concept from U.S. federal employment law sometimes used as a rough guideline to check for disparate impact in hiring (i.e., to see if a hiring practice, even if not intentionally discriminatory, results in significantly different outcomes for different racial or ethnic groups). In machine learning, it is sometimes misapplied as a simple fairness threshold, where a model is considered "fair" if the selection rate for a protected group is at least 80% of the selection rate for the majority group.
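As sometimes (mis)applied in machine learning, the check reduces to a simple ratio of selection rates. A minimal sketch with invented counts:

```python
# Sketch of the Four-Fifths Rule as it is sometimes (mis)applied as an
# ML fairness check: compare the protected group's selection rate to
# the majority group's, and flag a ratio below 0.8.
def four_fifths_check(selected_protected, total_protected,
                      selected_majority, total_majority):
    rate_protected = selected_protected / total_protected
    rate_majority = selected_majority / total_majority
    return (rate_protected / rate_majority) >= 0.8

# 30% vs 40% selection rates: ratio 0.75, below the 0.8 threshold.
print(four_fifths_check(30, 100, 40, 100))  # False
```

As the following paragraphs argue, passing or failing this threshold says little about fairness outside the rule's specific legal context in U.S. employment regulation.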
Applying this rule as a universal fairness standard for algorithms is a serious error. Outside of its specific legal context in U.S. federal employment regulation, the rule has "no validity."
Its misapplication represents a form of what experts call "epistemic trespassing"—taking a concept from one specialized field and applying it improperly to another. The use of the Four-Fifths Rule as a simple fairness check in machine learning is, in the words of researchers, an "indefensible example of epistemic trespassing."
- Truth #6: Correcting Algorithmic Bias Isn't "Affirmative Action"; It's Removing an Unfair Advantage
When developers make efforts to de-bias an algorithm, the process is sometimes mischaracterized as "algorithmic affirmative action." This framing is not only inaccurate but also fundamentally misleading.
This brings us back to the fundamental myth we dismantled in our very first point: the idea of a single, "true" algorithm. Because there is no "true" or "correct" objective model to begin with, the final model that gets deployed is the result of countless human choices, trade-offs, and value judgments. Therefore, de-biasing an algorithm is not about giving one group an unearned preference over another. It is more accurately understood as removing an unfair advantage that a biased model would otherwise have provided to the privileged group. No one is entitled to an outcome generated by a system that contains unfair bias.
The "affirmative action" frame is especially dangerous in contexts like criminal justice, where the system is not distributing opportunities (like jobs or loans) but is instead imposing harm (like incarceration). In a risk assessment context, a biased model doesn't just deny a benefit to one group; it actively and unfairly increases the likelihood of a punitive sanction for another. This re-frames the debate from one of preference to one of preventing unjust harm.
As one critique of this framing puts it: "The affirmative action frame reinforces the false notion that any steps taken to reduce bias or level the playing field for disadvantaged groups inherently harms white people and therefore requires special justification."
Conclusion: A New Frontier for Fairness
As we've seen, the seemingly simple goal of fairness is a labyrinth of contradictions. There is no "objective" algorithm to build upon (Truth #1), forcing us to confront warring definitions of fairness itself (Truth #2). Naive attempts at "colorblindness" only worsen the problem (Truth #3), and even a famous real-world algorithm can be both biased and less accurate than a group of amateurs (Truth #4). This complex reality means that simple checklists are indefensible (Truth #5) and that fixing bias isn't about giving preferences, but about removing built-in, unfair advantages (Truth #6).
Achieving fairness in AI is not a simple technical problem to be solved with more data or a better formula. It is a complex socio-technical challenge that requires deep engagement with context, values, and the people most impacted by these powerful systems.
This brings us to a final, crucial question. Given that algorithms are reflections of our own choices and values, what kind of society do we truly want them to build?