The Unfairness of "Fair" AI: Understanding the Inescapable Trade-offs
Introduction: A Tale of Two Predictions
In courtrooms across the United States, a software tool called COMPAS has been used to help make critical decisions. Its purpose is to predict the likelihood of a defendant re-offending, offering judges a data-driven assessment of risk. The goal was to bring objectivity to a process fraught with human bias.
But in 2016, a groundbreaking investigation by ProPublica revealed a shocking and deeply unfair outcome. The algorithm, designed to be impartial, was systematically biased against Black defendants. The investigation found that “blacks are almost twice as likely as whites to be labeled a higher risk but not actually re-offend,” whereas COMPAS “makes the opposite mistake among whites: They are much more likely than blacks to be labeled lower-risk but go on to commit other crimes.”
How can an algorithm, a set of mathematical instructions, produce such a prejudiced result? The answer is not a simple glitch in the code. It lies in a fundamental conflict between different, competing ideas of what "fairness" actually means—a conflict that forces a series of inescapable trade-offs.
- The Two Faces of Fairness: Groups vs. Individuals
So, where do we even begin? The first challenge is realizing that 'fairness' isn't one thing. Ask ten different people what it means, and you might get ten different answers. In AI, this isn't just a philosophical problem—it's a technical one. To understand the core conflict, we'll focus on two of the most prominent—and often contradictory—approaches: ensuring fairness between groups and ensuring fairness for individuals. It's important to know this is a simplified map; the landscape of fairness includes many other metrics, like equalized odds and equal opportunity, each with its own assumptions and trade-offs.
Group Fairness
- Goal: To ensure equal outcomes between different demographic groups (e.g., based on race or gender).
- Key Concept: Statistical Parity (or Demographic Parity). This is achieved when the probability of receiving a positive outcome (like getting a loan or being hired) is the same across all protected groups.
- Simple Analogy: Imagine a company hiring new employees. To achieve statistical parity, if 10% of male applicants are hired, then 10% of female applicants must also be hired.

Individual Fairness
- Goal: To "treat similar individuals similarly."
- Key Concept: Fairness Through Awareness. This is achieved when two individuals with similar relevant characteristics (like qualifications or credit scores) receive similar predictions from the model.
- Simple Analogy: Consider a scholarship program. Two students with nearly identical grades, test scores, and extracurricular activities should have a similar chance of being awarded a scholarship, regardless of their race or gender.
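Statistical parity is simple enough to check directly from a model's decisions. Here is a minimal sketch; the hiring data is invented purely for illustration:

```python
def positive_rate(decisions):
    """Fraction of applicants who received the positive outcome (1 = yes)."""
    return sum(decisions) / len(decisions)

def statistical_parity_gap(group_a, group_b):
    """Absolute difference in positive-outcome rates between two groups.

    A gap of 0 means perfect statistical (demographic) parity.
    """
    return abs(positive_rate(group_a) - positive_rate(group_b))

# Hypothetical hiring decisions: 1 = hired, 0 = rejected.
male_decisions = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]    # 2 of 10 hired
female_decisions = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # 1 of 10 hired

gap = statistical_parity_gap(male_decisions, female_decisions)
print(f"Parity gap: {gap:.2f}")  # prints "Parity gap: 0.10"
```

Note that this metric looks only at outcome rates; it says nothing about whether the right individuals within each group were selected, which is exactly where the conflict below arises.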
Both of these goals sound desirable on their own. We want our systems to avoid perpetuating historical group-based inequalities, and we want them to be consistent by treating similar people in a similar way. The real problem begins when we try to achieve both at the same time.
- The Inescapable Tug of War
The central challenge of fair AI is the inherent conflict between Group Fairness and Individual Fairness. This can be understood as a tug of war between two different ways of thinking about similarity:
- Relative Similarity (Group Fairness): This perspective says people with similar qualifications or characteristics within their own demographic group should be treated alike.
- Absolute Similarity (Individual Fairness): This perspective says people with similar qualifications or characteristics across the entire population should be treated alike, regardless of group.
Let's illustrate this conflict with a simple, yet powerful, example. Imagine a company that wants to achieve statistical parity (a Group Fairness goal) in its hiring process. To do this, it sets an equal hiring rate for two groups, Group A and Group B.
However, the company diligently vets the applicants it hires from Group A while selecting its hires from Group B carelessly, all while maintaining the same hiring rate for both groups.
In this scenario, the company has successfully achieved its Group Fairness goal of equal hiring rates. But it has blatantly violated Individual Fairness. A highly qualified applicant from Group B might be rejected while a far less qualified applicant from Group A is hired, simply to meet the quota. This doesn't just treat similar individuals differently; it creates a 'self-fulfilling prophecy.' By hiring less-qualified people from Group B, the company actively manufactures a dataset that 'proves' its own bias, establishing a negative track record that could justify future discrimination.
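This scenario can be made concrete in a few lines. Both groups are hired at the same 50% rate, so statistical parity holds, yet the strongest applicant in Group B is rejected while much weaker Group A applicants would have been hired in her place. All names and scores are invented:

```python
# Hypothetical applicant pools: (name, qualification score 0-100).
group_a = [("A1", 95), ("A2", 90), ("A3", 40), ("A4", 35)]
group_b = [("B1", 92), ("B2", 88), ("B3", 45), ("B4", 30)]

def hire(pool, k, diligent):
    """Hire k applicants: the strongest if diligent, the weakest if not."""
    ranked = sorted(pool, key=lambda p: p[1], reverse=diligent)
    return ranked[:k]

hired_a = hire(group_a, 2, diligent=True)   # A1 (95), A2 (90)
hired_b = hire(group_b, 2, diligent=False)  # B4 (30), B3 (45)

# Group fairness holds: identical 50% hiring rate for both groups.
assert len(hired_a) / len(group_a) == len(hired_b) / len(group_b) == 0.5

# Individual fairness is violated: B1 (92) is rejected while A2 (90),
# a comparably qualified applicant, is hired.
print("Hired from A:", [name for name, _ in hired_a])
print("Hired from B:", [name for name, _ in hired_b])
```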
Since we often cannot have perfect fairness for both groups and individuals while also maximizing accuracy, developers must navigate a difficult trade-off.
- The Fairness Frontier: Trading Accuracy for Equity
In AI development, forcing a model to satisfy a specific fairness rule—like equal hiring rates or equal error rates between groups—can sometimes reduce its overall accuracy or utility. This creates a direct trade-off that designers must manage.
This relationship is often visualized using a concept called the Pareto Frontier. Think of it as a curve on a graph that represents the set of best possible compromises. Along this curve, you can only get more of one thing (like fairness) by giving up some of another (like accuracy). There is no single "best" point on the frontier; instead, it represents a set of optimal choices, each with a different balance of priorities.
The graphs below, from a study on the LSAC dataset, illustrate this concept.
Let's break down what these graphs show:
- The Horizontal Axis (Wasserstein-2 distance): This measures the level of unfairness, or statistical disparity, between the groups' outcome distributions. A lower number (moving left) means the model treats the groups more equally.
- The Vertical Axis (L²): This measures the model's error. A lower number (moving down) means the model is more accurate.
- The Ideal Point: In a perfect world, we would want a point in the bottom-left corner of the graph—representing zero error and zero unfairness. As the Pareto Frontier shows, that point is impossible to reach.
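For one-dimensional score distributions, the Wasserstein-2 distance on the horizontal axis has a simple empirical form: sort both samples and take the root-mean-square difference of the matched quantiles. A sketch for equal-size samples, with made-up scores:

```python
import math

def wasserstein_2(xs, ys):
    """W2 distance between two equal-size 1-D empirical distributions:
    the RMS difference of sorted (quantile-matched) values."""
    assert len(xs) == len(ys), "sketch assumes equal sample sizes"
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

# Hypothetical model scores for two demographic groups.
scores_group_1 = [0.2, 0.4, 0.5, 0.7, 0.9]
scores_group_2 = [0.1, 0.2, 0.3, 0.5, 0.6]

print(f"W2 disparity: {wasserstein_2(scores_group_1, scores_group_2):.3f}")
```

A distance of zero means the two groups' score distributions are identical, which is why "moving left" on the graph corresponds to greater group fairness.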
The dotted orange line on each graph is the Pareto Frontier. Notice its shape: as you move from right to left along the line to achieve greater fairness (a lower Wasserstein-2 distance), the line curves upward, indicating a higher model error (a higher L²). This visually confirms the trade-off: making the model fairer for groups comes at the cost of making it less accurate overall.
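The frontier itself can be computed from any set of candidate models, each summarized as an (unfairness, error) pair: a candidate sits on the frontier if no other candidate is at least as good on both axes. A minimal sketch with invented numbers:

```python
def pareto_frontier(points):
    """Return the points not dominated by any other point.

    A point (u, e) is dominated if some other point has unfairness <= u
    and error <= e (and differs from it on at least one axis).
    """
    frontier = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            frontier.append(p)
    return sorted(frontier)

# Hypothetical candidate models: (Wasserstein-2 unfairness, L2 error).
candidates = [(0.05, 0.90), (0.10, 0.60), (0.20, 0.40),
              (0.25, 0.55), (0.40, 0.30), (0.45, 0.50)]

print(pareto_frontier(candidates))
# → [(0.05, 0.9), (0.1, 0.6), (0.2, 0.4), (0.4, 0.3)]
# Reading the frontier left to right, error falls only as unfairness
# rises: exactly the upward-curving trade-off described above.
```

Note what the computation does not do: it never picks a point. Selecting one model from the frontier is the value judgment, not a mathematical step.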
This isn't a problem that can be "solved" with better math. It's a fundamental tension. The critical question then becomes: who gets to decide which point on this frontier is the right one?
- Beyond the Code: The Human Choices Behind AI
It's a common misconception that algorithms are objective tools that discover a single "true" solution to a problem. The reality is that algorithms are entirely human constructs, and the model-building process is a complex one involving numerous subjective choices. Every one of these choices can shape the final model and the results it produces, impacting its fairness and accuracy.
So what does this process actually look like? Here are just a few of the critical human judgments involved:
- Defining the Goal: Before any code is written, designers must decide what quality they are trying to predict. What makes a "good employee" or a "high-risk defendant"? Since these concepts can't be measured directly, designers must choose a measurable "target variable" to act as a proxy—for example, using "future arrests" as a proxy for the risk of re-offending. This initial choice is a subjective judgment that frames the entire problem.
- Selecting the Data: Designers must choose which datasets to use for training the model. They have to decide how to handle missing information, what to do with outliers (extreme data points), and whether the data is truly representative of the population. Each of these decisions can either introduce or amplify existing biases.
- Choosing the Model: There isn't just one type of algorithm for a given task. Designers can choose from many different models, such as logistic regression, random forests, or neural networks. They must use their judgment to select one, weighing various trade-offs between complexity, interpretability, and performance.
Because there is no single "correct" model for any given problem, no individual is truly entitled to a specific outcome. The final algorithm is a reflection of the values and choices of its creators. The decision of where to operate on the Pareto frontier—choosing the balance between fairness and accuracy—is ultimately a subjective, ethical choice, not just a technical one.
- What This Means for Us: Key Takeaways
Understanding the trade-offs in AI fairness is crucial for anyone interested in the future of technology and society. Here are the core ideas to take away:
- Fairness is Not One Thing "Fairness" has multiple, often competing, definitions. The most common conflict is between Group Fairness (ensuring equal outcomes for different demographic groups) and Individual Fairness (ensuring similar treatment for similar individuals).
- There is Always a Trade-off You can't always have it all. Achieving one type of fairness often comes at the cost of another, or at the cost of the model's overall accuracy. This is not a flaw to be fixed, but an inherent tension that must be consciously managed.
- AI Fairness is a Human Problem Algorithms are not objective or neutral; they are built by people making subjective choices about goals, data, and trade-offs. Creating "fair AI" is therefore a sociotechnical challenge that requires us to think carefully about our societal values, not just to search for a perfect mathematical solution.