
From 'What' to 'Why': A Guide to Causal Fairness

  1. Introduction: The Fairness Illusion

An algorithm shows different outcomes for different groups. Is it discriminating?

This question seems simple, but the answer is surprisingly complex. Standard fairness metrics, often used to audit algorithms, can tell us what a disparity is—for example, that a hiring tool recommends male candidates more often than female candidates. What they cannot tell us is why that disparity exists.

Consider two scenarios for a gender gap in hiring that look identical from a purely statistical perspective:

  • Scenario A (Direct Discrimination): A manager sees 'female' on the resume and rejects the application due to explicit bias. Here, the hiring decision is a direct causal result of the protected attribute.
  • Scenario B (Structural Pathway): Gender influences a candidate's field of study, which in turn affects the skills listed on their resume, which then influences the hiring decision.

Both scenarios can produce the exact same violation of a standard fairness metric, yet they tell fundamentally different stories about justice and accountability. This is because discrimination, in both legal and moral terms, is not a statistical observation but a causal claim: it alleges that an outcome occurred because of a protected attribute. To build systems that are genuinely fair, we must move beyond observing correlations and learn to reason about causes.

To see why these two scenarios demand different responses, we first need to understand the fundamental limits of looking at statistics alone.

  2. The Limits of Statistical Fairness

Correlation-based fairness, usually called statistical parity (or demographic parity), is the most common starting point for fairness analysis. It asks a simple question: "Are outcomes different across groups?" If a model approves loans for 80% of applicants from one group but only 60% from another, it violates statistical parity. While this measurement is a crucial signal, it can be dangerously misleading because correlation is not causation.
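As a concrete baseline, a statistical-parity check reduces to comparing group-wise positive rates. Here is a minimal sketch; the group names and decision data are invented for illustration:

```python
def demographic_parity_gap(outcomes):
    """outcomes: mapping group -> list of 0/1 decisions.
    Returns the largest difference in positive rates across groups."""
    rates = {g: sum(d) / len(d) for g, d in outcomes.items()}
    return max(rates.values()) - min(rates.values())

# Illustrative numbers from the text: 80% vs 60% approval
gap = demographic_parity_gap({
    "group_a": [1] * 80 + [0] * 20,
    "group_b": [1] * 60 + [0] * 40,
})
print(round(gap, 2))  # 0.2 -- a parity violation, but not yet evidence of causation
```

Note that the function only summarizes outcomes; it carries no information about why the gap exists, which is exactly the limitation discussed next.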

The Pitfall of Confounding

The primary reason statistical metrics can mislead us is the presence of confounders. A confounder is a hidden variable that influences both the protected attribute and the outcome, creating a "spurious" or misleading correlation between them.

To see how confounding works in practice, consider a hiring algorithm where we observe that candidates with conservative political beliefs (A) are hired (Y) more often than candidates with liberal beliefs. This looks like discrimination. However, what if a third variable, Socio-Economic Status (Z), is at play?

  • High SES (Z) influences a person to hold more conservative political beliefs (A).
  • High SES (Z) also gives a person access to better training and educational opportunities, making them more likely to be hired (Y).

In this case, the observed correlation between political belief (A) and being hired (Y) isn't caused by the hiring algorithm discriminating based on political belief. It's caused by the confounding variable Z influencing both. An analysis based purely on statistical parity would incorrectly conclude that political belief is a driver of hiring decisions, potentially leading to a misguided intervention that fails to address the real root cause: disparities in socio-economic status. Simply measuring the statistical relationship P(Y|A)—the probability of being hired given a political belief—gives a biased and incorrect conclusion about fairness.
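To make the confounding story concrete, here is a minimal simulation of the scenario above. All probabilities are invented for illustration: SES (Z) drives both political belief (A) and hiring (Y), and there is no A to Y link at all, yet the observed hiring rates differ sharply by belief:

```python
import random

random.seed(0)

def simulate(n=100_000):
    """Toy confounded model: Z causes both A and Y; no direct A -> Y effect."""
    hired_given_a = {0: [0, 0], 1: [0, 0]}        # a -> [hired count, total count]
    for _ in range(n):
        z = random.random() < 0.5                  # high SES (assumed base rate)
        a = random.random() < (0.8 if z else 0.2)  # conservative belief, driven by Z
        y = random.random() < (0.7 if z else 0.3)  # hired, driven ONLY by Z
        hired_given_a[a][0] += y
        hired_given_a[a][1] += 1
    return {a: hired / total for a, (hired, total) in hired_given_a.items()}

rates = simulate()
print(rates)  # P(Y=1|A=1) is near 0.62, P(Y=1|A=0) near 0.38, with no causal link
```

The analytic values follow from the assumed probabilities: P(Y=1|A=1) = 0.62 and P(Y=1|A=0) = 0.38, a 24-point gap created entirely by Z.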

Because hidden factors can easily mislead us, we need a formal language to map out and reason about these complex relationships.

  3. Thinking in Causes: A New Language for Fairness

Causal inference provides the tools to move beyond simple observation (seeing) to active intervention (doing). The primary tool for mapping our assumptions about how the world works is the Directed Acyclic Graph (DAG). A DAG is a visual map of causal relationships.

Deconstructing a DAG

A DAG has three core components that help us tell a clear causal story:

  • Nodes (Vertices): Represent variables (e.g., Gender, Education, Hiring Decision).
  • Edges (Arrows): Represent direct causal relationships. An arrow from A to B means A causes B.
  • Acyclic Property: The arrows never form a closed loop, meaning a variable cannot be its own cause.

Visualize a Causal Story

Let's visualize a plausible causal story for a hiring decision using a simple DAG:

          Gender (A)
           /      \
          v        v
    Education   Prior Salary
        |            |
        v            v
     Skills ----> Hiring Decision (Y)

This graph encodes several assumptions: we believe gender might influence educational choices and prior salary, and that education (via skills) and prior salary in turn influence the hiring decision. Crucially, there is no direct arrow from Gender to Hiring Decision.
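This structure can also be written down directly in code. The sketch below encodes the assumed edge set from the story above as an adjacency mapping and verifies the acyclic property with a depth-first search for back-edges:

```python
# A DAG as an adjacency mapping: node -> list of nodes it directly causes.
# The edges mirror the hiring story in the text; they are modeling assumptions.
dag = {
    "Gender":          ["Education", "Prior Salary"],
    "Education":       ["Skills"],
    "Prior Salary":    ["Hiring Decision"],
    "Skills":          ["Hiring Decision"],
    "Hiring Decision": [],
}

def is_acyclic(graph):
    """Depth-first search; a 'gray' node seen twice means a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}
    def visit(v):
        color[v] = GRAY
        for w in graph[v]:
            if color[w] == GRAY:                      # back-edge found
                return False
            if color[w] == WHITE and not visit(w):
                return False
        color[v] = BLACK
        return True
    return all(visit(v) for v in graph if color[v] == WHITE)

print(is_acyclic(dag))  # True: no variable is its own cause
```

Notice there is no "Gender" entry in any other node's edge list pointing back, and no direct edge from Gender to Hiring Decision, matching the assumptions in the graph.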

The Power of Intervention

Herein lies the core difference between correlation and causation.

  • Correlation (P(Y|A)): This is the observational probability. It answers: "Among people with attribute A, what percentage have outcome Y?" This is what we see in the raw data.
  • Causation (P(Y|do(A))): This is the interventional probability. It answers: "If we could hypothetically change a person's attribute to A, what would be the probability of outcome Y?"

The do-operator represents a hypothetical 'surgical intervention.' It's not just observing people who happen to have attribute A; it's asking what would happen if we could change an individual's attribute to A while severing all the causal arrows that normally point to it. This allows us to isolate the downstream effect of the attribute itself, which is the essence of a true causal question.
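A small simulation makes the seeing/doing gap tangible. Reusing the confounded model from earlier (invented probabilities, where SES Z drives both A and Y and there is no A to Y arrow), conditioning on A shows a large gap, while intervening with do(A), which severs the Z to A arrow, shows none:

```python
import random

random.seed(1)

def sample_y(z):
    return random.random() < (0.7 if z else 0.3)    # hiring depends only on SES

def p_y_given_a(a, n=100_000):
    """Observational: restrict to people who happen to have A = a."""
    hits = total = 0
    for _ in range(n):
        z = random.random() < 0.5
        if (random.random() < (0.8 if z else 0.2)) == a:
            total += 1
            hits += sample_y(z)
    return hits / total

def p_y_do_a(a, n=100_000):
    """Interventional: 'surgically' set A = a, cutting the Z -> A arrow.
    Background factors (z) still vary; Y has no A input in this model."""
    hits = 0
    for _ in range(n):
        z = random.random() < 0.5
        hits += sample_y(z)
    return hits / n

print(p_y_given_a(True), p_y_given_a(False))  # roughly 0.62 vs 0.38: a gap
print(p_y_do_a(True), p_y_do_a(False))        # roughly 0.50 vs 0.50: no effect
```

The interventional distribution is flat across A precisely because, once the incoming arrow to A is severed, nothing downstream of A changes in this model.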

With this causal language, we can now dissect a statistical disparity and trace the specific pathways through which discrimination operates.

  4. Disentangling Discrimination: Fair and Unfair Pathways

One of the most powerful features of using causal models is the ability to analyze the different causal paths through which a protected attribute influences an outcome. This allows us to distinguish how an effect happens, not just whether it happens.

Defining Causal Paths

We can break down the influence of a protected attribute into three main types of effects.

  • Total Effect (TE): The overall impact of changing a protected attribute on an outcome, considering all causal paths combined. It measures the total disparity but doesn't explain its sources, and can be misleading if it mixes fair and unfair influences.
  • Direct Effect (DE): The effect of the protected attribute on the outcome that does not pass through any intermediate variable, i.e., the causal path A → Y. This is often considered the clearest form of direct discrimination and is typically impermissible.
  • Indirect Effect (IE): The effect of the protected attribute that flows through one or more intermediate variables (mediators), i.e., paths like A → Mediator → Y. The fairness of this effect depends entirely on the nature of the mediator variable.
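In a linear structural model these three effects have a simple closed form, which the sketch below illustrates. The model and coefficients are assumptions chosen for illustration, not numbers from the text:

```python
# Assumed linear structural model:
#   Mediator M = b*A + noise
#   Outcome  Y = c*A + d*M + noise
b, c, d = 0.5, 0.2, 0.8   # illustrative coefficients

direct_effect   = c        # path A -> Y
indirect_effect = b * d    # path A -> M -> Y (effect on M, scaled by M's effect on Y)
total_effect    = direct_effect + indirect_effect

print(round(direct_effect, 3), round(indirect_effect, 3), round(total_effect, 3))
# 0.2 0.4 0.6
```

Here most of the total disparity (0.4 of 0.6) flows through the mediator, so whether the model is "fair" hinges on whether M is a legitimate qualification or an unfair proxy, exactly the normative question discussed next.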

The Importance of Indirect Paths

The most critical insight from path analysis is that not all indirect paths are unfair. The legitimacy of an indirect effect depends entirely on the mediating variable. We must distinguish between:

  • Unfair Proxy/Redlining Variables: These are mediators that simply carry forward or amplify discrimination. For example, if a hiring process used a candidate's hobby as a factor (Gender → Hobby → Hiring), and certain hobbies were strongly associated with a specific gender, the hobby would be acting as an unfair proxy for gender.
  • Fair Explaining Variables: These are mediators that represent legitimate qualifications or justifications for a decision. For example, in the path Gender → Education Level → Hiring, if a specific level of education is a genuine requirement for the job, this path may be considered acceptable or fair, even if it contributes to a statistical disparity between groups.

Crucially, a causal graph does not tell us which pathways are fair; it only gives us a map of the mechanisms at play. The determination of whether a mediator like 'Education Level' is a 'fair explaining variable' or an unfair structural pathway is a normative, context-dependent decision that requires domain expertise and ethical deliberation. The power of causal fairness is that it forces this debate into the open, making our assumptions about legitimacy explicit.

While analyzing these pathways helps us understand group-level dynamics, fairness often comes down to what happens to a single individual.

  5. The Ultimate Question: Counterfactual Fairness

We now arrive at the most precise and challenging formulation of fairness, which probes the heart of individual discrimination. It is the counterfactual question, embodied in the legal 'but-for' test:

"For this specific person, would the outcome have been different but for their protected attribute?"

This question asks us to imagine a parallel world—one where an individual's protected attribute is changed, but all other independent background factors remain the same.

The Formal Definition

This idea is formalized in the definition of Counterfactual Fairness.

Predictor Ŷ is counterfactually fair if, for any individual, the prediction would be the same had their protected attribute A been different, while holding constant all factors not causally dependent on A.

The key insight here is that counterfactual fairness is an individual-level definition. In essence, this means we imagine a world where only the protected attribute and its direct and indirect consequences are changed, while all independent background factors (e.g., socioeconomic factors that are not themselves caused by the protected attribute in our model) are held constant. This stands in sharp contrast to statistical parity, which is a group-level metric that can only compare aggregate statistics.
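This definition can be operationalized in a toy structural model: hold an individual's exogenous background factors fixed, flip the protected attribute, and check whether the prediction changes. Everything below (the model's structure, its coefficients, and both predictors) is an illustrative assumption:

```python
import random

random.seed(2)

def world(a, u):
    """Tiny structural model. u holds the individual's exogenous background
    factors, which stay fixed across counterfactual worlds."""
    education = 0.5 * a + u["edu"]   # a descendant of A in this assumed model
    grit      = u["grit"]             # causally independent of A
    return {"education": education, "grit": grit}

def predictor_on_descendant(feats):   # uses a variable downstream of A
    return feats["education"] > 0.6

def predictor_on_independent(feats):  # uses a variable independent of A
    return feats["grit"] > 0.5

def is_counterfactually_fair(predictor, n=1000):
    """For each simulated individual (fixed u), flip A and require the
    prediction to be identical in both worlds."""
    for _ in range(n):
        u = {"edu": random.random() * 0.5, "grit": random.random()}
        if predictor(world(0, u)) != predictor(world(1, u)):
            return False
    return True

print(is_counterfactually_fair(predictor_on_independent))  # True
print(is_counterfactually_fair(predictor_on_descendant))   # False
```

The second predictor fails because education inherits A's influence: flipping A moves the feature across the decision threshold for many individuals even though their background u is unchanged.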

This powerful, if challenging, concept brings our analysis to its ultimate destination: moving from the broad statistical patterns of groups to the profound causal questions of individual justice.

  6. Conclusion: The Power of Asking 'Why'

Our journey has taken us from the simple observation of statistical disparities to a more nuanced, causal understanding of fairness. We started with correlation-based metrics like statistical parity, which can tell us what is happening but not why. We then introduced the language of causal models—specifically Directed Acyclic Graphs—to map out our assumptions about the world. Using this language, we learned to disentangle discrimination by analyzing direct and indirect pathways and finally arrived at the counterfactual question, which asks about fairness for a single individual.

The core message is that discrimination is a causal phenomenon, and ignoring this leads to superficial analysis. Causal models do not give us easy answers about what is fair. Instead, they provide a formal language to make our societal and ethical assumptions about fairness explicit, transparent, and debatable. They force us to graduate from simply measuring disparities to rigorously interrogating the mechanisms that create them.

By moving beyond 'what' and learning to ask 'why,' we can begin to build systems that are not just statistically balanced, but meaningfully fair.

This educational content was created with the assistance of AI tools including Claude, Gemini, and NotebookLM.