Systems Over Blame: Why Asking 'What Did the System Allow?' Produces Better Learning Than 'Who Messed Up?'

The instinct after a failure is to find the person responsible. The instinct is wrong. People operate within systems, and the system's design determines which errors are possible, which are likely, and which are inevitable.

8 min read · for the Failure Autopsy tool

A critical client deliverable goes out with a significant error — wrong figures in the executive summary. The immediate response is predictable: who was responsible? Who approved it? Whose name was on the final sign-off? The person is identified, the conversation is uncomfortable, and the conclusion is clear: they need to be more careful. A note is added to the process: “double-check all figures before sending.”

Three months later, a similar error occurs — different person, different client, same type of mistake. The investigation follows the same pattern. A person is blamed. A reminder is issued. Nothing structural changes. The system that produced the error — the time pressure, the unclear ownership, the absence of a verification step in the workflow — remains intact, quietly waiting to produce the next failure.

The research

Sidney Dekker, a professor of human factors and system safety, articulated the fundamental problem with person-centred failure analysis in The Field Guide to Understanding Human Error (2006). He distinguished between the “old view” of human error — which treats errors as the cause of failures, originating in individual carelessness, incompetence, or negligence — and the “new view,” which treats errors as symptoms of deeper systemic issues. In the new view, asking “who made the mistake?” is the wrong starting question. The right question is “what conditions made this mistake possible, likely, or inevitable?”

Dekker’s analysis draws on decades of accident investigation in aviation, healthcare, and nuclear power — domains where the consequences of failure are catastrophic and the incentive to understand root causes is therefore highest. In every case, he found the same pattern: failures were preceded by a chain of conditions — time pressure, ambiguous procedures, conflicting priorities, normalised deviance — that made the eventual error almost inevitable. The person at the sharp end of the chain — the pilot, the surgeon, the operator — was the last link, not the cause.

James Reason, a psychologist at the University of Manchester, developed the “Swiss cheese model” of system failure in his 1990 book Human Error. In Reason’s model, every system has multiple layers of defence against failure — procedures, checks, training, supervision. Each layer has holes — weaknesses, gaps, latent conditions. A failure reaches the end user only when the holes in multiple layers align, allowing an error to pass through every defence simultaneously. The question after a failure is not “who put the hole in the cheese?” It’s “why were the holes aligned?”
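To make the alignment idea concrete, here is a rough numerical sketch (not drawn from Reason's book, and the per-layer miss rates below are invented for illustration): if each layer of defence independently fails to catch an error some fraction of the time, the chance of the error escaping every layer is the product of those fractions, which is why weakening a single layer can multiply the overall failure rate.

```python
from math import prod

# Illustrative only: assumed per-layer "hole" probabilities, i.e. the chance
# that a given layer fails to catch an error (the numbers are made up).
layers = {
    "author self-review":   0.30,
    "peer review":          0.10,
    "figure verification":  0.10,
    "final sign-off":       0.20,
}

# An error reaches the client only if every layer misses it (the holes align).
p_escape = prod(layers.values())
print(f"Baseline escape probability: {p_escape:.4%}")  # 0.0600%

# Weaken one layer (say, deadline pressure skips figure verification)
# and the escape rate jumps by an order of magnitude.
layers["figure verification"] = 1.0
print(f"Without verification: {prod(layers.values()):.4%}")  # 0.6000%
```

The point of the arithmetic is not precision. It is that no single layer's hole explains the escape; only the alignment does.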

This reframing is not a softening of accountability. It’s a redirection of analytical effort toward the factors that produce systemic change. If the error was caused by a person’s carelessness, the fix is “that person should be more careful” — a fix that depends on sustained individual vigilance and provides no structural improvement. If the error was caused by time pressure, ambiguous handoff protocols, and the absence of a verification step, the fix is to restructure the workflow — a fix that prevents the error regardless of who is performing the task.

The mechanism

Amy Edmondson, at Harvard Business School, published a study in Administrative Science Quarterly in 1999 that connected failure analysis directly to team learning. She found that teams with higher psychological safety (the shared belief that one can admit a mistake or raise a concern without fear of punishment or humiliation) reported more errors. Not because they made more errors, but because they were willing to surface them. Teams with low psychological safety suppressed error reports, which meant the same mistakes recurred because the organisation never learned about them.

Blame-centred failure analysis directly undermines psychological safety. When the response to an error is to identify and punish the individual responsible, the rational response for every other team member is to hide their own errors, avoid reporting near-misses, and resist the transparency that organisational learning requires. The blame produces exactly one outcome, a chastened individual, while destroying the conditions under which dozens of future errors could have been surfaced and prevented.

Charles Perrow, in Normal Accidents (1984), took the argument further, arguing that in complex, tightly coupled systems, accidents are not aberrations — they are inevitable features of the system’s design. When components are interdependent and processes are time-sensitive, small failures cascade in unpredictable ways. Blaming an individual for a cascade failure is like blaming a specific domino for falling — technically accurate, structurally meaningless.

Peter Senge, in The Fifth Discipline (1990), framed the distinction as a difference between “event thinking” and “systems thinking.” Event thinking identifies discrete causes for discrete effects: this person made this mistake, causing this outcome. Systems thinking identifies patterns, feedback loops, and structural conditions that produce classes of outcomes. Event thinking produces blame. Systems thinking produces design changes.

The question “who messed up?” produces one answer: a name. The question “what did the system allow?” produces something far more valuable: a list of things you can actually change.

The practical implications

The three-line structure forces systems-level thinking. “What happened (facts only, no blame)” strips the narrative of emotional charge and anchors the analysis in observable events. “What the system or process allowed to go wrong” redirects attention from the person to the conditions — the handoff that was unclear, the check that was missing, the deadline that compressed quality assurance. “One specific thing I’d change” prevents the analysis from becoming an abstract catalogue of systemic issues and focuses it on the single highest-impact intervention.
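As a sketch of how the three lines might be captured in practice (the field names and example text below are illustrative, not taken from the Failure Autopsy tool itself):

```python
from dataclasses import dataclass

@dataclass
class FailureAutopsy:
    """One entry, three lines: the facts, the systemic conditions, one change."""
    what_happened: str         # observable events only, no names, no blame
    what_system_allowed: str   # the conditions that made the error possible
    one_change: str            # the single highest-impact intervention

autopsy = FailureAutopsy(
    what_happened="Executive summary sent to client with figures from an outdated draft.",
    what_system_allowed="No verification step between final edits and sign-off; "
                        "the deadline compressed review to minutes.",
    one_change="Add a figures-match-source check to the sign-off checklist.",
)
```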

The one-change rule is deliberate. The temptation after a failure is to redesign everything — add ten new checks, rewrite the entire process, implement a new tool. This impulse feels thorough but usually produces process bloat without proportional improvement. Dekker’s research shows that the most effective post-failure interventions are targeted: identify the single condition that contributed most to this specific failure, and change that condition. If a similar failure occurs again, the next autopsy will surface the next-most-important condition.
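A minimal sketch of how the one-change rule plays out, using invented conditions and rough, subjective weightings: list the contributing conditions the autopsy surfaced, judge how much each contributed to this specific failure, and commit to changing only the top one.

```python
# Hypothetical contributing conditions with rough, subjective impact scores (0 to 1).
conditions = {
    "no verification step before sign-off": 0.6,
    "ambiguous ownership of the executive summary": 0.3,
    "deadline compressed review time": 0.1,
}

# The one-change rule: fix only the condition that contributed most this time.
one_change = max(conditions, key=conditions.get)
print(f"This autopsy's single change: {one_change}")
```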

Timing matters. When the wound is raw — when emotions are high, when blame-seeking is active, when the person who made the error is still in distress — the autopsy will be contaminated by defensiveness and recrimination. Allowing a cooling period of 24 to 48 hours before conducting the analysis produces more honest, more systemic, and more usable findings. The delay gives analysis time to return after the emotional response has had its say.

The bigger picture

Blame is satisfying in a way that systems analysis is not. It provides a clean narrative — person, mistake, consequence — that resolves the uncomfortable ambiguity of failure. Systems analysis produces messier answers: multiple contributing factors, shared responsibility, structural conditions that implicate the organisation rather than an individual. This messiness is harder to communicate, harder to act on, and harder to feel good about. It is also the only analysis that produces lasting change.

Organisations that learn from failure — that genuinely improve rather than cycling through the same errors with different people — have made a specific structural commitment: they’ve decided that understanding why something went wrong matters more than identifying who to blame. This commitment is not natural. It runs against deep psychological impulses toward attribution and accountability. It requires deliberate, sustained effort to maintain.

The failure autopsy is the individual practice version of this organisational commitment. Three lines. No blame. One change. It won’t transform a blame culture on its own. But it will transform the quality of learning you extract from every setback — and over time, the person who learns from systems rather than from scapegoats builds a fundamentally different, and fundamentally more effective, approach to everything that goes wrong.

References

  1. Dekker, S. (2006). The Field Guide to Understanding Human Error. Ashgate Publishing.
  2. Reason, J. (1990). Human Error. Cambridge University Press.
  3. Edmondson, A. (1999). Psychological safety and learning behavior in work teams. Administrative Science Quarterly, 44(2), 350–383.
  4. Perrow, C. (1984). Normal Accidents: Living with High-Risk Technologies. Basic Books.
  5. Senge, P. M. (1990). The Fifth Discipline: The Art and Practice of the Learning Organization. Doubleday.