Safety-I and Safety-II

(c) Erik Hollnagel, 2020


Safety is traditionally defined as a condition where the number of adverse outcomes is as low as possible (Safety-I). From a Safety-I perspective, the purpose of safety management is to make sure that the number of accidents and incidents is kept as low as possible, or as low as is reasonably practicable. This means that safety management starts from the manifestations of the absence of safety and that - paradoxically - safety is measured by counting the number of cases where it fails rather than by the number of cases where it succeeds. It also means that safety is studied in situations where there clearly has been a lack of safety. (All other sciences study situations where the central phenomenon is present, not where it is absent.) This unavoidably leads to a reactive approach based on responding to what goes wrong or what is identified as a risk - as something that could go wrong.

Focusing on what goes well, rather than on what goes wrong, changes the definition of safety from ‘avoiding that something goes wrong’ to ‘ensuring that everything goes well’. More precisely, Safety-II is a condition where the number of intended and acceptable outcomes (meaning everyday work) is as high as possible. It is the ability to succeed under varying conditions. From a Safety-II perspective, the purpose of safety management is to ensure that as much as possible goes well, in the sense that everyday work achieves its objectives. This means that safety is managed by what it achieves (acceptable outcomes, things that go well), and that it likewise is measured by counting the number of cases where things go well. In order to do this, safety management cannot be reactive only; it must also be proactive. But it must be proactive with regard to how actions go well, to everyday acceptable performance, rather than with regard to how they can fail, as traditional risk analysis does.

A short note on Safety-I and Safety-II can be downloaded here.

The Pedigree of Safety-I and Safety-II

The first public description of the Safety-I / Safety-II distinction coincided with the launch of the website for the Resilient Health Care Net on 18 August 2011. It was followed soon after by an article in the programme note for Sikkerhetsdagene, a safety conference held in Trondheim, Norway, on 10-11 October 2011. Prior to that, the first formal description was in a proposal for a research agenda, submitted on 8 June 2011. (The proposal, which is in Danish, can be found here.)

The idea of contrasting two approaches to safety was itself inspired by a similar debate that took place within the field of Human Reliability Analysis (HRA). In 1990 the HRA community was seriously shaken by a concise exposure of the lack of substance in the commonly used HRA approaches (Dougherty, E. M. Jr. (1990). Human Reliability Analysis - where shouldst thou turn? Reliability Engineering and System Safety, 29, 283-299). The article made it clear that HRA needed a change, and emphasised that by making a distinction between the current approach, called first-generation HRA, and the needed replacement, called second-generation HRA.

Another well-known use of this rhetorical device is the juxtaposition of Theory X and Theory Y in Douglas McGregor’s 1960 book The Human Side of Enterprise. The juxtaposition was used to encapsulate a fundamental distinction between two different management styles (authoritarian and participative, respectively), which turned out to be very influential. And there are, of course, even more famous examples, such as Galileo’s Dialogue Concerning the Two Chief World Systems and the philosophical dialogue in the works of Plato.

Precursors

The essence of Safety-II is the idea that we should focus on what happens and on how work is done, rather than on 'errors' and how something can go wrong. This is not new. In my own work, the first discussion of that can be found in a technical note from 1983. The discussion focused on the role of 'human error' in man-machine interaction (what is now called human-machine interaction). The main points were the following:

(There is no) need for a specific theory of “Human Error”, since the observed discrepancies can instead be explained by referring to, for instance, a performance theory. That may furthermore have the virtue of focusing on the situation and context in which the MMS (man-machine system) must function, and the interaction between its inherent characteristics and the environmental constraints.

Consequently, I do not think that there can be a specific theory of “Human Error”, nor that there is any need for it. This is not because each error, as a “something” requiring an explanation, is unique, but precisely because it is not, i.e., because it is one out of several possible causes. Instead we should develop a theory of human action, including a theory of decision making, which may be used as a basis for explaining any observed mismatch. A theory of action must include an account of performance variability, and by that also the cases where “Human Error” is invoked as a cause.

... we must be concerned with the mechanisms that are behind normal action. If we are going to use the term psychological mechanisms at all, we should refer to “faults” in the functioning of psychological mechanisms rather than “error producing mechanisms”. We must not forget that in a theory of action, the very same mechanisms must also account for the correct performance, which is the rule rather than the exception. Inventing separate mechanisms for every single kind of “Human Error” may be great fun, but is not very sensible from a scientific point of view.

To conclude, a theory of error must be a theory of the interaction between human performance variability and the situational constraints. (Emphasis added.)

It has been suggested that the Safety-II ideas can also be found in Cook, Woods & Miller (1998). This document, which is a report from an early workshop on patient safety, highlighted the important relationship between 'first stories' and 'second stories'. The 'second story' captures how the system usually works to manage risks but sometimes fails. It looks at:

"the multiple subtle vulnerabilities of the larger system which contribute to failures, detecting the adaptations human practitioners develop to try to cope with or guard against these vulnerabilities, and capturing the ways in which success and failure are closely related."

The emphasis on understanding how the system "works to manage risks" is, however, clearly a Safety-I perspective. The report as a whole focused on what health care could learn from the experience of other safety-critical industries, without defining the meaning of safety. There was obviously no need for that, since it was tacitly accepted, then as now, that safety was the 'freedom from risks and harm'.

Looking for contrasts or looking for nuances?

A Safety-I perspective is binary and based on dichotomies or contrasts, of which the most glaring is safe versus unsafe. While it may be cognitively convenient (for some, at least) to rely on such a simplified world view, it is not very useful in practice. There are, of course, many situations or conditions that can usefully be categorised using a binary distinction, but there are far more that cannot. A Safety-II perspective recognises the nuances and acknowledges that both events and outcomes are better described in terms of continua - as continuous distributions rather than as discrete events. The number of ways in which something can go well is far larger than the number of ways in which it can fail.

References

Cook, R. I., Woods, D. D. & Miller, C. (1998). A Tale of Two Stories: Contrasting Views of Patient Safety. Report from a Workshop on Assembling the Scientific Basis for Progress on Patient Safety. National Patient Safety Foundation at the AMA.

Hollnagel, E. (1983). Position paper on human error. Responses to Queries from the Program Committee. NATO Conference on Human Error, Bellagio, Italy, September 5-9.