(c) Erik Hollnagel, 2020
The concept of the Resilience Assessment Grid has been revised and developed into the concept of the Systemic Potentials Management (SPM). The SPM is probably not the last stage in this development.
The main consequence of this revision is that the four potentials (to respond, to monitor, to learn, and to anticipate) are now seen in a wider context than just resilience. They are rather the basic functions, capabilities, or potentials that a living system must have. This is based on an idea proposed by Donald MacKay in 1956, of which more may be said later.
For the time being a presentation of the SPM can be found here.
This page provides a brief overview of the RAG. The concept of the Resilience Assessment Grid has now been replaced by the concept of the Systemic Potentials Management described above.
Introduction
A system is said to perform in a manner that is resilient when it can sustain required operations under both expected and unexpected conditions by adjusting its functioning prior to, during, or following events (changes, disturbances, and opportunities). Whereas current safety management (Safety-I) focuses on reducing the number of adverse outcomes by preventing adverse events, Resilience Engineering (RE) looks for ways to enhance the ability of systems to succeed under varying conditions (Safety-II). It is therefore necessary to understand what this ability really means, since it clearly is not satisfactory just to call it ‘resilience’.
The Four Basic Potentials for Resilient Performance
The definition of resilient performance can be made more concrete by considering what makes resilient performance possible. Since resilient performance is possible for most, if not all, systems, the explanation must refer to something that is independent of any specific domain. Resilience engineering has proposed the following four basic potentials: the potential to respond, the potential to monitor, the potential to learn, and the potential to anticipate.
The Interdependence of the Potentials
The four potentials are clearly not independent of each other. For example, the potential to respond can benefit from, and perhaps even requires, the potential to monitor. Similarly, the potential to learn is needed to improve the potentials to monitor and to respond. The four potentials can be seen as functions, and understanding how these functions are coupled is obviously essential for managing them. In each specific case this requires a description of the interdependence of the potentials that considers the nature of the activities and the operating conditions. Since the potentials can be seen as functions, it is possible to use the FRAM to do that.
Assessing the Potentials for Resilient Performance
Since resilience refers to something that the system does rather than to something that the system has, it is not meaningful to propose a single or simple ‘measurement of resilience’ or to refer to ‘levels of resilience’. But it is possible to consider the extent to which each of the four potentials that provide the basis for resilient performance is present in or supported by the system. The RAG uses four sets of questions to determine how well a system performs on each of the four basic potentials. Each question is answered using a Likert-type scale, and taken together the answers provide a profile of the system's potentials for resilient performance. (The questions can of course be answered using other sociometric techniques.) Resilience engineering proposes a set of generic questions for each potential. These questions must, however, be tailored to the specific application before being used.
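As an illustration only, the step from individual answers to a profile could be sketched as follows in Python. The question identifiers, the 1-5 numeric coding of the Likert-type scale, and the use of the median to combine respondents are all assumptions made for this sketch; they are not prescribed by the RAG.

```python
from statistics import median

# The four basic potentials named in the text.
POTENTIALS = ("respond", "monitor", "learn", "anticipate")


def build_profile(answers):
    """Combine respondents' ratings into one value per question.

    answers: {potential: {question_id: [ratings, one per respondent]}}
    Ratings are assumed to be coded 1 (lowest) to 5 (highest).
    Returns {potential: {question_id: median rating}}.
    """
    profile = {}
    for potential in POTENTIALS:
        questions = answers.get(potential, {})
        # The median is used because Likert-type answers are ordinal.
        profile[potential] = {
            qid: median(ratings) for qid, ratings in questions.items() if ratings
        }
    return profile


# Made-up answers from three hypothetical respondents, for illustration only.
example_answers = {
    "learn": {"L1": [3, 4, 4], "L2": [2, 2, 5]},
    "monitor": {"M1": [5, 4, 4]},
}
example_profile = build_profile(example_answers)
```

The result is deliberately kept as one value per question rather than one number per potential, in line with the point above that a single 'measurement of resilience' is not meaningful.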
The RAG profile, i.e., the answers to the four sets of questions, does not provide an absolute rating of a system's potentials for resilient performance. But several RAG profiles can be compared to look for differences, which in turn can be used as the basis for managing the system and following the consequences of planned interventions. The differences are easy to see if the answers are rendered graphically using, e.g., a radar chart.
One way of doing that is to use the RAG repeatedly for the same group of respondents, to see if there are any changes in the answers they give. A (fictive) example is provided by the two profiles (for the ability to learn) shown below. The radar chart makes it easy to see where changes have happened, and also to decide where changes should happen.
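A minimal sketch of such a comparison, in Python, is given here. It assumes that each profile is a mapping from a question identifier to a rating coded 1-5; the identifiers, the ratings, and the threshold for a noteworthy change are made up for illustration and are not part of the RAG.

```python
def compare_profiles(before, after):
    """Return the change per question (after minus before).

    Only questions present in both profiles are compared, since the
    comparison assumes the same tailored questions were used both times.
    """
    return {qid: after[qid] - before[qid] for qid in before if qid in after}


def changed_questions(differences, threshold=1):
    """List questions whose rating moved by at least `threshold`,
    ordered from the largest decline to the largest improvement."""
    changed = [qid for qid, delta in differences.items() if abs(delta) >= threshold]
    return sorted(changed, key=lambda qid: differences[qid])


# Fictitious ratings for the potential to learn at two points in time.
learn_first = {"L1": 2, "L2": 4, "L3": 3}
learn_second = {"L1": 4, "L2": 4, "L3": 2}

differences = compare_profiles(learn_first, learn_second)
to_review = changed_questions(differences)
```

Questions that declined appear first in the list, which may help when deciding where changes should happen. The same per-question values are what a radar chart would display.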
Practical guidance for using the RAG
There are five important points to remember when using the RAG.
The RAG in practice
The RAG has been used in practice in a number of cases in, e.g., railways, offshore operations, health care, and radiation protection. There is no comprehensive list of practical examples, but a Google Scholar search shows at least some of them. Interested readers are encouraged to do a search themselves (and share the results with me if possible). Some examples are provided here:
Rigaud, E. et al. Proposition of an organisational resilience assessment framework dedicated to railway traffic management
Aaen-Stockdale, C. (2014). Oil and gas, technology and humans: assessing the human factors of technological change. Ergonomics, 57(6), 956-957.
Ose, G. O., Ramstad, L. S., Steiro, T. J., & MARINTEK, T. Analysis of Resilience in Offshore Logistics and Emergency Response Using a Theoretically Based Tool.
Apneseth, K. (2010). Resilience in integrated planning. M.Sc. Thesis, Norwegian University of Science and Technology.
The Australian Radiation Protection and Nuclear Safety Agency (ARPANSA) has included the RAG as part of their holistic safety guidelines.
Ljungberg, D. & Lundh, V. (2013). Resilience Engineering within ATM - Development, adaption, and application of the Resilience Analysis Grid (RAG). University of Linköping, LiU-ITN-TEK-G--013/080--SE.
(Last update 2021-10-25. To be continued ...)
According to the conventional interpretation of safety, here called Safety-I, safety denotes a condition where as little as possible goes wrong. The focus of practical efforts, whether in management or analysis, is therefore on the occurrence of unacceptable outcomes and on how to reduce their number to an acceptable level, ideally zero. The emphasis is on how to manage safety eo ipso, as seen in the ubiquitous safety management systems (SMS).
This approach, however, leads to something of a paradox, since safety is in this way defined and measured more by its absence than by its presence, as noted by Reason (2000). According to a Safety-I perspective, an accident thus represents a situation or a condition where there is or was a lack of safety. This immediately raises the obvious question of how it is possible to learn about something if it is only studied in situations where it is not there. No known science can do that, except safety science. And furthermore, how is it possible to manage something that is not there? The simple answer is that it is impossible. The unacceptable outcomes that safety management focuses on are the results of something that happened in the past but does not happen any longer, and they can therefore not be managed. While you can manage a process, you cannot manage a product.
This paradox fortunately disappears in the view proposed by Safety-II, where safety is defined as a condition where as much as possible goes well. An acceptable outcome therefore represents conditions where safety is present rather than absent, and efforts are accordingly directed at understanding how this happens and how one can ensure that it will also happen in the future. Logically, if as much as possible goes well, then as little as possible goes wrong, since in practice something cannot go well and go wrong at the same time. A Safety-II approach therefore achieves the same objective as a Safety-I approach, but does so in a completely different way.
In Safety-II the concern is not to manage safety as a static outcome (safety as a noun), but to manage system performance safely, as a dynamic process (safely as an adverb). There is a crucial difference between managing safety and managing safely. The former represents a cost, since the purpose is to avoid something rather than to achieve something, while the latter represents an investment that directly contributes to productivity as well as increased revenue. It is therefore clearly more important and useful for a company to manage safely than to manage safety.
Since most work and most activities in practice go well, even though we fail to pay attention to them, there will also be more cases to study and learn from. Best of all, perhaps, is that there is no need to wait for something to happen, i.e., to fail or go wrong. Something is happening all the time; all we need to do is to pay attention to it.
Reason, J. (2000). Safety paradoxes and safety culture. Injury Control & Safety Promotion, 7(1), 3-14.