SMS Part 5: The Relationship Between Risk Controls and Your Safety Risk Management System

In this article we will begin to look at how to use mitigations — or risk controls — to reduce the risk associated with aviation safety hazards.

Last year, Aviation Maintenance Magazine published a series of four articles explaining how to establish and use a safety risk management (SRM) system to identify aviation safety hazards and assess them for risk. SRM is one of the key elements of a complete Safety Management System (SMS). This article assumes that you have some familiarity with the basic concepts of SMS that were covered in those articles. If you do not, then we recommend that you go back and read them (you can find all four on Aviation Maintenance Magazine’s website).

This year, we will guide you through the next steps of implementing an SMS. This month’s article focuses on basic concepts related to risk controls and how they relate to the work you did in recording your hazards and safety risk analyses.

Part of the SRM process for analyzing hazards (the process that we addressed in the past articles) involved assigning likelihood levels and consequence levels to each identified hazard. These levels help you place risks on a likelihood-consequence matrix, which in turn helps you identify which hazards need to have their risk levels reduced. Based on this matrix, there are two ways to reduce the risk associated with a hazard: you can reduce the likelihood that the hazard will occur, or you can reduce the consequence of the hazard in the event it occurs.
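To make the two options concrete, here is a minimal sketch of a likelihood-consequence matrix. The level names, the numeric values, the product-based score, and the band thresholds are all assumptions chosen for illustration; your own SRM procedures define the scales that actually apply.

```python
# Illustrative 5x5 likelihood-consequence matrix.
# Level names, numeric values, and thresholds are assumptions made for this
# sketch; substitute the scales defined in your own SRM procedures.
LIKELIHOOD = {"improbable": 1, "remote": 2, "occasional": 3, "probable": 4, "frequent": 5}
CONSEQUENCE = {"negligible": 1, "minor": 2, "major": 3, "hazardous": 4, "catastrophic": 5}


def risk_score(likelihood: str, consequence: str) -> int:
    """Combine the two levels into a single score (here, their product)."""
    return LIKELIHOOD[likelihood] * CONSEQUENCE[consequence]


def risk_band(score: int) -> str:
    """Map a score to a band using hypothetical thresholds."""
    if score >= 15:
        return "unacceptable"
    if score >= 6:
        return "acceptable with mitigation"
    return "acceptable"


# The same hazard before any control, and after each kind of control.
baseline = risk_score("probable", "hazardous")    # 4 * 4 = 16
less_likely = risk_score("remote", "hazardous")   # 2 * 4 = 8 (likelihood reduced)
less_severe = risk_score("probable", "minor")     # 4 * 2 = 8 (consequence reduced)

for label, score in [
    ("baseline", baseline),
    ("likelihood reduced", less_likely),
    ("consequence reduced", less_severe),
]:
    print(f"{label}: score {score}, {risk_band(score)}")
```

In this sketch, either approach moves the hazard out of the highest band. Which one you choose in practice depends on which kind of control is available and practical.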

These two concepts are not new to aviation; we have been using them for years. For example, an air carrier’s required inspection items are items for which a second inspection is necessary before the work is considered complete. The second inspection gives an independent inspector a second opportunity to look for flaws. This improves the likelihood that any existing flaws will be caught, which in turn decreases the likelihood that flaws remain in the completed work. This effort reduces the likelihood that the underlying hazard (the hazard or hazards for which the inspection was designed) will occur. Total risk, in this case, is reduced by reducing likelihood.

Another example can be found in the common practice of having duplicate or back-up systems where the systems are critical. Where there is an effective back-up system, the failure of the primary system will not lead to catastrophic results. Thus, the consequence of a failure is mitigated through the design functions that permit a duplicate or back-up system to operate in the event of a primary system failure.

Note that where a system is critical and it is impractical to have a duplicate or back-up of the system, it is normal to impose life limits that are designed to remove parts that are subject to wear or degradation before they could reasonably fail. This effort to decrease the likelihood of failure shows that factors like practicality can be weighed when choosing among risk controls, and that we can sometimes choose controls that improve our management of likelihood, consequence, or both in our efforts to reduce total risk.

Let’s apply these concepts to an example. Imagine a scenario where a repair station performs plating. One of the hazards associated with plating is hydrogen embrittlement. This should be recorded in the repair station’s database of hazards. Naturally, without any risk controls, the likelihood of hydrogen embrittlement might be high. Hydrogen embrittlement can cause a component to fracture at stresses less than those typically associated with the expected strength of the metal. In other words, the metal is more brittle than expected, which can lead to damage in the component. The potential safety consequence of such a hazard might be significant.

There are normal processes associated with common plating operations that are intended to reduce the likelihood of hydrogen embrittlement (such as heat treatment for thermal stress relief). The heat treatment adequately reduces the likelihood of the hydrogen embrittlement hazard, and this reduces the total risk associated with the hazard (typically reducing it to an acceptable level). Thus, heat treatment would be recorded as the risk control associated with the identified hazard of hydrogen embrittlement in your plating process.

Obviously, the risk control is valuable for preventing hydrogen embrittlement, but recording it in your hazard-risk-mitigation database has independent management value. If data later shows hydrogen embrittlement in plated components, this database allows you to focus on the risk controls that were intended to reduce that risk, and to analyze them for flaws.
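As a rough illustration of what such a record might look like, the sketch below stores the plating example as a small data structure and looks up the controls to review when embrittlement shows up in the data. The class names, field names, and the helper function are hypothetical; they are not drawn from any particular SMS software.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RiskControl:
    name: str
    reduces: str  # "likelihood", "consequence", or "both"


@dataclass
class HazardRecord:
    hazard: str
    process: str
    uncontrolled_likelihood: str  # level before any control is applied
    consequence: str
    controls: List[RiskControl] = field(default_factory=list)


# Hypothetical database holding the plating example from this article.
database = [
    HazardRecord(
        hazard="hydrogen embrittlement",
        process="plating",
        uncontrolled_likelihood="high",
        consequence="significant",
        controls=[RiskControl("post-plating heat treatment", "likelihood")],
    ),
]


def controls_to_review(records: List[HazardRecord], hazard: str) -> List[str]:
    """Return the names of the controls recorded against a given hazard."""
    return [c.name for r in records if r.hazard == hazard for c in r.controls]


# If embrittlement is later found in plated parts, start the review here.
print(controls_to_review(database, "hydrogen embrittlement"))
```

The point of the structure is the link between hazard and control: once that link is recorded, a finding about the hazard leads directly to the controls that were supposed to manage it.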

It also allows you to use your hazard-risk-mitigation database to perform change management. For example, if the repair station decides to replace the ovens used for heat treatment with new ovens, then the hazard-risk-mitigation database should show where those ovens are being used as hazard mitigations. This permits the change management reviewers to ensure that the new ovens will be adequate to mitigate each risk for which the old ovens had been identified.
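The change management lookup can be sketched as a simple reverse query: given a piece of equipment, list every hazard whose control depends on it. The record layout, the equipment identifier "oven-1", and every entry other than hydrogen embrittlement are hypothetical and included only to show the idea.

```python
# Hypothetical hazard-risk-mitigation records. Only the hydrogen embrittlement
# entry comes from the article; the others are invented for illustration.
records = [
    {
        "hazard": "hydrogen embrittlement",
        "control": "post-plating heat treatment",
        "equipment": ["oven-1"],
    },
    {
        "hazard": "residual stress after welding",
        "control": "thermal stress relief",
        "equipment": ["oven-1"],
    },
    {
        "hazard": "plating bath contamination",
        "control": "periodic bath analysis",
        "equipment": ["titration kit"],
    },
]


def hazards_affected_by(records, equipment_item):
    """Return the hazards whose recorded control relies on the given equipment."""
    return sorted({r["hazard"] for r in records if equipment_item in r["equipment"]})


# Before replacing oven-1, reviewers check the new oven against each of these.
for hazard in hazards_affected_by(records, "oven-1"):
    print(hazard)
```

Each hazard returned is a question the reviewers need to answer about the replacement equipment before the change is approved.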

By changing the likelihood level, consequence level, or both, the system can effectively reduce the risk posed by hazards. As we will see in future articles, recording these controls helps to drive an effective audit schedule and also serves as an effective and objective change management tool. How do we select controls that will effectively reduce likelihood, consequence, or both? Read our next article, where we will discuss strategies for identifying and selecting risk controls.

Want to learn more? We have been teaching classes in SMS elements, and we have advised aviation companies in multiple sectors on the development of SMS processes and systems. Give us a call or send us an email if we can help you with your SMS questions.