The Failure Mode Effect Analysis

Should You Use a Top-Down or Bottom-Up Approach?

A Failure Mode and Effects Analysis (FMEA) or a Failure Mode, Effects, and Criticality Analysis (FMECA) is a great tool to help develop an effective maintenance strategy. These tools enable organizations to dive down to the failure mode level and develop preventive, predictive, or other tactics to prevent or mitigate failures.

However, teams often struggle with how much detail an FMEA needs: what is enough, and what is too much or too little. The answer often comes with experience and depends on the level of risk an organization is willing to accept. The more detail, the more time, resources, and effort the FMEA will take, even if the consequences of failure are minimal.

The right level of detail depends largely on the type of FMEA: whether it is a Top-Down FMEA or a Bottom-Up FMEA.

What is a Top-Down FMEA?

A Top-Down FMEA is an approach that is focused more on the function of the system or asset. This approach looks at the function(s) of the system or asset and identifies the failures that would lead to a functional failure (i.e. the asset not doing what the user wants).

Let’s use a flashlight to demonstrate a Top-Down approach. The function of the flashlight is to provide continuous light. So how might the flashlight fail?

– Battery dies due to normal use

– On/Off button fails due to damage

– Light bulb burns out

– Light bulb fails due to physical impact

This will continue based on the experience of the group until the failure modes are identified. At the end of the analysis, the most probable failure modes have been identified and actions put in place to ensure the flashlight works.
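As a rough illustration, the Top-Down list above can be recorded as a small data structure keyed by the function. This is only a minimal Python sketch: the class and field names (FailureMode, TopDownFMEA, item, cause) are assumptions made for illustration and do not come from any FMEA standard or software tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FailureMode:
    """One way the function can be lost, as brainstormed by the group."""
    item: str   # part of the flashlight involved
    cause: str  # how or why it fails


@dataclass
class TopDownFMEA:
    """Hypothetical record: one function and the failures that defeat it."""
    asset: str
    function: str
    failure_modes: List[FailureMode] = field(default_factory=list)

    def add(self, item: str, cause: str) -> None:
        """Append one brainstormed failure to the analysis."""
        self.failure_modes.append(FailureMode(item, cause))


# The flashlight example from the text, starting from the function.
flashlight = TopDownFMEA(asset="flashlight", function="provide continuous light")
flashlight.add("battery", "dies due to normal use")
flashlight.add("on/off button", "fails due to damage")
flashlight.add("light bulb", "burns out")
flashlight.add("light bulb", "fails due to physical impact")

for fm in flashlight.failure_modes:
    print(f"{flashlight.function} <- {fm.item} {fm.cause}")
```

The analysis starts from the function and only records failures the group considers likely, which is why the list stays short.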

What is a Bottom-Up FMEA?

A Bottom-Up FMEA is an approach that is focused on each individual component within the system or asset and how each component will fail. Let’s go back to the example of a flashlight. The same function exists, but the FMEA will be different. First, we break the flashlight down into its individual components, such as:

– battery spring

– battery

– battery cover

– switch

– housing

This continues for all the components that make up the flashlight. Once the components are listed, every failure of each component is listed, such as:

– battery spring fails due to normal wear

– battery spring fails due to improper battery installation

– battery spring fails due to corrosion

– battery spring fails due to improper manufacturing

– battery fails due to normal discharge

– battery fails due to improper storage

– battery fails due to improper installation

– battery fails due to improper manufacturing.

As you can see, the level of detail in a Bottom-Up approach is significantly greater than in the Top-Down approach.
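To make the difference in volume concrete, the Bottom-Up worksheet can be sketched as a mapping from component to failure causes. This is a minimal Python sketch using only the two components detailed above; the names bottom_up, count_entries, and to_rows are hypothetical helpers, not part of any FMEA tool.

```python
# Hypothetical Bottom-Up worksheet: component -> list of failure causes.
bottom_up = {
    "battery spring": [
        "normal wear",
        "improper battery installation",
        "corrosion",
        "improper manufacturing",
    ],
    "battery": [
        "normal discharge",
        "improper storage",
        "improper installation",
        "improper manufacturing",
    ],
    # The battery cover, switch, housing, and every remaining
    # component would follow the same pattern.
}


def count_entries(worksheet):
    """Total number of (component, cause) lines in the worksheet."""
    return sum(len(causes) for causes in worksheet.values())


def to_rows(worksheet):
    """Flatten the worksheet into one readable line per failure."""
    return [
        f"{component} fails due to {cause}"
        for component, causes in worksheet.items()
        for cause in causes
    ]


for row in to_rows(bottom_up):
    print(row)

print(count_entries(bottom_up))  # 8 lines for just two components
```

Two components already produce eight lines, compared with four for the entire flashlight in the Top-Down list, and the worksheet keeps growing with every component added.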

Which Approach is Right for You?

Determining the right approach comes with experience, but it also comes down to the level of effort the organization is willing to expend for the return. A Top-Down approach will typically uncover 70-80% of the failure modes, while a Bottom-Up approach will uncover 99% of the failure modes. But what is the severity or probability of those missed failure modes?

The failure modes missed by a Top-Down approach would typically be added to the FMEA, and the maintenance strategy updated, as they are uncovered. With this method, the analysis can be completed quickly and still yield good results.

While the Bottom-Up approach would identify almost all risks, it comes with a significant resource requirement.

So which one do you choose? Well, it comes down to the potential consequences of the asset or system failure. A commercial aircraft will likely have a Bottom-Up approach, analyzing each rivet and bolt, as the consequences are significant, while a conveyor in a plant will likely have a Top-Down approach completed. Ultimately, it is up to the facilitator and organization to choose which method is right for them.
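The consequence-based rule of thumb above can be sketched as a simple screening function. This is a minimal Python sketch, not a prescribed method: the three-level Consequence scale, the threshold, and the ratings assigned to the two example assets are all assumptions for illustration, and the final call still rests with the facilitator and the organization.

```python
from enum import IntEnum


class Consequence(IntEnum):
    """Hypothetical three-level scale for the consequence of failure."""
    MINOR = 1
    MODERATE = 2
    SEVERE = 3


def suggest_approach(consequence, threshold=Consequence.SEVERE):
    """Suggest Bottom-Up only when the consequence justifies the effort."""
    return "Bottom-Up" if consequence >= threshold else "Top-Down"


# Assumed ratings for the two examples mentioned in the text.
assets = {
    "commercial aircraft": Consequence.SEVERE,
    "plant conveyor": Consequence.MODERATE,
}

for name, consequence in assets.items():
    print(f"{name}: {suggest_approach(consequence)}")
```

Lowering the threshold pushes more assets toward the Bottom-Up approach, which mirrors an organization accepting less risk in exchange for more analysis effort.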

If you have any questions about FMEA/FMECA and how it can help your organization, please feel free to contact me at jkovacevic@eruditio.com

I’m James Kovacevic

Principal Instructor at Eruditio

Where Education Meets Application

Comments

Keith Fong says

December 16, 2021 at 11:23 AM

I don’t think this Top Down vs Bottom Up is an either-or situation. You have to work it from both sides because you will identify risks more comprehensively. You’ll see risks that aren’t addressed Top Down that are raised from the Bottom Up evaluation and vice versa.

Something I found disappointing is that failure modes and failure causes are subtly conflated in the flashlight example. Failure modes are about the function. In practice, there are not many failure modes: no function, insufficient function, excessive function, intermittent function, degraded function, premature function, delayed function.

If you assess the failure modes of the “provide continuous light” function, you’ll have these:
No function–no light output
Insufficient function–light output too dim
Excessive function–light output too bright (in reality, probably not a failure mode)
Intermittent function–light output intermittent
Degraded function–light output decreasing over time/use too quickly
Premature function–I don’t think this is applicable
Delayed function–I don’t think this is applicable

The conflation of failure mode and causes is where the question is posed “How might the flashlight fail?” The jump from function “provide continuous light” to causes including dead batteries and burnt out light bulb doesn’t acknowledge the failure mode. These are causes specifically for the “No Function” failure mode. None of the other failure modes are considered.

As it happens, I have a flashlight sitting next to me that should “provide continuous light” but is actually functioning intermittently. The batteries are charged, the light bulb is good, and the switch definitely turns it off and often turns it on. Shaking it can make it work sometimes.

Something not captured in the article is interface risks. As Carl Carlson points out frequently, at least half of failures are at interfaces. The way to do a systematic and comprehensive analysis is to use a Deductive Interface Matrix that Michael Anleitner discusses in his book “The Power of Deduction: Failure Modes and Effects Analysis for Design.” It is a surprisingly simple format, but it provides really useful and workable structure for the analysis.

Nurettin says

December 19, 2021 at 11:49 PM

I totally agree with the comment that both approaches should be taken into consideration when evaluating safety-critical and function-critical systems and items, in order to evaluate all possibilities. In brief, a single approach may not be good enough to assess every aspect.
