Understanding FMEA Severity Risk – Part 1

The Seriousness of Consequences

Have you ever been in an FMEA meeting where the team did not agree on the severity rating? Understanding and correctly applying severity risk is an important part of FMEA application. This article discusses the subject of severity risk, including examples for design and process FMEAs, and offers a tip on what to do when the team does not agree on the severity risk rating.

“The only real mistake is the one from which we learn nothing.”
Henry Ford

Definition of “severity”

The Macmillan dictionary defines “severity” as “the seriousness of something bad or unpleasant.”

What is the definition of “Severity” in an FMEA?

“Severity” is a ranking number associated with the most serious effect for a given failure mode, based on the criteria from a severity scale. It is a relative ranking within the scope of the specific FMEA and is determined without regard to the likelihood of occurrence or detection.

How is “Severity” assessed in FMEAs?

Having identified the most serious effect for the failure mode, the FMEA team assesses the severity ranking. This is the severity of the effect of the failure mode, not the severity of the failure mode itself.

Using the agreed-upon severity scale, the team carefully reviews the criteria column to make this judgment. If the effect is well defined, the severity is easily established by reviewing the severity scale criteria.

For Design FMEAs, the team assesses the severity of the end effect at the system or end-user level.

For Process FMEAs, the team should consider the effect of the failure at the manufacturing or assembly level, as well as at the system or end-user level. The severity that is used in the Process FMEA is the higher of the two values.

What does a Severity Scale look like for Design FMEAs?

The following is an example of a severity scale for Design FMEAs. It is based on “Potential Failure Mode and Effects Analysis (FMEA) 4th Edition, 2008 Manual.”

What does a Severity Scale look like for Process FMEAs?

The following is an example of a severity scale for Process FMEAs. It is based on “Potential Failure Mode and Effects Analysis (FMEA) 4th Edition, 2008 Manual.”

What is an example of Severity in a Design FMEA?

[In this fictitious example, the Design FMEA team considers the severity of the end effect, using the criteria in the AIAG 4 severity scale, and enters it in the FMEA worksheet.]

Item: Power steering pump

Function: Delivers hydraulic power for steering by transforming oil pressure at inlet ([xx] psi) into higher oil pressure at outlet ([yy] psi) during engine idle speed

Failure Mode: Inadequate outlet pressure (less than [yy] psi)

Effect (Local: Pump): Low pressure fluid goes to steering gear
Effect (Next level: Steering Subsystem): Increased friction at steering gear
Effect (End user): Increased steering effort with potential accident during steering maneuvers

Severity: 10

What is an example of Severity in a Process FMEA?

[In this fictitious example, the Process FMEA team considers the severity of product/customer effect along with severity of mfg/assy effect, using the criteria in the AIAG 4 severity scale, and enters the worst case in the FMEA worksheet.]

Process Step: Induction harden shafts using induction hardening machine

Function: Induction harden shafts using induction-hardening machine ABC, with minimum hardness Brinell Hardness Number (BHN) “X”, according to specification #123.

Failure Mode: Shaft hardness less than BHN “X”

Effect (In plant): 100% scrap
Effect (End user): Shaft fractures with complete loss of performance
Effect (Assembly): Not noticeable during assembly

Severity (Customer Effect): 8 (loss of primary function)
Severity (Mfg/Assy Effect): 8 (major disruption)
Severity: 8 (entered in FMEA worksheet)
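
The worst-case rule used in this example can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `worksheet_severity` and the `scale_max` parameter are hypothetical, and a 1-to-10 severity scale is assumed. The rankings themselves still come from the team's judgment against the severity scale criteria; the code only picks the highest one.

```python
def worksheet_severity(*effect_severities: int, scale_max: int = 10) -> int:
    """Return the severity to enter in the worksheet: the worst case
    (highest ranking) among the severities assessed for each effect.

    Assumes an integer scale from 1 to scale_max (hypothetical default: 10).
    """
    if not effect_severities:
        raise ValueError("at least one effect severity is required")
    for s in effect_severities:
        if not 1 <= s <= scale_max:
            raise ValueError(f"severity {s} is outside the 1..{scale_max} scale")
    return max(effect_severities)


# Values from the fictitious Process FMEA example above.
customer_effect = 8   # loss of primary function
mfg_assy_effect = 8   # major disruption

print(worksheet_severity(customer_effect, mfg_assy_effect))  # 8
```

If the two assessments differed, say 6 for the manufacturing effect and 9 for the customer effect, the same call would return 9.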

Application Tips

Tip 1: In the case of items that are redundant, and there is no detection or no warning that a redundant item has failed, the severity should be assessed as if all of the redundant items have failed.

Tip 2: If the effect is well defined, the severity is easily established by reviewing the severity scale criteria. Difficulty identifying the severity ranking is usually due to an improperly defined effect or inadequate severity scale criteria.

Is action always required on high-severity issues? What if severity is high (9 or 10 on a severity scale of 1 to 10), and the occurrence and detection rankings are both low? Is action still required? This problem, as well as a challenging problem involving fail-safe strategies, is the subject of the next problem-solution article.

About Carl S. Carlson

Carl S. Carlson is a consultant and instructor in the areas of FMEA, reliability program planning and other reliability engineering disciplines, supporting over one hundred clients from a wide cross-section of industries. He has 35 years of experience in reliability testing, engineering, and management positions, including senior consultant with ReliaSoft Corporation, and senior manager for the Advanced Reliability Group at General Motors.


Comments

prabha says

November 28, 2018 at 9:56 PM

I happen to give a training on the risk management process. I have included detectability, although it's not recommended in ISO 14971. I have a query:
The objective of implementing the control measures is to increase the detectability and decrease the occurrence. Even the severity is reduced. But I heard from a few QA professionals that “severity cannot be changed”: the value of severity remains the same after implementing the control measures. Is this right? If the severity value was 7 before mitigation, it would remain 7 after mitigation too. My thoughts on this are different. According to me, the severity and occurrence are both reduced after your control measures are put in place. It depends on the type of control measures that we use. For example, handling clinical samples while wearing gloves reduces the severity of the harm.
Kindly confirm. If possible with an example

- Carl Carlson says
  
  November 29, 2018 at 6:08 PM
  
  Hello Prabha,
  
  I get this question about whether or not severity risk can be reduced quite often. The short answer is “yes,” under certain circumstances. When it is possible, it requires a system design change.
  
  There are four strategies to reduce severity risk that I outline in chapter 7 of my book, Effective FMEAs.
  
  Quoting from my book:
  
  “Design for fail-safe
  
  “A fail-safe design is one that, in the event of failure, responds in a way that will cause minimal harm to other devices or danger to personnel. Fail-safe does not mean that failure is improbable; rather that a system’s design mitigates any unsafe consequences of failure. In FMEA language, fail-safe reduces the severity of the effect to a level that is safe.
  
  “Design for fault-tolerance
  
  “A fault-tolerant design is a design that enables a system to continue operation, possibly at a reduced level (also known as graceful degradation), rather than failing completely, when some part of the system fails. In FMEA language, fault-tolerance reduces the severity of the effect to a level that is consistent with performance degradation.
  
  “Design for redundancy
  
  “A redundant design provides for the duplication of critical components of a system with the intention of increasing reliability of the system, usually in the case of a backup or fail-safe. This means having backup components that automatically “kick in” should one component fail. In FMEA language, redundant design can reduce the occurrence of system failure and reduce system severity to a safe level. This strategy can be employed to address single-point failures.
  
  “Provide early warning
  
  “Failures that occur without warning are more dangerous than failures with warning. Catastrophic effects can be avoided by adding a warning device to system design. In FMEA language, adding early warning reduces the severity of the effect, potentially reduces the occurrence of system failure, and increases likelihood of detection of failure mode/cause during in-service usage.”
  
  An example of reducing severity risk using “Fail Safe” is laminated safety glass for windshields, which prevents injury from glass shards. An example of reducing severity risk through Fault Tolerance is “run-flat” tires. A passenger car can have “run-flat” tires, each of which contains a solid rubber core, allowing their use even if a tire is punctured. The punctured “run-flat” tire is effective for a limited time at a reduced speed.
  
  Hope that helps. Feel free to ask any follow-up questions.
  
  Carl
  
Ravi.daram says

January 15, 2019 at 7:55 AM

Sir, can I get the formula for reducing the occurrence ranking for redundant systems? Please also suggest how to reduce the severity ranking in case of redundancy.

Carl Carlson says

January 18, 2019 at 9:52 PM

Hello Ravi,

Excellent questions.

I’ll begin by quoting an excerpt from chapter 7 of my book, Effective FMEAs.

“A redundant design provides for the duplication of critical components of a system with the intention of increasing reliability of the system, usually in the case of a backup or fail-safe. This means having backup components that automatically “kick in” should one component fail. In FMEA language, redundant design can reduce the occurrence of system failure and reduce system severity to a safe level.”

I’ll reply to your questions in two parts.

First, you asked for the formula for reducing the occurrence ranking for redundant systems. This depends on the system configuration. I’ll illustrate with a hypothetical example. If component A has a failure rate of 1 in 500, then according to the AIAG v4 scale, this could be assessed an occurrence ranking of 6. If the system configuration is changed to add a second component A in parallel (redundant), then the likelihood that both components fail is (1/500) x (1/500) or 1 in 250,000. Using the same AIAG v4 scale, the team could assess an occurrence ranking of 1 or 2.
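
The arithmetic in this hypothetical example can be sketched in Python as follows. The sketch assumes the parallel components are identical and fail independently of each other (no common-cause failures), which is what multiplying the individual probabilities implies. The function name `joint_failure_probability` is made up for illustration. Mapping the resulting probability to an occurrence ranking is still done by the team against the occurrence scale in use.

```python
from fractions import Fraction


def joint_failure_probability(p_single: Fraction, n_parallel: int) -> Fraction:
    """Probability that all n identical parallel components fail,
    assuming their failures are statistically independent."""
    if n_parallel < 1:
        raise ValueError("n_parallel must be at least 1")
    if not 0 <= p_single <= 1:
        raise ValueError("p_single must be a probability between 0 and 1")
    return p_single ** n_parallel


single = Fraction(1, 500)                      # component A alone: 1 in 500
both = joint_failure_probability(single, 2)    # two of component A in parallel

print(f"1 in {int(1 / both):,}")  # 1 in 250,000
```

If the redundant components share a common cause of failure (the same power supply, for instance), the independence assumption no longer holds and the joint probability will be higher than this product.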

Second, you asked how to reduce the severity ranking in the case of redundancy. Using the hypothetical example of a safety-critical component A, we can assume that failure would lead to an effect that is safety related. Using the AIAG v4 scale, this would be assessed as a severity 9 or 10. If the system configuration is changed to add a second component A in parallel (redundant), then the severity of effect when component A fails, at the system level, is reduced to a 7 or 8. There may be a degradation of primary function, but the system is safe.

Please feel free to ask any follow-up questions.

Carl

Prabhu says

June 12, 2019 at 4:46 PM

Sir
Can I get an example of S, O & D for RPN calculation? If calculation is almost wrong, what will be the effect in part.

- Carl Carlson says
  
  July 22, 2019 at 9:50 AM
  
  Prabhu,
  
  Here is an “Inside FMEA” article called “Understanding how to prioritize risk for corrective actions in an FMEA.” It includes an example RPN calculation, as well as alternatives to RPN.
  
  https://lucas-accendo-site-speed.sprod01.rmkr.net/understanding-prioritize-risk-corrective-actions-fmea/#more-199939
  
  You also asked, “If calculation is almost wrong, what will be the effect in part.” I am not certain that I understand the question. Are you asking what is the impact on the item being analyzed with FMEA if the RPN calculation is incorrect? I would appreciate any clarification that you can provide.
  
  Thanks.
  
  Carl
  
Kris says

September 26, 2019 at 7:50 AM

Hi Carl I like your way of thinking very much, maybe you can also help me with concluding the following case:
Imagine you have safety-related components (already decomposed ASIL A (B)). As a matter of fact, decomposition is only valid when the decomposed elements are sufficiently redundant, so failure of one component shall be mitigated. (On the other hand, the backup solution may be hard to argue as a “comfort function” or “degradation”.)

The question is would it be possible (and basically industry accepted) to lower severity having still officially safety-related component?

I deal with some quality guys that are blind to my arguments thus maybe if you know any publication, book or norm that makes them to accept would help very much.

thanks,
Kris

- Carl Carlson says
  
  September 27, 2019 at 1:16 PM
  
  Hi Kris,
  
  Thanks for your question about safety-related components.
  
  I’ll begin with a few high-level comments and then answer your question.
  
  There are two bodies of knowledge that are relevant to this conversation, and both are important.
  
  One body of knowledge is regulatory and standards. Companies must meet applicable regulatory requirements.
  
  The second body of knowledge is quality and reliability methods, such as FMEA. Companies must develop products that are safe and reliable.
  Good companies integrate these two bodies of knowledge, and the set of corresponding requirements, into their product development process.
  
  Now to your question, “would it be possible (and basically industry accepted) to lower severity having still officially safety-related component.”
  
  My reply is two-fold. First, you always want to inquire about safety procedures within your company. Every company has their own set of safety procedures.
  
  Second, there is a new type of FMEA called FMEA-MSR. MSR stands for Monitoring and System Response. This new FMEA is actually a supplement to a DFMEA. FMEA-MSR will be described in the SAE J1739 update version that is due to be finished at the end of this year, and will be available early next year. It is also described in the new AIAG/VDA Handbook. I prefer the SAE J1739 version soon to be out, as I’m on that team.
  
  In the FMEA-MSR supplemental FMEA, as a supplement to the DFMEA, the team can consider a monitoring and response protocol that can lower the severity in an FMEA. It should meet the rigor of both bodies of knowledge.
  
  Hope that helps.
  
  Carl
  
arvin caberte says

October 18, 2019 at 2:55 AM

Please help. Do you have a scale rating system for Severity, Occurrence and Detection that can be used for transaction/service processes? Examples are lending, insurance and so on. Please share. Thank you.

- Carl Carlson says
  
  October 18, 2019 at 3:14 AM
  
  Hello Arvin,
  
  Thank you for your interesting question. In my book, I talk about a type of FMEA called “Business Process FMEA.” I define this type of FMEA as follows:
  
  “Business Process FMEA focuses on the steps of a business process, and on how to minimize inefficiencies by improving workflow, organizational management, and decision-making. It follows a similar format to a Process FMEA, with the exception that the steps of the business process replace the operations of the manufacturing or assembly process.”
  
  This type of FMEA would apply to business transaction/service processes. However, I do not know of any published scales for severity, occurrence or detection. They would have to be developed for the specific application. I provide guidance on scales in chapter 5 of my book.
  
  “If the risk ranking scales are not mandated and the team has the flexibility to establish their own risk ranking scales, there is a simple rule to follow: use the minimum number of ranking levels for each scale that adequately differentiates the risk criteria. In other words, if the team can manage with five ranking levels and the needed differentiation of risk for a given application is adequately defined, then use ranking scales with five levels. If ten ranking levels are needed to adequately differentiate and define the risk, use scales that have ten ranking levels. It is worthwhile to spend the time needed to define properly the scales with the correct resolution and criteria. Using scales that have too many ranking levels for a given application can result in the FMEA team spending excessive time deciding which level on the scale represents the risk without adding value. Using scales with too few ranking levels can result in the FMEA team missing important risk differentiation.”
  
  Hope this helps.
  
  Carl
  
Tiku Patel says

November 19, 2019 at 4:02 PM

Hello Carl,

I enjoyed reading different mitigation levels to reduce a risk level as explained above. For process FMEAs, to assign a correct severity rating, do we assume the failure mode reaching the patients, government regulations or stop at the product level? For manufacturing Process FMEA, if a CQA is impacted, would you assign a severity rating of 10 or assume that due to release testing, the failure effect will always will be detected prior to release?

Additionally, when we talk about fail-safe and fault tolerant controls, do we associate them with detection or prevention? For severity rating of 10 or 9 as described in this article, a severity rating of 9 is appropriate with a warning (detection) prior to harm and severity rating of 10 is assigned without warning. In either cases, we don’t talk about any prevention, yet we lower the severity rating? I am in a situation where I may have warning (detection or release test available) but cannot be prevented or corrected when product quality is compromised to fail the product specs. In that case, what do you assign as the correct rating?

Would appreciate your thoughts with few examples.

- Carl Carlson says
  
  November 21, 2019 at 12:27 PM
  
  Hello Tiku,
  
  You have a series of excellent questions. I think the best way to reply is line by line, as below. Please feel free to ask any follow-up questions.
  
  Thanks.
  
  Carl
  ___________________________
  
  You asked: “For Process FMEAs, to assign a correct severity rating, do we assume the failure mode reaching the patients, government regulations or stop at the product level?”
  
  My reply: For both Design and Process FMEAs, how far to carry the effect of the failure mode is often set by company policy. In theory, Design FMEA takes the effect to the system or end user. In many medical companies, the DFMEA is guided by company policy and takes the effect only to the level of the device, with other activities/methods considering patient harm. In theory, Process FMEA considers both the impact on the manufacturing process as well as on the product. In many medical companies, the PFMEA takes the effect to the level of the manufacturing facility as well as the device, and other activities/methods are employed to consider consequences that potentially impact the patient.
  
  You asked: “For manufacturing Process FMEA, if a CQA is impacted, would you assign a severity rating of 10 or assume that due to release testing, the failure effect will always will be detected prior to release?”
  
  My reply:
  
  The assessment for severity, when CQA is impacted, depends on company policy and the exact nature of the CQA, as well as the specific severity scale that is being used. One other point to make: according to FMEA standards, “severity is a relative ranking within the scope of the individual FMEA and is determined without regard for occurrence or detection.”
  
  You asked: “Additionally, when we talk about fail-safe and fault tolerant controls, do we associate them with detection or prevention?”
  
  My reply:
  
  Typically, in-service detection controls (as opposed to detection-type controls during product development) are entered in the Prevention Controls column of the FMEA. They are part of the design or manufacturing strategy to prevent serious problems. Detection-type Process Controls, on the other hand, are tests or analyses that detect a failure mode and associated cause during product development or in the manufacturing plant.
  
  You asked: “For severity rating of 10 or 9 as described in this article, a severity rating of 9 is appropriate with a warning (detection) prior to harm and severity rating of 10 is assigned without warning. In either cases, we don’t talk about any prevention, yet we lower the severity rating?”
  
  My reply: The newer severity scales assign 10 to safety related issues and 9 to regulatory related issues. I’ll reply to the question for the previous-generation scales, as covered in my article. First of all, in-service warning can prevent more serious problems. That is why the authors of the previous-generation scales went from 10 to 9, with warning. It does not prevent the failure mode, but rather can mitigate the seriousness of the effect.
  
  You asked: “I am in a situation where I may have warning (detection or release test available) but cannot be prevented or corrected when product quality is compromised to fail the product specs.”
  
  My reply:
  
  I have a couple of comments.
  
  First, we have to be careful when using the word “detection” to differentiate between in-service detection and detection during product development or manufacturing. They have different meanings. In-service detection is when there is a product feature that provides an early warning to the equipment or user, or monitors usage to provide a system response. Detection controls during product development or manufacturing are tests or analyses that detect a problem before launch or shipping.
  
  Second, I would need to see your severity scale to determine the proper severity ranking for the scenario you provide. If we use the SAE J1739 (2009 version) for PFMEA severity, severity 9 says: “May endanger operator (machine or assembly) with warning.” If the problem you posit is safety or regulatory in nature, you would have to assign a severity 9, in spite of your qualification “cannot be prevented or corrected when product quality is compromised to fail the product specs.”
  
Dave says

April 30, 2020 at 9:38 AM

Question about severity:
If I have a line regarding a safety-critical system, say brakes, where there is system redundancy:

Should the severity reflect total failure of all systems, i.e. the worst-case effect of total loss of all brakes? Or should the redundancy be baked into the severity number?

Should the severity be 10 and the probability just be low, because all redundancies failing has a low likelihood?

A simpler way to ask is: should Severity include design controls, or should design controls only affect the probability?

Or do I need individual lines for 1-point failure and 2-point failure, each with their own severity?

thanks!

Pedro says

July 15, 2021 at 11:14 PM

Hello, quick question:
If I have a failure mode about say, contamination, that would likely be a 10 because the effect would likely be a recall if the product reaches the end customers.
However, said contamination will be checked in further steps, including the final CQ check to release the product.
Should I use a lower severity in this case (like 100% scrap?) Or I must assume all further processes will also fail?

- Carl Carlson says
  
  July 17, 2021 at 12:57 PM
  
  Hi Pedro,
  Good question! As you imply, the best practice is usually to consider *both* the Effect at plant level and also the Effect at product level, and enter the worst case in the Severity column. However, if you and the FMEA team are absolutely certain the potential issue will not escape the plant, it is up to your company how to apply FMEA procedure. Unless you are mandated by customer, you can choose to focus the Severity rating on the plant Effect, if you are 100% certain the failure will not escape. I suggest reviewing the FMEA procedure with your management, to be certain they are in the loop.
  Please let me know if this answers your question, or if you have any other questions.
  Thanks.
  Carl
  
Sandeep Sharda says

July 17, 2021 at 9:10 AM

Question about assigning the factor value for fluid severity in operation, for equipment failure probability and the related spare parts inventory acquisition plan:
In this regard I have prepared a table, but it needs to be connected with the VED (vital, essential, desirable) criteria.
Could you shed some light on this, so that it better reflects the proven criteria?
The factor values in the table will be used for the essential stock calculation for the spare parts inventory (as a maintenance requirement).

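| Equipment | Fluid Service | Criticality (VED) | Value (Sc) |
|---|---|---|---|
| Rotating | Corrosive fluid | V | 2.5 – 3.5 |
| Rotating | Hydrocarbon | V | 2.5 – 3.0 |
| Rotating | Steam service | E | 2.0 – 2.5 |
| Rotating | Other utility service | D | 1.5 – 2.0 |
| Stationary | Corrosive | V | 2 – 2.5 |
| Stationary | Hydrocarbon | V | 1.5 – 2.0 |
| Stationary | Steam service | E | 1 – 1.5 |
| Stationary | Other utility service | D | 1 |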

- Carl Carlson says
  
  July 17, 2021 at 1:06 PM
  
  Hi Sandeep,
  Thank you for this question. I will need a bit more information to provide my answer. I’m assuming you are doing an FMEA, since you are using the Severity rating. Can you share a bit of information on the nature of the FMEA you are doing? What is the Function that is being analyzed? And what is an example Failure Mode, and Effect description?
  I’d love to help you, but will need this added information.
  Thanks.
  Carl
  
Diane says

August 17, 2021 at 8:16 PM

Hi. How should I use the Severity criteria for degradation of primary and secondary function: per harness function (e.g., engine, main, IP), or per harness component function (e.g., connectors, wires)?

- Carl Carlson says
  
  August 18, 2021 at 6:43 PM
  
  Hi Diane,
  
  The answer to your question depends on the criteria for the Severity scale you are using. For example, if you are doing a Design FMEA within the automotive industry, you might be using the Severity scale in SAE J1739. The Severity Criteria for rating 7 says, “Degradation of primary vehicle function necessary for normal driving during expected service life,” and for rating 5 says, “Degradation of secondary vehicle function.”
  
  Notice in both cases, the assessment is at the *vehicle* level. This is consistent with the definition of Effect, which is “the consequence of the failure on the system or end user.” In the automotive industry, the system is the vehicle.
  
  A degradation of primary vehicle function might be degradation of steering assist. A degradation of secondary vehicle function might be high door-closing effort. Of course, the exact primary and secondary functions are determined by the DFMEA team, in alignment with company policy.
  
  Therefore, regardless of whether you are doing a DFMEA on an engine harness or a connector, the best practice is to take the effect to the system or end user, and assess the severity accordingly.
  
  Hope that helps.
  
  Carl
  
alfredoaguilar04 says

September 27, 2021 at 12:09 AM

Hi Carl,

I would like to know how protections (arc detectors, vibration sensors, temperature sensors, etc.) should be considered when assessing Severity. Should these devices be taken into account when considering the possible failure modes? If I have an arc detector installed, should I consider that this device is going to act in order to prevent fire? Or should I instead assume that no protection devices are installed, and think of the most critical effect on my system/equipment? For instance, if I consider fire as my worst scenario, I should rank Severity as 10; since there are arc detectors installed, I should rank Detection as 1 or 2. In order to assess Occurrence, should I calculate the probability of an arc detector failure that can lead to a fire?

Thanks in advance,

Best Regards

- Carl Carlson says
  
  September 27, 2021 at 11:56 AM
  
  Hello Alfredo,
  
  Thanks for asking your FMEA question.
  
  There are a couple of options for you to consider.
  
  One option is to consider what is called “Supplemental DFMEA for Monitoring and System Response (DFMEA-MSR).” You can read about this type of supplemental DFMEA in the new SAE J1739 (2021). Here is an excerpt:
  
  “Supplemental FMEA for Monitoring and System Response (FMEA-MSR) is an extension of the DFMEA. FMEA-MSR provides a means of assessing risk reduction due to diagnostic detection with a subsequent response during customer operation. These additional factors contribute to an improved depiction of risk of failure (including risk of harm, risk of noncompliance, and risk of not fulfilling requirements).” The use of Supplemental DFMEA for Monitoring and System Response is possible if certain criteria are met, which are described in SAE J1739 (2021). If you decide to use the supplemental DFMEA-MSR, this will help determine when and if you can accept a lower severity level when using monitoring and response.
  
  Another option is to perform a Design FMEA at the system level, and consider the integrated system functions, including sensors. It will be up to the FMEA team, as guided by company policy, to assess the actual severity rating. The safest way is to identify system functions, failure modes and effects, and rate the worst-case end effect. This can be supplemented with lower-level FMEAs on the protective devices (sensors, etc.), in order to make the system as safe as possible. In all cases, when severity is high, if it cannot be reduced, the occurrence and detection ratings must be reduced to very low levels, and overall risk reduced to an acceptable level.
  
  Hope that helps. Feel free to ask any follow-up questions.
  
  Thanks.
  
  Carl
  