Accendo Reliability

Your Reliability Engineering Professional Development Site


by nomtbf

The Challenges in Reliability Engineering

What Are the Other Challenges in Reliability?

Creating a product or system that lasts as long as expected, or longer, is a challenge.

It’s a common challenge that reliability engineers and the entire engineering team face on a regular basis. It’s also not our only challenge.

We face and solve a myriad of technical, political, and engineering challenges. Some of our challenges are born and carried forward by our own industry. We have tools suitable for a given purpose altered to ‘fit’ another situation (inappropriately and creating misleading results). We have terms that we, and our peers, struggle to understand.

Sometimes we, as reliability engineers, have set up challenges that thwart our best efforts to make progress.

Let’s examine a few of the self-made challenges and discuss ways to overcome these obstacles, permitting us to tackle the real hurdles in our path.

MTBF and Prediction are The Two Big Issues

This site has the express goal to ‘eradicate MTBF’. It is the worst four-letter acronym in our world. You already know this, and many of the readers here have taken steps to see this term relegated to the dust of forgotten history.

Parts count prediction, especially from our favorite military standard, is another practice widely known to be less than useful. Then why do we continue to find requirements to use this method as a basis to estimate actual future field failure rates?

Even 20 years after 217’s retirement/obsolescence it lives. Again, there are teams working on viable and actually useful alternatives. Physics of failure modeling, improved reliability modeling tools that permit (nay encourage) the use of appropriate lifetime distributions, and other work are slowly weaning our industry from the folly of parts count predictions.
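One way to see why an appropriate lifetime distribution matters more than a single MTBF number is a quick sketch like the one below. It assumes a two-parameter Weibull lifetime, and the 50,000 hour MTBF, the one-year mission time, and the three shape values are hypothetical numbers chosen only for illustration. All three cases share the same mean life, yet the chance of surviving the first year differs widely.

```python
import math

def weibull_reliability(t, beta, eta):
    """Probability of surviving to time t for a Weibull(beta, eta) lifetime."""
    return math.exp(-((t / eta) ** beta))

def eta_for_mean(mean_life, beta):
    """Weibull scale (eta) that yields the requested mean life (the MTBF)."""
    # Mean of a Weibull distribution is eta * Gamma(1 + 1/beta).
    return mean_life / math.gamma(1.0 + 1.0 / beta)

MTBF_HOURS = 50_000.0     # hypothetical MTBF, identical for every case
MISSION_HOURS = 8_760.0   # hypothetical mission: one year of continuous use

# beta < 1: early-life failures, beta = 1: constant rate, beta > 1: wear-out
for beta in (0.5, 1.0, 3.0):
    eta = eta_for_mean(MTBF_HOURS, beta)
    r = weibull_reliability(MISSION_HOURS, beta, eta)
    print(f"beta={beta}: eta={eta:.0f} h, R(1 year)={r:.3f}")
```

Same MTBF, very different first-year survival. The shape of the distribution, not the mean alone, carries the information the team needs.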

HALT: “let’s pass HALT”

This one isn’t discussed too often. Yet, have you heard someone wonder if their product could pass HALT?
How about, ‘Of course it failed, you were testing above the specified use level.’

HALT is the second worst four-letter acronym.

We have a ways to go to make this basic concept clear. HALT is not a test to pass. We are going to employ a process of step stress to discover weaknesses in the design. We are going to use elevated stresses to discover problems and margins more quickly.

Cost of Failure

Engineers know intuitively that failures are bad. The design effort includes actions to design a robust and reliable product.

One tool that we often avoid employing is the actual or estimated cost of a failure. We tend to focus on failure rates and failure mechanisms, which is fine to a point. Yet, if we do not also include the consequence (safety, warranty, brand loyalty, customer losses, etc.) we only enjoy half the information we need to enable great decisions.

Our team needs to work on the potential and actual failures that make a difference when solved. Not all failure modes are the same. Let’s solve the ones that save the most lives, anguish, and money.

Get the information you need for your product to determine the cost per failure. This information, along with an expected shipping volume and estimated failure rates, enables the calculation of the cost of failure.

If you calculate the cost of failure per unit shipped, you have a value that is comparable to the bill of material cost of the materials and components in a product. In my experience, the cost of failure per unit shipped is the most expensive or within the top 5 most expensive components in a product.
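The arithmetic above can be sketched in a few lines. Every number here is hypothetical (the cost per failure, the shipping volume, the failure rate, and the bill of material entries are made up for illustration), so substitute your own product’s values.

```python
def cost_of_failure(cost_per_failure, units_shipped, failure_rate):
    """Return (total cost of failure, cost of failure per unit shipped)."""
    expected_failures = units_shipped * failure_rate
    total_cost = expected_failures * cost_per_failure
    cost_per_unit_shipped = total_cost / units_shipped
    return total_cost, cost_per_unit_shipped

# Hypothetical inputs, for illustration only.
COST_PER_FAILURE = 250.00   # warranty, service, and logistics per failure ($)
UNITS_SHIPPED = 100_000     # expected shipping volume
FAILURE_RATE = 0.03         # estimated fraction failing in the period of interest

total, per_unit = cost_of_failure(COST_PER_FAILURE, UNITS_SHIPPED, FAILURE_RATE)
print(f"Total cost of failure: ${total:,.2f}")
print(f"Cost of failure per unit shipped: ${per_unit:.2f}")

# Hypothetical bill of material costs per unit ($), to compare against.
bom = {
    "display": 12.40,
    "battery": 9.10,
    "main board": 8.75,
    "enclosure": 4.20,
    "power supply": 3.60,
}
bom["cost of failure"] = per_unit

# Rank cost of failure alongside the components, most expensive first.
ranked = sorted(bom.items(), key=lambda item: item[1], reverse=True)
for rank, (name, cost) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: ${cost:.2f}")
```

Listing the cost of failure as if it were a line item on the bill of material puts it in terms the team already uses when weighing cost reduction work.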

We employ teams of engineers to develop a single critical component or to cost reduce an expensive component, yet our ignorance of the cost of failure allows wonderful opportunities for savings to remain hidden.

Determine the cost of failure and make that information widely available to your team. Show them how to use the information to weigh the everyday decisions they make during design and development.

Mixed Priorities

I’ve been told product reliability is critical, then asked to use less than half the sample size necessary for an accelerated life test.

Critical, important, and top priority are great terms. They sound great. If they do not come with resources, personnel, budgets, and support, those terms are hollow platitudes suggesting our work on reliability is critical, important, or a top priority.

I’m not suggesting, although I often really do believe, that reliability performance is a top priority. Organizations have many priorities and I get that. The challenge is in the mixed signals. The unclear priorities. The many top priorities.

The remedy is, again, to quantify the cost of failure. Management, mostly, talks in terms of money. So, we need to convert a 1% failure rate into dollars lost to warranty per year. We need to quantify the cost of uncertainty, especially when the uncertainty ranges from none to billions in potential losses. A 10% chance that we have a major safety issue for a $100 million product line suggests an expected loss of $10 million unless we reduce the risk. Few other product risks involve such threats to profit and business viability.
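A rough sketch of that conversion follows. The 1% failure rate, the 10% chance, and the $100 million product line come from the paragraph above. The annual volume, the cost per claim, and the reduced 2% risk are hypothetical values added only to make the arithmetic concrete.

```python
def annual_warranty_cost(units_per_year, failure_rate, cost_per_claim):
    """Dollars lost to warranty per year for a given failure rate."""
    return units_per_year * failure_rate * cost_per_claim

def expected_loss(probability, exposure):
    """Expected loss: chance of the event times the dollars at stake."""
    return probability * exposure

# 1% failure rate from the text; volume and claim cost are hypothetical.
warranty = annual_warranty_cost(units_per_year=200_000,
                                failure_rate=0.01,
                                cost_per_claim=150.00)
print(f"Warranty cost per year: ${warranty:,.0f}")

# 10% chance of a major safety issue on a $100 million product line.
PRODUCT_LINE_VALUE = 100_000_000
loss_now = expected_loss(0.10, PRODUCT_LINE_VALUE)
print(f"Expected loss today: ${loss_now:,.0f}")

# Hypothetical: testing and design changes cut the chance to 2%.
loss_after = expected_loss(0.02, PRODUCT_LINE_VALUE)
print(f"Expected loss after risk reduction: ${loss_after:,.0f}")
print(f"Value of reducing the risk: ${loss_now - loss_after:,.0f}")
```

The last line is the number to bring to a budget discussion: it bounds what the reliability work is worth in the same units management uses for every other priority.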

Part of why reliability isn’t well positioned in the pantheon of priorities is it is difficult to quantify. At least that is my observation. Difficult doesn’t mean impossible.

Reliability is one of the priorities most organizations need to get right. Let’s help our teams align the ability to deliver the expected reliability to achieve the goals, while properly balancing with other priorities.

Summary

There are challenges in the world of reliability engineering. MTBF and predictions are well known, and many are working to help us and our peers move forward.
HALT, Cost of Failure, and Mixed Priorities are 3 of the many challenges you face on a regular basis. What would you add to this list? What can we, as a community of reliability engineers, do to solve them? Add your suggestions and recommendations in the comments section below.


Comments

  1. Rick Kossik says

    April 20, 2017 at 9:12 AM

    I believe your most important point here is that proper design requires not just modeling failure (and repairs), but modeling the consequences of failures. This is a much more difficult task than traditional reliability modeling, as it requires a “total system model” that not only simulates the components that can fail (and perhaps be repaired), but also models (in detail) the consequences of different types of failures. Only then is it possible to focus on the failures that are important.

    A simple example of this is a water resource system. If a pump fails, how does it affect the rest of the system? Is the failure simply an inconvenience or does it lead to catastrophe (e.g., a dam failure)? Perhaps usually it is just the former, but if it fails during a storm event, it could be the latter. Moreover, although storms may be rare, the pump may in fact be more likely to fail during a storm (i.e., failure rates may increase during storm events), and this should be quantitatively represented in the model. So to properly understand the consequences of failure requires that you model the total system (dynamically and probabilistically), representing, for example, storm events, as well as the actual feedback loops that exist in the system.

    A few of our customers have done this, including NASA, Sandia National Laboratories, and Los Alamos National Laboratory, but it is the exception, not the rule. I think the primary reason is that doing so requires a team approach. Most reliability engineers lack the background to model the “total system”, and those with the background typically lack the required reliability engineering skills. Hence, modeling such a system properly requires a team of individuals who together possess the necessary skills. This can be time-consuming and expensive, and hence is not typically done (of course, the ultimate cost of failure may be much more expensive, but this is rarely taken into account).

    • Fred Schenkelberg says

      April 20, 2017 at 10:53 AM

      Thanks Rick for the comment and story. As you suggest these models can become rather complex, yet even considering the consequences will go a long way to help sort out priorities. cheers, Fred

© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy