Accendo Reliability

by nomtbf

Predicting Failure vs. Reacting to Failure


One of the Twitter notes I sent out a few weeks ago read, in part, “Celebrate failures”. A comment came back saying that it was a wonderful approach that she had not thought of before. Failure will occur, and when it does, it is our chance to learn.

And we need to learn. As reliability professionals, we continue to learn throughout our entire careers. New materials fail in novel manners. New assemblies fail in an assortment of ways. New designs fail due to unknown sources of variation. We will see failures. So rather than simply focus on the next try and hope to find success, let’s learn from each failure as we move toward success.

Do you work in a fire department?

It’s an expression that I use, as have others, to describe an organization that quickly responds to failures. A customer calls with a problem, and the team jumps into action to solve the issue. A phone call on Friday afternoon brings a chance to work all weekend. The line is down, all hands on deck.

The better fire departments actually do a good job responding and solving problems. They may even work to prevent other similar problems from occurring. They are not very good at determining where the next failure will occur, so they remain diligent and ready to respond.

Heroes are born in a fire department organization. The one that saves the big customer account gets a prime parking spot. The engineer that pulls an all-nighter to get the line running gets noticed for promotion. The message is: get good at solving crisis problems. The problems you solve as part of your day job don’t really count. The spoils go to the solution found at the last minute, under duress, and often after hours.

What have you done of value lately?

In some fire-department-like organizations, unless you personally abated a major crisis, you’re not noticed. Let’s say you do your work well. You craft durable products, work with teams to create reliable solutions, and meet your cost, time to market, performance, and quality targets.

Not one bit of recognition or notice. You did your job. Did it well, even, yet that is expected, isn’t it?

Let’s say the same brilliant folks that stamp out raging rates of failure find the time to actually design a product that doesn’t fail unexpectedly. Let’s say your organization does a full root cause analysis, including where the set of life cycle systems failed to avert the chain of events leading to the field escalation.

Once we understand that failures will occur, we might take steps to anticipate and resolve those failures before a customer has the luxury of a failure experience. During the design and development stages, we start to balance the final design decision based on acceptable risk of failure and minimum risk of unknown failures. We begin to build certainty in knowing what will fail, when it will fail, and how often… and work to create a low enough failure rate.

The idea is to predict, anticipate, forecast, estimate, and celebrate failures. Running a test that has no failure is a lost opportunity to learn something that allows us to improve the design. Finding a design flaw early permits a routine amount of work to fix it (i.e., if all-night and all-weekend pushes to meet shipping deadlines are your normal, you need to see what’s possible).

Predicting Failures

Everything will fail. It’s a matter of when and how. We regularly think about limitations, constraints, and failure modes. Now I’m asking you to consider the failure mechanisms. What chemistry or physics or elemental shift in the design would lead to failure?

It’s not someone else’s job to add reliability to a system. It’s the job of anyone making decisions about part or vendor selection, anyone sizing a bolt, anyone thinking through or performing maintenance. It pretty much is anyone that touches the product in some way. It’s your job.

What could fail? Have you discovered new failure mechanisms today? Let’s reward those that discover the first failure, the most failures, or the biggest cost avoidance failure. Let’s give the prime parking spot to someone that finds and fixes a critical flaw before it’s an emergency.

Physics of failure, HALT, and risk analysis are just a few of the tools available to predict what will fail. Coupled with someone willing to consider and prove what will fail, and when (we call these folks reliability engineers), the team can shift out of firefighting and into fire prevention. The team can then celebrate the lack of failures seen by customers and rarely be surprised by what does fail.

Give it a try: step off alert status and think about what could fail. Sort out how you can find out before starting the line or shipping. Put in the hours now to carefully find what will fail, so you and your team can work to avert those many pending Friday afternoon phone calls from another irate customer.

Filed Under: Articles, NoMTBF


Comments

  1. Nol Geurts says

    December 16, 2015 at 12:25 PM

    Great summary of an often-seen situation. I also get the impression that they even think they are earning money with an unreliable product, simply because the customer needs a replacement!

    • Fred Schenkelberg says

      December 16, 2015 at 12:28 PM

      Thanks for the comment, Nol. And yes, sometimes the business model creates more profit with repairs than with the original sale. Cheers, Fred

  2. Pradeep Kumar says

    December 16, 2015 at 8:16 PM

    DFMEA (Design Failure Mode and Effects Analysis) with action plan testing is a must before releasing any new product to market.

    • Fred Schenkelberg says

      December 18, 2015 at 11:34 AM

      FMEA is a good tool, and it has both reactive elements, given what we already know, and proactive ones, through the elements we discover. Cheers, Fred

  3. Mohan Dudani says

    December 20, 2015 at 7:09 AM

    Fred

    Good insight on failures. My focus is an operating environment, discrete or continuous, where we should predict the remaining useful life of the equipment, and this science is still in its infancy. But I am sure we will catch up in the future. What are your thoughts regarding RUL in an operating environment?

    • Fred Schenkelberg says

      December 20, 2015 at 7:59 AM

      Hi Mohan,

      Thanks for the kind words and comment. I’m not sure what you mean by ‘RUL’; I’m glad to comment once I know the meaning of the abbreviation.

      You are correct that we are just getting started with the science of reliability engineering. Being able to monitor and measure failure mechanisms as they are in progress is relatively new. The entire field of prognostic health management has really accelerated given the improvements in capability and cost of a wide range of sensors and communication networks.

      More to come along, I’m sure.

      Cheers,

      Fred

  4. Mohan Dudani says

    December 20, 2015 at 8:51 AM

    Hi Fred

    In the world of prognosis, RUL (Remaining Useful Life) is defined as the time to imminent failure rather than the time to actual failure. The difference is that imminent failure is the time when the health indicator exceeds a set threshold, while actual failure is when the system failure occurs.

    • Fred Schenkelberg says

      December 21, 2015 at 9:42 AM

      Ok, got it. There are two elements and I’ve seen this working in a few operating environments. First, track a measure that shows the degradation toward a defined failure. Second, model that degradation.

      It is not a new concept, as there are models and information going back many decades. It’s our ability to make the measurements that has been changing. Also, keep in mind that not all failure mechanisms follow a defined and smooth path to failure. Oil breaks down in a smooth fashion, yet a flange experiencing a load above its shear strength is not predictable.

      I generally start with an understanding of the failure mechanisms, then look for what I can measure to track the progress of the mechanisms.

      Not always possible, yet often we can find something that is useful.

      Cheers,

      Fred
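      As a rough illustration of the two steps in the comment above (track a measure of degradation, then model it), here is a minimal Python sketch. It assumes the degradation is roughly linear over time, which only suits smooth mechanisms such as the oil example and not sudden overload failures. The function names, the readings, and the threshold value are all hypothetical and chosen only for illustration.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need readings at two or more distinct times")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


def estimate_rul(times, values, threshold):
    """Extrapolate the fitted trend to the threshold (the 'imminent failure' point).

    Returns the time remaining after the last reading, or None when the
    fitted trend is flat or moving away from the threshold.
    Assumption: the indicator has not already crossed the threshold.
    """
    slope, intercept = fit_line(times, values)
    fitted_now = slope * times[-1] + intercept
    gap = threshold - fitted_now
    if slope == 0 or gap / slope < 0:
        return None
    return gap / slope


# Hypothetical health-indicator readings (e.g. a wear measure) taken weekly.
weeks = [0, 1, 2, 3, 4]
wear = [10, 12, 14, 16, 18]
print(estimate_rul(weeks, wear, threshold=30))  # 6.0 weeks until the threshold
```

      A straight line is the simplest possible degradation model. In practice the choice of model should follow from an understanding of the failure mechanism being tracked.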

