Accendo Reliability

Your Reliability Engineering Professional Development Site

by nomtbf

Futility of Using MTBF to Design an ALT

Let’s say we want to characterize the reliability performance of a vendor’s device. We’re considering including the device within our system if, and only if, it will survive 5 years reasonably well.

The vendor’s data sheet lists an MTBF value of 200,000 hours. A call to the vendor and a search of their site don’t reveal any additional reliability information. MTBF is all we have.

We don’t trust it. Which is wise.

Now we want to run an ALT to estimate a time to failure distribution for the device. The intent is to use an acceleration model to accelerate the testing and a time to failure model to adjust to our various expected use conditions.

Given the device, a small interface module with a few buttons, electronics, a display and enclosure, and the data sheet with MTBF, how can we design a meaningful ALT?

What to Measure

The data sheet and our system’s functionality relying on this device define a range of possible elements to measure. We could measure display brightness, button functionality, response times, life of the electronics, etc.

Before selecting what to measure in the ALT, we need to stop and ask: what will limit the life of the device in our application? The provided reliability information doesn’t say. It just says the device has a suspiciously round MTBF value of 200k hours.

An FMEA, risk analysis, or discussion with the development engineers may narrow down the elements of the device that are likely to fail first. If time and resources permit, running HALT to find weaknesses (to identify failure mechanisms) may be in order. Again, just having MTBF doesn’t help.

Which Stress to Apply

Knowing which failure mechanism is likely to cause the device to fail is an essential first step in selecting the appropriate stress (temperature, vibration, power cycling, etc.) to accelerate that failure mechanism.

Not every failure mechanism responds to an increase in temperature. Applying the wrong stress will lead to poor results.

The data sheet might list some environmental or operating limits (power, voltage, temperature, etc.). Those may be clues to which stresses are worth exploring and how they lead to failures.

As when determining what to measure, we need to sort out which stress, or stresses, provide a means to accelerate the failure mechanism of interest.

Acceleration Model

Let’s say we estimate a rubber seal around the display is likely to fail and could be accelerated using higher temperatures.

Instead of the normal operating temperature of 25°C, let’s raise it to 50°C. Ok, so? How much of an acceleration does that change in temperature cause? That is why we need an acceleration model.

The temperature increase might speed up the chemical reaction between the material and oxygen, and we can use the Arrhenius model if we know or can estimate the activation energy.

Or, the temperature increase may increase the compression of the seal creating a mechanical deformation and damage over time. Here I’m not sure what model to use, yet the Arrhenius model would likely not be useful.
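For the oxidation case, a quick calculation shows how much the answer depends on the activation energy. The sketch below computes the Arrhenius acceleration factor between 25°C and 50°C. The function name and the three activation energy values are illustrative assumptions, not values for any actual seal material.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant, eV/K


def arrhenius_af(t_use_c, t_stress_c, ea_ev):
    """Arrhenius acceleration factor between a use and a stress temperature (deg C)."""
    t_use_k = t_use_c + 273.15        # the model needs absolute temperature
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k))


# Activation energies below are assumed for illustration only.
for ea_ev in (0.3, 0.7, 1.0):
    af = arrhenius_af(25.0, 50.0, ea_ev)
    print(f"Ea = {ea_ev:.1f} eV -> acceleration factor ~ {af:.1f}")
```

Under these assumptions the same 25°C step gives an acceleration factor anywhere from under 3 to about 20, which is why a guessed activation energy is not good enough.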

Of course, knowing MTBF provides no information on failure mechanisms other than to suggest the failures are repairable to keep the system running.

Time to Failure Model

Given MTBF we may assume the system has a constant failure rate, or not. Remember all life distributions have a mean value. Knowing the MTBF value doesn’t automatically imply a constant failure rate.

Therefore, if we assume an exponential distribution describes the time to failure pattern, we may be wrong, and most likely would be wrong.

Is the failure rate decreasing or increasing over time? We can’t tell from MTBF alone.
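To make the point concrete, here is a minimal sketch comparing three Weibull distributions that all share the same 200,000 hour mean. The shape values (0.5, 1.0, 3.0) and the helper names are assumptions chosen for illustration; nothing on the data sheet tells us which, if any, applies.

```python
import math

HOURS_PER_YEAR = 8760


def weibull_eta_for_mean(mean, beta):
    """Scale parameter (eta) that gives a Weibull distribution the requested mean."""
    return mean / math.gamma(1.0 + 1.0 / beta)


def weibull_reliability(t, beta, eta):
    """Probability of surviving to time t."""
    return math.exp(-((t / eta) ** beta))


mean_life = 200_000            # hours, the data sheet MTBF
mission = 5 * HOURS_PER_YEAR   # 5 years of continuous operation

# beta < 1: decreasing failure rate, beta = 1: exponential, beta > 1: wear out
for beta in (0.5, 1.0, 3.0):
    eta = weibull_eta_for_mean(mean_life, beta)
    r = weibull_reliability(mission, beta, eta)
    print(f"beta = {beta}: 5-year reliability ~ {r:.3f}")
```

With the same mean, the 5-year survival estimate ranges from roughly half the units to better than 99%, depending only on the assumed shape.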

Knowing the failure mechanism and how an appropriate stress changes the failure rate is a great start. The design of the ALT includes sample sizes and how and when to make measurements. Knowing the expected pattern of failures given our samples allows us to monitor for failures at appropriate times.

Knowing the inverse of the average failure rate doesn’t really help us know when to expect failures to occur. This hampers our ability to design an efficient ALT.

Problems with MTBF Based Reliability Testing Formulas

An astute reader would probably wonder why we’re not using either time-truncated or failure-truncated test planning and analysis. We have MTBF, and that is all we need to design such life tests.

Well, the MTBF value is given and defines the testing. It doesn’t allow us to estimate the time to failure distribution. It may reveal if a system has poorer reliability than expected, yet not if it is better. Nor does such testing permit evaluation or understanding of the pattern of failures.
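To see what such a plan looks like, here is a minimal sketch of the zero-failure case of a time-truncated test under the exponential assumption. The 90% confidence level, the choice of zero allowed failures, and the function name are assumptions for illustration.

```python
import math


def zero_failure_unit_hours(mtbf_hours, confidence):
    """Total unit-hours, with no failures, needed to demonstrate the MTBF.

    Assumes an exponential time to failure: the chance of zero failures in
    T total hours is exp(-T / MTBF), so we need exp(-T / MTBF) <= 1 - confidence.
    """
    return -mtbf_hours * math.log(1.0 - confidence)


total_hours = zero_failure_unit_hours(200_000, 0.90)
for n_units in (20, 1_000):
    print(f"{n_units} units: {total_hours / n_units:,.0f} hours each")
```

Only the total unit-hours appear in the result, so the plan cannot distinguish many units run for a short time from a few units run for a long time.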

The MTBF based testing also assumes a constant failure rate. This means running 1,000 units for 20 hours or 20 units for 1,000 hours gives the same result. If the failure mechanism is wear out or chemical degradation, then we are more likely to have failures in the units that run longer, and few or no failures in the group that runs for only a few hours.

This approach is only appropriate if you know, without doubt, the dominant failure mechanism is best described by an exponential distribution and has an equal chance of failure every single hour of operation. If this is not a certainty, then running 20 or 1,000 units till you have sufficient failures to estimate the time to failure distribution is prudent.
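The 1,000 units for 20 hours versus 20 units for 1,000 hours comparison can be checked numerically. This sketch computes the expected number of failures for both plans, once for an exponential distribution and once for a Weibull wear-out distribution with the same 200,000 hour mean. The wear-out shape of 3.0 and the helper names are assumptions for illustration.

```python
import math


def weibull_cdf(t, beta, eta):
    """Probability of failure by time t; expm1 keeps precision for tiny values."""
    return -math.expm1(-((t / eta) ** beta))


def expected_failures(n_units, hours, beta, eta):
    """Expected number of failed units when n_units each run for the given hours."""
    return n_units * weibull_cdf(hours, beta, eta)


MEAN_LIFE = 200_000  # hours, the data sheet MTBF used as the mean of each model

for beta in (1.0, 3.0):  # 1.0 is exponential; 3.0 is an assumed wear-out shape
    eta = MEAN_LIFE / math.gamma(1.0 + 1.0 / beta)  # scale giving that mean
    short = expected_failures(1_000, 20, beta, eta)
    long_ = expected_failures(20, 1_000, beta, eta)
    print(f"beta = {beta}: 1,000 x 20 h -> {short:.3g}, 20 x 1,000 h -> {long_:.3g}")
```

For the exponential case the two plans expect nearly the same number of failures. For the assumed wear-out case the longer test expects on the order of a few thousand times more failures than the short one, although both counts are tiny at use conditions, which is the reason for accelerating the test in the first place.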

Summary

Running an ALT is expensive. Let’s get the design of the ALT right. That starts by ignoring MTBF claims by vendors, and getting to know the failure mechanisms.

Comments

  1. Tim Gaens says

    December 29, 2017 at 11:37 AM

    Next question would be:
    How much MTBF did you prove with your ALT?

    • Fred says

      December 29, 2017 at 11:52 AM

      Not sure I understand the question, Tim. Given that ALTs tend to examine wear-out type failure mechanisms, MTBF would not be a suitable metric to use.

      • Tim Gaens says

        December 29, 2017 at 1:59 PM

        Sorry Fred,

        Was just playing the manager role here.
        I’m still having a hard time getting people away from MTBF.

        I agree on the article. Thanks for sharing.

        For a manager it is easier to understand MTBF; it simplifies stuff, but it is wrong.

        I’d like to see more examples of how it should be done as common practice.
        e.g. in your article there are still a lot of assumptions to be made for the Acceleration Model (and they probably always need to be made, after FMEA, product history, field data from similar products, …)
        e.g. should we ask our suppliers for failure mechanisms? (and the failure rate for each mechanism?)

        • Fred says

          December 29, 2017 at 2:21 PM

          No worries Tim, yes ask for failure mechanisms and models (not failure rates) to estimate failure rates given your particular set of environmental and use stresses. cheers, Fred

          • Tim Gaens says

            December 29, 2017 at 2:25 PM

            Do you know component suppliers that can provide this?

  2. Tim Gaens says

    December 29, 2017 at 2:25 PM

    Rephrase, “willing to” provide this.

    • Fred says

      December 29, 2017 at 2:35 PM

      Hi Tim,

      Over the years I’ve worked with many vendors that can and did supply detailed failure mechanism and associated models. Fans, bearings, memory, IGBTs, etc.

      If you don’t ask, you will probably only get MTBF… so ask.

      Cheers,

      Fred


© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy