Accendo Reliability

Your Reliability Engineering Professional Development Site


by nomtbf

MTBF: What is it Good For?

Guest post by Andrew Rowland, CRE, ReliaQual Associates, LLC

I.  INTRODUCTION

The mean time between failures (MTBF) is arguably the most widely used metric in the field of reliability engineering. It appears throughout a product’s life cycle, from requirements, to validation, to operational assessment. Unfortunately, MTBF alone doesn’t tell us much.

It’s not that MTBF is a bad metric. It is an incomplete metric and, as such, it doesn’t lend itself to risk-informed decision making. The real problem lies less with the MTBF itself than with the implicit assumption that failure times are exponentially distributed.

In the following discussion, we will look at two examples where the MTBF alone could lead us to bad decision making.

II.  EXAMPLES

To illustrate how relying on the MTBF can be misleading, let’s look at two examples. In these examples we will assume the failure times are Weibull distributed. The Weibull distribution is popular in reliability engineering and the exponential is a special case of the Weibull. From the literature we know the probability density function and survival (or reliability) function of the Weibull can be expressed as follows:

$latex \displaystyle f(t)=\frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}e^{-\left(\frac{t}{\eta}\right)^{\beta}}$

$latex \displaystyle S(t)=e^{-\left(\frac{t}{\eta}\right)^{\beta}}$

We also recall that the mean of a Weibull distributed variable can be calculated as:

$latex \displaystyle MTBF=\eta\,\Gamma\left(1+\frac{1}{\beta}\right)$

In the functions above, η is referred to as the scale parameter and β the shape parameter.
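As a quick illustration of the three expressions above, the following Python sketch evaluates them using only the standard library. The function names are ours and purely illustrative, and the parameter values in the usage lines are arbitrary rather than taken from the examples that follow.

```python
import math

def weibull_pdf(t, eta, beta):
    """Probability density f(t) for scale eta and shape beta (t > 0)."""
    z = t / eta
    return (beta / eta) * z ** (beta - 1) * math.exp(-(z ** beta))

def weibull_survival(t, eta, beta):
    """Survival (reliability) function S(t)."""
    return math.exp(-((t / eta) ** beta))

def weibull_mtbf(eta, beta):
    """Mean of the Weibull distribution: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)

# With beta = 1 the Weibull reduces to the exponential,
# so the mean equals the scale parameter eta.
print(weibull_mtbf(100.0, 1.0))
print(weibull_survival(100.0, 100.0, 1.0))
```

With β = 1 the survival value printed at t = η is e^(-1), which is why an exponentially distributed item has only about a 37% chance of surviving to its MTBF.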

A.  Example 1

Consider three items: Item A, Item B, and Item C. Perhaps the goal is to select one of these items for our design, and the requirement is an MTBF of 90 hours or greater. All three items have an MTBF of 100 hours. So, from a reliability perspective, which Item should we choose?

Under the implicit assumption that failure times are exponentially distributed, we might conclude that any of the three is acceptable, reliability-wise. All three satisfy the 90 hour MTBF requirement. However, let’s look a little deeper into the 100 hour MTBF and see if we still agree that any of the three is acceptable.

Let’s take a look at the reliability over time of each Item. Figure 1 shows the reliability function over 500 hours for each of these Items. Clearly, the reliability of these Items is not the same. Given that each Item has an MTBF of 100 hours, what is the reliability at 100 hours? Table I summarizes the 100 hour reliability for each Item. Once again, we can see a large difference between the three Items.

Another way to compare these three Items is via the hazard, or failure, rate. Figure 2 shows the hazard function for each Item. The “bathtub” curve is a plot of hazard rate versus time. Thus, Figure 2 shows the “bathtub” curve for each Item. Clearly the hazard rate behavior is very different for these Items.

Fig. 1.   Reliability Functions for Item A, Item B, and Item C

TABLE I
RELIABILITY AT 100 HOURS FOR ITEM A, ITEM B, AND ITEM C

Item R(100)
Item A 0.109 (10.9%)
Item B 0.367 (36.7%)
Item C 0.521 (52.1%)
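The effect in Table I can be sketched in a few lines of Python. The article does not state the Weibull parameters of Items A, B, and C, so the shape values below (0.25, 1, and 5) are assumptions chosen for illustration; with a 100 hour MTBF they give reliabilities close to the tabulated ones. The scale parameter is solved from the MTBF expression given earlier.

```python
import math

def eta_from_mtbf(mtbf, beta):
    """Scale parameter that gives the requested mean for a given shape."""
    return mtbf / math.gamma(1.0 + 1.0 / beta)

def reliability(t, eta, beta):
    """Weibull reliability (survival) function."""
    return math.exp(-((t / eta) ** beta))

# Shape values are assumptions for illustration; the article does not state them.
assumed_shapes = {"Item A": 0.25, "Item B": 1.0, "Item C": 5.0}
mtbf = 100.0

r_at_mtbf = {}
for item, beta in assumed_shapes.items():
    eta = eta_from_mtbf(mtbf, beta)
    r_at_mtbf[item] = reliability(mtbf, eta, beta)
    print(f"{item}: beta={beta}, eta={eta:.2f}, R(100)={r_at_mtbf[item]:.3f}")
```

Holding the MTBF fixed while changing only the shape parameter is enough to move the reliability at the MTBF from roughly 11% to roughly 52%.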

Fig. 2.   Hazard Functions for Item A, Item B, and Item C
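For the Weibull distribution the hazard function is f(t)/S(t), which simplifies to (β/η)(t/η)^(β-1). A minimal Python sketch of the behavior plotted in Figure 2 follows, again under the assumption (not stated in the article) that the three Items have shapes 0.25, 1, and 5 with a common 100 hour MTBF. The evaluation times are arbitrary.

```python
import math

def weibull_hazard(t, eta, beta):
    """Hazard rate h(t) = f(t) / S(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)

def eta_from_mtbf(mtbf, beta):
    """Scale parameter that gives the requested mean for a given shape."""
    return mtbf / math.gamma(1.0 + 1.0 / beta)

# Assumed shapes (not given in the article), each scaled to a 100 hour MTBF.
assumed_shapes = {"Item A": 0.25, "Item B": 1.0, "Item C": 5.0}
times = [10.0, 50.0, 100.0, 200.0]

hazards = {}
for item, beta in assumed_shapes.items():
    eta = eta_from_mtbf(100.0, beta)
    hazards[item] = [weibull_hazard(t, eta, beta) for t in times]
    print(item, ["%.5f" % h for h in hazards[item]])
```

The first list decreases with time, the second stays constant at 1/MTBF, and the third increases, which are the three regions of the “bathtub” curve.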

B.  Example 2

Consider another situation where we have three items: Item D, Item E, and Item F. Presume for a moment that we have all of the data used to derive the MTBF statistic for each Item. The first thing we might do is graphically explore the data. Figure 3 shows a set of plots commonly used in graphical analysis of survival data for Item D. Let’s look at the histogram in the upper left corner. We see the distribution is heavy-tailed, indicating failure times are not exponentially distributed.

Compare the histogram in Figure 3 to that in Figure 4 for Item E and Figure 5 for Item F. Clearly the distribution of failure times differs amongst these three items. Yet all three items satisfy the same MTBF requirement. Perhaps we need to look a bit closer at the data!

Now that we’ve graphically analyzed the data and concluded we may be looking at different populations, we decide to fit the data to a distribution and estimate the parameters.

Our goal, then, is to estimate the values of β and η for each Item. We use the fitdist function from the R [1] package fitdistrplus [2], which uses maximum likelihood to estimate the parameters. The results for these three populations are summarized in Table II. We can see from these results that the populations are not the same, although all three Items satisfy our 90 hour MTBF requirement.
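The article performs this step with fitdist in R. As a rough stand-in, and not the fitdistrplus implementation, the Python sketch below shows the same maximum likelihood idea for complete (uncensored) failure times: the shape estimate is the root of the profile likelihood equation, found here by bisection, and the scale estimate then follows in closed form. The function name, the bisection bounds, and the synthetic sample are all assumptions for illustration; the sample is not one of the article's data sets.

```python
import math
import random

def fit_weibull_mle(times):
    """Maximum likelihood estimates (eta, beta) for complete, uncensored data."""
    n = len(times)
    logs = [math.log(t) for t in times]
    mean_log = sum(logs) / n
    t_max = max(times)

    def score(beta):
        # Profile likelihood equation for beta; its root is the MLE.
        # Times are scaled by t_max so the powers cannot overflow.
        w = [(t / t_max) ** beta for t in times]
        weighted = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return weighted - 1.0 / beta - mean_log

    lo, hi = 0.01, 100.0  # assumed search bounds for the shape parameter
    for _ in range(200):  # bisection; score() increases with beta
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    eta = t_max * (sum((t / t_max) ** beta for t in times) / n) ** (1.0 / beta)
    return eta, beta

random.seed(1)
# Synthetic data only: scale 100, shape 4.5 (random.weibullvariate(scale, shape)).
sample = [random.weibullvariate(100.0, 4.5) for _ in range(2000)]
eta_hat, beta_hat = fit_weibull_mle(sample)
print(f"eta = {eta_hat:.2f}, beta = {beta_hat:.3f}")
```

With a sample this large the estimates should land near the values used to generate the data. Real field data are often censored, which this sketch does not handle.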

Now that we’re confident we’re dealing with three different populations, all of which satisfy the MTBF requirement, what is the implication of selecting one Item over another? Since we fit the data to a Weibull distribution, we know the shape parameter (β) determines the region of the “bathtub” curve. With β < 1 we are in the early-life region, β = 1 puts us in the useful-life region, and β > 1 indicates wearout. In other words, Item D is dominated by early-life failure mechanisms, Item E by useful-life failure mechanisms, and Item F by wearout.

Fig. 3.   Item D: Graphical Analysis of Survival Data

Fig. 4.   Item E: Graphical Analysis of Survival Data

Fig. 5.   Item F: Graphical Analysis of Survival Data

As we did with the first example, let’s look at the reliability function for these three Items. Figure 6 shows the reliability functions. Similar to the first example, we see the reliability functions are not the same, as we would expect from our assessment of Figure 3, Figure 4, and Figure 5.

Fig. 6. Reliability Functions for Item D, Item E, and Item F

Let’s assume we are interested in the reliability at 50 hours. The reliability at 50 hours for the three Items can be found in Table III. We see a dramatic difference in the reliabilities and, interestingly, the Item with the highest 50 hour reliability is the Item with the lowest MTBF.

We can also look at plots of the hazard function for these three Items. These hazard functions are plotted in Figure 7 over 500 hours. We see different hazard rate behaviors, as we expected from our assessment of the β values we estimated earlier.

TABLE II

ESTIMATED PARAMETERS FOR ITEM D, ITEM E, AND ITEM F

Item Eta Beta MTBF
Item D 101.42 0.478 220.7
Item E 107.73 1.000 107.7
Item F 100.84 4.524 92.0

TABLE III

RELIABILITY AT 50 HOURS FOR ITEM D, ITEM E, AND ITEM F

Item R(50)
Item D 0.490 (49.0%)
Item E 0.645 (64.5%)
Item F 0.959 (95.9%)
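The MTBF and 50 hour reliability figures can be approximately recomputed from the fitted parameters. The Python sketch below reads the first two numeric columns of Table II as the scale (η) and shape (β) estimates and applies the expressions given earlier. Because the inputs are rounded table entries, the results are best read as approximate: Items D and F come out close to the tabulated figures, while Item E evaluates to roughly 0.63 from these inputs rather than the 0.645 listed.

```python
import math

# First two numeric columns of Table II, read as (eta, beta).
params = {
    "Item D": (101.42, 0.478),
    "Item E": (107.73, 1.000),
    "Item F": (100.84, 4.524),
}

def mtbf(eta, beta):
    """Weibull mean: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)

def reliability(t, eta, beta):
    """Weibull reliability (survival) function."""
    return math.exp(-((t / eta) ** beta))

results = {}
for item, (eta, beta) in params.items():
    results[item] = (mtbf(eta, beta), reliability(50.0, eta, beta))
    print(f"{item}: MTBF={results[item][0]:.1f} h, R(50)={results[item][1]:.3f}")
```

The ordering is the point of the example: the Item with the largest computed MTBF has the lowest 50 hour reliability, and the Item with the smallest MTBF has the highest.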

Fig. 7. Hazard Functions for Item D, Item E, and Item F

III.   CONCLUSION

Hopefully we’ve come to understand that stating an MTBF value with no other information doesn’t really tell us much about the reliability of an Item. Neither does it tell us if the Item truly satisfies our reliability needs. We saw in one example three Items with the same MTBF, but most definitely with different reliability behavior.

In the second example, we looked at three Items with different MTBFs. Once again, we saw that the reliability behavior of these Items was different. In this example we saw the Item with the largest MTBF having a 50 hour reliability almost half that of the Item with the lowest MTBF.

Without an understanding of the reliability characteristics that is more complete than simply MTBF, are we making good, risk-informed decisions? If we select Item A or Item D, we can expect to see high rates of failure during validation, reliability growth testing, or, worse yet, early in customer ownership. If we warrant our product, we can expect large warranty costs associated with Item A or Item D. Given the competing requirements we need to satisfy, we may need to select Item A or Item D. If we only know the MTBF, will we put the necessary barriers in place, such as screening, to minimize the risk?

At the other end of the “bathtub” curve, if we select Item C or Item F, our validation or reliability growth testing may not test far enough into wearout to surface failures. Will we develop a preventive maintenance program for these Items to minimize the risk?

MTBF is ingrained in the reliability community as well as throughout most companies. It is unlikely that we will ever see the end of MTBF. Ultimately it comes down to us, as reliability engineers, to understand the limitations of MTBF and educate those around us about its shortcomings. If the reliability community gets in lock-step, we can be the tugboats that change the ship’s heading.

REFERENCES

[1]  R Development Core Team, R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing, 2009.

[2]  Marie Laure Delignette-Muller, Regis Pouillot, Jean-Baptiste Denis, and Christophe Dutang, fitdistrplus: help to fit of a parametric distribution to censored or non-censored data. 2013.

Andrew Rowland is a Reliability Consultant. He previously worked as a Reliability and Safety Engineer in the aerospace, defense, and civil nuclear industries. Mr. Rowland received a BSEE in 1999 and a MS in Statistics in 2006. He is an American Society for Quality Certified Reliability Engineer, a member of the IEEE Reliability Society, and the American Statistical Association. He may be contacted by email at andrew.rowland@reliaqual.com.

For a pdf copy download from slideshare


Comments

  1. William Thorlay says

    May 25, 2013 at 10:13 PM

    I’ve been following Mr. Mark Pwell’s discussions on the Linkedin groups and very much agree with his positions against the misuse of the MTBF. Can you provide the present article in PDF format?

  2. Andrew Rowland says

    May 26, 2013 at 3:54 PM

    Send me an e-mail or contact me on LinkedIn and I will provide the article in PDF format. I also have the three data sets used in Example 2 for anyone that may be interested.

    • Fred Schenkelberg says

      May 26, 2013 at 4:03 PM

      I’ve also added a downloadable version on the post page (at bottom) via Slideshare. – Fred

  3. Brian Wharton says

    May 26, 2013 at 7:42 PM

    A good article. Each industry has different ways of using reliability metrics and for each there is a good way of proposing an alternative. One of my successes was to propose the use of physics of failure to designers, most of which already understood why and how their parts might fail. Once the design team have a way of converting the potential failure mechanics the rest is simple.

    • Fred Schenkelberg says

      May 26, 2013 at 7:50 PM

      Hi Brian,

      PoF is great with engineers as they do like the physics and chemistry of stuff and also can relate to the models used in PoF.

      cheers,

      Fred

  4. Mark Fessler says

    June 3, 2013 at 4:15 PM

    Thanks. It’s a good reminder.

    I think (at least in my semiconductor industry) looking at foreseeable misuse is also an important factor to remind design engineers of… that the “human error rate” sometimes influences failure at a much lower incident rate – MTBF(human) and like your article describes, you need to look at all the information before making good choices. Thanks. Have you ever thought of a system analysis that considers a human-influenced MTBF?

    • Fred Schenkelberg says

      June 3, 2013 at 10:57 PM

      Hi Mark,

      Thanks for the comment and insight. Yes, I’ve often considered the impact on field failure caused by either mis-use, improper use, mistakes, etc. Although, as you may suspect, I do not use MTBF. Rather I work to determine the reliability, R(t).

      Cheers,

      Fred

  5. Zoran says

    October 1, 2013 at 3:07 PM

    MTBF is an assumption. The mother of all disasters in safety systems is “assumption”.

