Is using MTBF better than not using any reliability measure?
This is the core of a blog entry by Ricky Smith, “Mean Time Between Failure Why People Do Not Use the 1 Metric for Equipment Reliability” (which seems to be removed from his blog at the moment), posted a few months ago. The recommendation and example described highlight the benefit of using MTBF over not making any reliability measurements.
What is not cited is the number of organizations that abandoned using reliability metrics after using MTBF. I do agree that any measure is better than no measure, and the benefit is simply because someone is paying attention. The benefit is short term at best when using the wrong measure. See the Hawthorne effect for the background of this beneficial effect. http://en.wikipedia.org/wiki/Hawthorne_effect
Management teams will quickly abandon measures like MTBF when the measure leads to poor decisions, does not develop a meaningful way to understand the product field behavior, or does not correspond with the data. MTBF is often confused as being ‘reliability’ for a product, so once it is abandoned the team may conclude that they tried to focus on reliability and it was of little value. They are thus unlikely to try again.
Those teams that need to focus on measuring reliability, for the first time or again, often collect the time to failure data necessary for reliability measures, including life models and probability of success curves dependent on product age. They then follow the blog’s approach of using MTBF. This strips the data of the information needed to make a meaningful decision. Using MTBF masks how the failure rate changes with product age, which is the information in question. If the product has primarily early life failures, using MTBF will make the data appear better than reality. If the product has an increasing failure rate, using MTBF will overestimate short term failure counts.
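To illustrate the masking effect, here is a minimal sketch in Python. It is not from the blog in question; the 1,000 hour MTBF, the 100 hour age, and the three Weibull shape values are made-up numbers chosen only for illustration. Each population has exactly the same MTBF, yet the expected fraction failed early in life differs widely.

```python
import math

def weibull_scale_for_mean(mean, beta):
    """Return the Weibull scale (eta) that gives the requested mean life."""
    return mean / math.gamma(1.0 + 1.0 / beta)

def fraction_failed(t, beta, eta):
    """Weibull CDF: expected fraction of units failed by age t."""
    return 1.0 - math.exp(-((t / eta) ** beta))

MTBF = 1000.0  # hours, hypothetical value shared by all three populations
AGE = 100.0    # hours, hypothetical age of interest

# Hypothetical shape values: beta < 1 early life failures,
# beta = 1 constant failure rate, beta > 1 wear out.
cases = [("early life", 0.5), ("constant rate", 1.0), ("wear out", 3.0)]

results = {}
for label, beta in cases:
    eta = weibull_scale_for_mean(MTBF, beta)
    results[label] = fraction_failed(AGE, beta, eta)
    print(f"{label:13s} beta={beta:.1f}  MTBF={MTBF:.0f} h  "
          f"failed by {AGE:.0f} h: {results[label]:.2%}")
```

With these assumed values, the early life population loses roughly a third of its units in the first 100 hours, the constant rate population just under a tenth, and the wear out population well under one percent. The single MTBF number cannot tell these three situations apart.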
Gathering the correct information is important. Get the time to failure information and take the next step correctly by using an appropriate and meaningful data analysis method. MTBF is not very meaningful. Using a method that includes the ability to understand the changing nature of failure rates related to product age is appropriate. Weibull analysis or mean cumulative function analysis are good first steps for non-repairable and repairable systems, respectively.
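For repairable systems, a mean cumulative function (MCF) estimate can be sketched in a few lines. The version below is a simple nonparametric estimate: at each event age, it adds the number of events divided by the number of systems still under observation. The function name, the data layout, and the four repair histories are all hypothetical and used only for illustration.

```python
from collections import Counter

def mean_cumulative_function(histories):
    """Nonparametric MCF estimate for repairable systems.

    histories: list of (event_ages, end_of_observation_age), one per system.
    Returns a list of (age, mcf) pairs, one per distinct event age.
    """
    events = Counter()
    for event_ages, end_age in histories:
        for age in event_ages:
            if age > end_age:
                raise ValueError("event recorded after end of observation")
            events[age] += 1

    end_ages = [end_age for _, end_age in histories]
    mcf = 0.0
    curve = []
    for age in sorted(events):
        # Systems still being observed at this age.
        at_risk = sum(1 for end_age in end_ages if end_age >= age)
        mcf += events[age] / at_risk
        curve.append((age, mcf))
    return curve

# Hypothetical fleet: repair ages in hours and the age at which
# observation of each system ended.
fleet = [
    ([200, 900], 1000),
    ([500], 1000),
    ([], 600),
    ([300, 700, 800], 800),
]

curve = mean_cumulative_function(fleet)
for age, mcf in curve:
    print(f"age {age:5d} h  mean cumulative repairs per system: {mcf:.3f}")
```

Plotting the resulting curve against age shows whether repairs are accumulating faster, slower, or at a steady pace as the systems age, which is the information a single MTBF value hides.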
Any engineering team has the ability to learn how to gather data and calculate a clear summary. Doing so will reward the engineering and management team with information needed to make decisions.
Gathering data is for the purpose of making informed decisions. Use meaningful summaries of the data to avoid making poor decisions.
Barry Snider says
The answer to the question “Is MTBF better than doing nothing?” is NO. Using MTBF diverts resources of time, effort, brain power, and focus from something more useful and valuable. Don’t use it; there are better things you could be doing with your time, like... preventing failures.
Fred Schenkelberg says
Hi Barry,
Glad you agree and good to hear from you.
cheers,
Fred
PS: Please share this post with your friends.
Perri Hobbs says
It depends on which way your learning about reliability is moving. If you are just starting out in reliability, MTBF is an introduction to failure rate. But then you need to build on your knowledge: moving to a multi MTBF model (such as the one that was used for the hard drive report on LinkedIn a week or so ago) and then to a weibull failure rate model. Nobody learns weibull straight off. You need to know the basics first. And the exponential function is really just a simplified weibull, or the weibull is a generalised exponential.
But if you stay with exponential, then you “may” run into some predictive problems.
Fred Schenkelberg says
G’Day Perri,
Thanks for the comment. I had no thought about the learning curve with reliability. I wonder if there is a better way to learn the basics than using MTBF? How about just the percentage successful at some point of time?
cheers,
Fred
Perri Hobbs says
Fred,
The percentage successful at a point in time is good too. Obviously the percentage will change over time (both f(t) and F(t)). Then you are using a non-parametric model and just using the data, which is what you will probably have to do anyway as most data doesn’t fit a parametric model so neatly.
The point of the exponential function is that it is a nice neat parametric function that can be used for teaching (as well as modelling truly random occurrences), as long as the teaching keeps heading towards more advanced models. It doesn’t stop at weibull either; there is the mixed, the exponentiated and the modified forms of the weibull distribution.
“give an engineer an exponential model and he will solve engineering problems for a day; give him a weibull model and he will solve engineering problems for a week; but teach him how to analyse a distribution, including non-parametric ones, and he will solve engineering problems for a lifetime”
Perri Hobbs says
So yeah,
I think my point would be that you are better off learning how to analyse the data, rather than plugging it into formulae.
Mark Powell says
Perri,
Several things about your post are disturbing.
1) “MTBF is an introduction to” A CONSTANT “failure rate.” Such does not exist in nature, it violates the second law of thermodynamics.
2) “But then you need to build on your knowledge: moving to a multi MTBF model (such as the one that was used for the hard drive report on LinkedIn a week or so ago) and then to a weibull failure rate model. ” There is no such thing as a “multi MTBF model.” That hard drive article actually showed three failure modes, not a “multi MTBF model.”
3) MTBF is independent of the model, all models have an MTBF, so one does not move on “to a weibull failure rate model.” Plus, there is no such thing as a “weibull failure rate model,” but there is a Weibull model, which is the most general unimodal one-sided model anybody has invented.
4) “the exponential function is really just a simplified weibull, or the weibull is a generalised exponential.” Neither is true. The Weibull with a shape parameter of unity has the same shape as an exponential, but the derivations are radically different and rest on radically different premises.
Now to the good things you said, your closing. You most definitely will have prediction problems using the exponential, that is a lock.
Mark Powell
Perri Hobbs says
Also,
I didn’t read Ricky’s article until just now. Other than an inability to use spellchecker, even as he is using MTBF to rate his 900 electric motors, he is using the MTBF of the population. Is the problem he is identifying a problem with his population, or just a few reaching some sort of wear out? My point is, you can use whatever distribution you want, but if you can discretise the data enough, predictive trends (even if they are qualitative) will start appearing that you can act on. Although using all his great data and then characterising it with MTBF is probably not the way to go.
Fred Schenkelberg says
Hi Perri,
I agree that learning to walk before running is a good idea. Yet, we need more runners that actually look at the data and think.
I generally find ‘MTBF’s’ will tally up the numbers (often time to failure data) and publish the number. No plots of any kind, no analysis, no thinking.
Rather than use MTBF and the associated bad habits related to data analysis, at least plot the data a few ways. And, please don’t use beautiful time to failure information to create MTBF only.
cheers,
Fred
Mark Powell says
Fred,
The biggest problem with using MTBF is that it is a mean, an average. It is very well known that if you use a mean or average in decision making, you can make very bad decisions. Worse, it is also well known that if you use a mean or average in decision making, you have no way of detecting (until after the fact) that you have made a very bad decision.
The question really should be if you use MTBF, “Do you feel lucky? Well, do ya punk?” (with apologies to Clint)
Mark Powell