Given MTBF? Now What?
Let’s say you join a project as a reliability professional (or an engineer or manager of any type) and you discover that the team has a reliability goal stated as 5,000 hours MTBF.
What are you going to do with that information?
The meaning of the current goal
5,000 hour MTBF might be exactly the right metric and value for the project. It probably is not.
To find out what this value means you need some more information. Ask around and find:
- What is the primary function of the device?
- How long should it provide that function for customers?
- Is there a warranty period or expected lifetime duration?
- How many hours per year will the device be operated?
- How many should survive the warranty period or expected useful lifetime?
- In what environment should it work?
Note these questions help you find each of the four elements of a full reliability goal description. Key to this discovery is the duration of useful life (or warranty) along with how many units are expected to survive each duration.
The durations are often linked to market expectations. The probabilities (survival) are connected to business objectives of profitability along with customer satisfaction.
Let’s say we determine that there is a one year warranty and the business objectives expect less than 2% of units will fail during the warranty period.
A simple calculation is now possible. Assuming the unit is susceptible to failure every hour of the year, or 8,760 hours, then
$latex \displaystyle&s=4 R(8760)=e^{-\left(\frac{8760}{5000}\right)}=0.17$
That means the goal is to have only 17% of units survive the warranty period.
That is not very good.
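The calculation above treats the MTBF as the mean of an exponential distribution (a constant failure rate), which is the assumption a bare MTBF value usually implies. A minimal Python sketch of the same arithmetic follows; the function name `reliability` is chosen here for illustration only.

```python
import math


def reliability(t_hours, mtbf_hours):
    """Probability a unit survives t_hours, assuming an exponential
    life distribution (constant failure rate = 1 / MTBF)."""
    return math.exp(-t_hours / mtbf_hours)


# One year of continuous operation against the 5,000 hour MTBF goal.
r_warranty = reliability(8760, 5000)
print(f"R(8760) = {r_warranty:.2f}")  # prints "R(8760) = 0.17"
```

The same function works for any duration and MTBF pair, which makes it easy to rerun the check as the operating hours or the goal change.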
A bit more information
I would show this simple calculation to others and ask if the reliability goal was correct.
Let’s say we quickly learn that the most likely way the product will fail is due to fan failure. And, we discover the unit is only expected to operate 2,000 hours per year.
Changing the calculation to reflect the reliability at 2,000 hours we find
$latex \displaystyle&s=4 R(2000)=e^{-\left(\frac{2000}{5000}\right)}=0.67$
That is, 67% of units would be expected to survive.
Still not great.
Further discussions
We may find with additional discussion a few reasons for such a poor reliability goal.
It could be
- Thinking MTBF was a failure free period
- Thinking 5,000 hour MTBF was 2.5x longer than the expected 2,000 hours of operation, and thus had plenty of margin
- 5,000 hours was the goal for the last project and we copied it to this one
- This is a new high risk project and we set the goal very low on purpose
- This is an improvement on the last product, which had 50% fail in the warranty period
Or, something else.
Whatever the reason, if the business goal is to have less than 2% fail in the warranty period and our engineering goal expects 33% of devices to fail, something needs adjusting.
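One way to show the size of the gap is to invert the exponential reliability function and ask what MTBF would correspond to 98% survival. The sketch below does that under the same constant failure rate assumption; the function name `required_mtbf` is made up for illustration.

```python
import math


def required_mtbf(t_hours, target_reliability):
    """MTBF that yields target_reliability at t_hours, assuming an
    exponential life distribution: R(t) = exp(-t / MTBF)."""
    return -t_hours / math.log(target_reliability)


# 98% surviving the warranty year at 2,000 operating hours per year.
mtbf_2000 = required_mtbf(2000, 0.98)
# 98% surviving a year of continuous operation (8,760 hours).
mtbf_8760 = required_mtbf(8760, 0.98)

print(f"2,000 h/year: MTBF of about {mtbf_2000:,.0f} hours")
print(f"8,760 h/year: MTBF of about {mtbf_8760:,.0f} hours")
```

Under these assumptions the 2,000 hour case needs an MTBF of roughly 99,000 hours, about twenty times the stated 5,000 hour goal.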
Next Steps
This simple example, which is all too common, illustrates that even a simple calculation to interpret MTBF and compare the results to expectations may cause ‘some discussion’. Hopefully, it helps the organization to adjust the goal and how it is stated to be something meaningful.
Given this situation you may want to take the next steps:
- Restate the reliability goal in terms of reliability (function, environment, duration, probability). In this case, the device will function in an office setting for one year with 98% surviving. Add other duration/probability couplets as needed for clarity.
- Create an apportionment model for the device and major subsystems/components.
- Identify past product or similar product performance – also in terms of reliability (not MTBF).
- Identify high risk of failure areas and focus engineering and supply chain improvements there.
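As a starting point for the apportionment step, the simplest model treats the device as a series system and gives every subsystem the same reliability target. The sketch below assumes equal apportionment and a hypothetical list of four subsystems (only the fan comes from the example above); a real model would weight the allocation by complexity, past performance, or risk.

```python
import math


def equal_apportionment(system_reliability, subsystems):
    """Allocate a series-system reliability goal equally:
    each subsystem gets R_system ** (1 / n)."""
    n = len(subsystems)
    target = system_reliability ** (1.0 / n)
    return {name: target for name in subsystems}


# Hypothetical subsystem list, for illustration only.
subsystems = ["fan", "power supply", "control board", "enclosure"]
goals = equal_apportionment(0.98, subsystems)

for name, r in goals.items():
    print(f"{name}: {r:.4f} over the warranty period")

# In a series model the subsystem goals multiply back to the system goal.
product = math.prod(goals.values())
print(f"system: {product:.4f}")
```

With four subsystems, each one needs about 99.5% reliability over the warranty period for the device to reach 98%.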
After device launch to the market
Once the product is out there, monitor its reliability performance. Again, not in terms of MTBF, as we really are not interested in the mean. Rather, we are interested in the first 2 percentile point – our target for the warranty period.
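A deliberately naive sketch of that monitoring check is shown below. It assumes every shipped unit has already reached the end of its warranty period, and the failure ages and shipment count are made-up numbers. With units still in warranty, the data are censored and a proper life data analysis is needed instead.

```python
def fraction_failed_in_warranty(units_shipped, failure_ages_hours, warranty_hours):
    """Naive estimate of the fraction failing in warranty. Assumes all
    shipped units have completed the warranty period (no censoring)."""
    failures = sum(1 for age in failure_ages_hours if age <= warranty_hours)
    return failures / units_shipped


# Made-up field data: operating hours at failure for returned units.
failure_ages = [150, 900, 1400, 1950, 2600]
observed = fraction_failed_in_warranty(
    units_shipped=500,
    failure_ages_hours=failure_ages,
    warranty_hours=2000,  # one year at the assumed 2,000 operating hours
)
meets_goal = observed < 0.02

print(f"Observed fraction failed in warranty: {observed:.1%}")
print("Meets the 2% goal" if meets_goal else "Misses the 2% goal")
```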
Stating reliability clearly helps. Helping others understand the meaning of MTBF also helps.
Go be useful!
Don (Maintenance Training) says
Good one Fred. It can never be stressed enough, MTBF is only useful if the environment is well defined. IE: “In what environment should it work?”
Fred Schenkelberg says
Hi Don,
Thanks for the note and yes one has to be very clear when using MTBF – and even then, given the way others misunderstand it – I recommend not using MTBF at all.
Cheers,
Fred
Marcelo says
Hi Fred!
Thanks for your efforts to spread your knowledge about NO MTBF.
I have a question, maybe was a “typing error”
In the article is written:
“Which is 67% of units would be expected to fail.”
Is this sentence correct? or should be re-written:
“Which is 67% of units would be expected to survive” ???
I appreciate your feedback.
Regards
Marcelo Reyes from Mexico
Fred Schenkelberg says
Hi Marcelo good to hear from you.
Thanks, you are right. I was using the reliability function and found that 67% survive.
I’m so used to using the CDF to show that 2/3rd fail at the time (duration) equal to the MTBF, that I missed that detail.
And, thanks for the support and taking time to help improve the article.
Cheers,
Fred
Kishore says
Hi;
I am looking for a correct way to explain to my client that there is something wrong with it. Would appreciate your help on how you would approach the problem below.
There is an electromechanical system which consists of motors, relays, switches, limit switches, huge mechanical structural parts, etc. (a monorail beam switch, if you know it) … consider around 100+ pieces of equipment. The client has asked for an MCBF (since it operates in cycles) of 1,000,000 for the whole system. How do you react to this requirement?
Fred Schenkelberg says
Hi Kishore,
Great question and glad to know you’re working to help others understand reliability.
I wrote a post (my reply was getting long) which is here. http://nomtbf.com/2014/07/mtbf-requirement-reaction/