If not MTBF, then what should we use instead?

MTBF has issues. It is commonly misunderstood and misused. I find it hard to interpret and hard to use in any meaningful discussion of reliability.

The entire premise of the NoMTBF site is to encourage you to not use MTBF.

There are exhaustive writings on setting meaningful goals and metrics in the business literature. A few tenets seem common:

Measurable
Meaningful
Useful for decisions

MTBF meets one of the three, and we can do better.

How about using reliability?

In engineering terms, reliability is

The probability an item will function as expected over some duration without failure in a specific environment.

There are four elements: Function, environment, probability and duration.

MTBF is just the probability (if you are thinking it’s a duration, please read the perils article). We often see MTBF stated without a duration, and with the function and environment only assumed.
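To illustrate why MTBF alone is not a duration, here is a minimal sketch. It assumes a constant failure rate (the exponential distribution), which is an assumption added for illustration and not something MTBF by itself guarantees. The function name and the 50,000-hour value are hypothetical.

```python
import math


def reliability_from_mtbf(mtbf_hours: float, duration_hours: float) -> float:
    """Return R(t) = exp(-t / MTBF).

    Valid only under the assumed constant failure rate (exponential model).
    """
    if mtbf_hours <= 0 or duration_hours < 0:
        raise ValueError("MTBF must be positive and duration non-negative")
    return math.exp(-duration_hours / mtbf_hours)


mtbf = 50_000.0  # hypothetical MTBF in hours

# Probability of surviving for a duration equal to the MTBF itself: about 0.37
print(round(reliability_from_mtbf(mtbf, mtbf), 2))

# Probability of surviving one year (8,760 hours) of continuous operation
print(round(reliability_from_mtbf(mtbf, 8760.0), 2))
```

Under this assumption, only about 37% of units are expected to survive to the MTBF value, and the MTBF says nothing useful until a duration is stated alongside it.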

Reliability as a goal and metric

1. Reliability is measurable.

For a given function (or functions) and environment, we can count failures or analyze time-to-failure data to determine the probability of successful operation over a duration. Probability and duration are a couplet and should always be stated together: for example, 95% survive 1 year, or 90% operate for 10 years.
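Counting failures to get a probability over a duration can be sketched as follows. This is a simplified illustration that assumes every unit was observed for at least the full duration (no censored data); the function name and the failure times are hypothetical.

```python
def reliability_at(failure_times, units_on_test, duration):
    """Fraction of units still operating at `duration`.

    Assumes every unit was observed for at least `duration`
    (no censoring), so a simple count is enough.
    """
    if units_on_test <= 0:
        raise ValueError("units_on_test must be positive")
    failed = sum(1 for t in failure_times if t <= duration)
    return (units_on_test - failed) / units_on_test


# Hypothetical data: 100 units fielded, failures recorded in months
failure_months = [3, 7, 11, 14, 20]

# Reliability over 12 months: 3 of 100 units failed by then
print(reliability_at(failure_months, 100, 12))  # 0.97
```

The result is always reported with its duration: 97% survive 12 months. Real field data usually includes units that have not yet reached the duration, which calls for a method that handles censoring.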

2. Reliability is meaningful.

If we include all four elements, we have a common basis for function (and when the function is not successful, a definition for failure), a common understanding of the environment, and a basic idea of probability and duration.

If we say or measure that an item has a 95% probability of survival over 1 year, it should be clear that our common understanding is:

A. 95 out of 100 units are expected to operate (not fail) over the duration of 1 year.

B. There is a 95 out of 100 chance that an item will operate successfully for 1 year.

Not much of a calculation is needed to understand this. If you’d rather dwell on failures, the same statements hold with the complement of the probability: we expect 5% failures over 1 year (100% - 95% = 5%).

3. Useful for decisions.

The reason we set goals and create measures is to make decisions. Basic decisions for reliability include:

Is the product reliable enough?

Is the rate of failures higher than expected?

Did we estimate warranty expenses accurately?

Will this item function long enough?

The reliability metric outlined here does not mask the changing rate of failure that may occur. Using a duration that is meaningful for the decision at hand, you can compare expectations to measured (predicted) performance.
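A minimal sketch of comparing expectations to measured performance at more than one duration. The goals reuse the example couplets from earlier (95% at 1 year, 90% at 10 years); the measured values and the function name are hypothetical.

```python
def compare_to_goals(goals, measured):
    """Return {duration: True/False} for whether measured reliability meets the goal.

    Both arguments map a duration (in years) to a reliability between 0 and 1.
    """
    return {duration: measured[duration] >= goal for duration, goal in goals.items()}


# Goals stated as probability and duration couplets
goals = {1: 0.95, 10: 0.90}

# Hypothetical measured (or predicted) reliability at the same durations
measured = {1: 0.97, 10: 0.85}

for years, ok in compare_to_goals(goals, measured).items():
    status = "meets goal" if ok else "falls short"
    print(f"{years} year(s): measured {measured[years]:.0%} vs goal {goals[years]:.0%}, {status}")
```

In this made-up case the item meets the 1-year goal yet falls short at 10 years, the kind of changing failure behavior a single averaged number would hide.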

Because the metric is easily understood, the chance of misunderstanding is reduced.

I like it.

You can always add red, yellow, and green colored boxes to convey the suitability of a specific reliability value, yet that may not be necessary.