Accendo Reliability

Your Reliability Engineering Professional Development Site

by Adam Bahret

Why can’t we shake off MTBF?

Mean Time Between Failure (MTBF) is one of the most well-known reliability metrics.

But to anyone who works with reliability, it seems like it was developed by some evil anti-reliability mastermind to undermine the possibility of connecting reliability to anything or anyone.

 

Mean Time Between Failure means what?

  • It’s the time between two failures?
  • It’s when the first failure occurs?
  • It’s how long the product is good for?
  • It seems way too big to be a reasonable goal! “How can an air pump have an MTBF of 4 million hours?  That’s ridiculous; these things are only supposed to last for five years!”

This is the process of understanding that everyone goes through as they are introduced to MTBF, formally or informally.

There are other ways to communicate the parameter that MTBF represents.  Failure rate is simply the inverse of MTBF, so why don’t we use failure rate?  A 2.5 million hr MTBF is equivalent to 1/2,500,000 = 0.0000004 fails/hr.
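The conversion above is a one-liner, and a minimal Python sketch makes it easy to try other values. The function names here are my own, chosen for illustration, and the "fails per million hours" scaling is just one assumed way of making the tiny number easier to read.

```python
def failure_rate_from_mtbf(mtbf_hours: float) -> float:
    """Return the constant failure rate (failures per hour) for a given MTBF."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 / mtbf_hours


def failures_per_million_hours(mtbf_hours: float) -> float:
    """Same failure rate, rescaled to failures per million operating hours."""
    return failure_rate_from_mtbf(mtbf_hours) * 1e6


rate = failure_rate_from_mtbf(2.5e6)  # the 2.5 million hr MTBF from the text
print(f"{rate:.7f} fails/hr")
print(f"{failures_per_million_hours(2.5e6):.1f} fails per million hours")
```

Rescaling does not change the information, it only moves the decimal point, which is part of why failure rate never feels more intuitive than MTBF.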

Well I guess that is our answer.  What the hell does that mean? 0.0000004 fails/hr?  That means nothing to me, I have no idea if that is good or bad.  I at least know what an hour is when we talk MTBF.

We can’t use % reliability as a direct replacement because a % reliability statement needs a period of time that it applies over, for example 90% reliability over 2 years.  MTBF and failure rate do not need a time period to be defined, so they can easily be translated into specific metrics that do involve a product’s behavior over a specific time period.
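To show what that translation can look like, here is a rough sketch that assumes a constant failure rate (the exponential model), so that reliability over a period t is exp(-t / MTBF). The helper names and the 8,760 hours-per-year figure (continuous 24/7 operation) are assumptions for illustration, not something a given spec sheet will necessarily use.

```python
import math

HOURS_PER_YEAR = 8760  # assumption: the product runs continuously


def reliability(mtbf_hours: float, period_hours: float) -> float:
    """Probability of surviving the period, assuming a constant failure rate."""
    return math.exp(-period_hours / mtbf_hours)


def mtbf_for_target(target_reliability: float, period_hours: float) -> float:
    """MTBF needed to meet a reliability target over the period (same assumption)."""
    return -period_hours / math.log(target_reliability)


# The air pump from the list above: 4 million hr MTBF, five-year life.
five_years = 5 * HOURS_PER_YEAR
print(f"Reliability over five years: {reliability(4e6, five_years):.1%}")

# The "90% reliability over 2 years" statement, turned back into an MTBF.
two_years = 2 * HOURS_PER_YEAR
print(f"MTBF needed: {mtbf_for_target(0.90, two_years):,.0f} hr")
```

Under these assumptions the 4 million hour pump comes out at roughly 98.9% reliability over five years, and 90% over 2 years corresponds to an MTBF of about 19 years of continuous running. Neither MTBF says anything about how long a single unit is expected to last.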

Are those our only three options?  MTBF, failure rate, and % reliability/unreliability (if we have a time period to express it over)?

I think that is part of why we seem stuck in MTBF.

So let’s discuss what MTBF actually is, because for now we are kinda stuck with it.  (Shaking fist at our nemesis, the evil MTBF mastermind: “You win this time, Dr MTBF!!!”)

MTBF is most commonly used to describe the reliability of a design during its intended use life.

MTBF is when approximately half* of the population has failed during use life.  Failures that occur due to infant mortality or wear-out are not included in this metric.

*(I said approximately half because it depends on the applied distribution.  If it is an exponential distribution, 63.2% have failed by the time t = MTBF.)
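The 63.2% figure falls straight out of the exponential distribution: the fraction failed by time t is 1 - exp(-t / MTBF), and at t = MTBF that is 1 - exp(-1). A small sketch to check it, where the 1,000 hour MTBF is a made-up value (the result does not depend on it):

```python
import math


def fraction_failed(t: float, mtbf: float) -> float:
    """Exponential CDF: fraction of the population failed by time t."""
    return 1.0 - math.exp(-t / mtbf)


mtbf = 1000.0  # hypothetical value, any positive number gives the same fractions

# Fraction failed when the clock reaches the MTBF itself.
print(f"Failed at t = MTBF: {fraction_failed(mtbf, mtbf):.1%}")

# The point where exactly half have failed comes earlier, at MTBF * ln(2).
median_life = mtbf * math.log(2)
print(f"Half have failed at {median_life / mtbf:.3f} x MTBF")
```

So for the exponential case the "half have failed" point is at about 0.69 of the MTBF, and by the MTBF itself nearly two thirds are gone.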

So MTBF is a really bad point in the product’s history; a lot has gone wrong.  Half* of the products have failed in the customer’s hands.  A few things to consider here.  Failures that are classified as infant mortality are not included.  When a product hits its designated end of use life (hopefully before a wear-out failure), it is removed from the population and replaced with a new one.  So the population we are measuring the failure rate for is continuously in this rotation.

Here’s a statistic that emphasizes how much MTBF is not intuitive.

What is the MTBF of a human?   Over 800 years!  The surprise most of us have with that answer highlights the misconceptions as to what it represents.

We all know that the likely “use-life” of a human is around 65 years.  Basically, on average, “wear-out” based failure modes are going to become more dominant at this point.

If we then take the MTBF number to be representative of the failure rate during use life, it would be attributed to random accidents and illnesses that are fatal.  We will assume that they, accident and illness, are random, which means we can use the exponential distribution to represent use life.

So by definition, when the time equals MTBF, 63.2% of the population has failed from random accidents and illness during use life.  Each person is replaced by a new one who is past infant mortality and has not run to wear-out (retirement).  So this is sounding a lot like measuring whether an employee is going to die.  You don’t care about the status of children who can’t work or people who have retired.  Your question is “How often will an employee call in ‘dead’ so that I have to hire a new one?”  So if it is a company with 20,000 employees, that means that at time = MTBF, 63.2% have died while in your employment.  That is 12,640 employees who have died from random accident or illness.  All of a sudden 800 years is starting to sound about right.  Imagine a town with 20,000 people.  At what point in time would their cemetery have over 12 thousand graves that were only from people between 13 and 65?  All the infant, child, and elder deaths are in a separate cemetery.  800 years.
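The employee picture gets easier to swallow once the 800 years is turned into a per-year number. The sketch below reuses the figures from the text (800 year MTBF, 20,000 employees) and assumes the same exponential model; the variable names are mine.

```python
import math

MTBF_YEARS = 800    # human "use-life" MTBF quoted above
WORKFORCE = 20_000  # company size from the example

# Chance that any one employee dies from a random cause within a year.
annual_death_probability = 1.0 - math.exp(-1.0 / MTBF_YEARS)

# With every position refilled, the head count stays constant, so the
# expected number of deaths per year is simply workforce / MTBF.
expected_deaths_per_year = WORKFORCE / MTBF_YEARS

# Share of the original 20,000 lost by t = MTBF if nobody ever retired.
original_cohort_failed = WORKFORCE * (1.0 - math.exp(-1.0))

print(f"Annual probability per employee: {annual_death_probability:.3%}")
print(f"Expected deaths per year: {expected_deaths_per_year:.0f}")
print(f"Original cohort failed at t = MTBF: {original_cohort_failed:,.0f}")
```

Roughly a 0.125% chance per person per year, or about 25 deaths a year across 20,000 working-age people, is a much less startling way of saying "800 year MTBF". The cohort count prints slightly above 12,640 only because the code does not round 1 - exp(-1) to 63.2% first.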

But all that explanation doesn’t make “A human has an MTBF of 800 years” sound any less ridiculous.

So you win this time, Dr Evil!!!  But someday we will create a better metric… someday.

If you need any therapy on this topic, please head over to www.nomtbf.com.  It’s a great little community created by Fred Schenkelberg.  He’s gone so far as to create “No MTBF” buttons that people can wear so we can find each other at conferences and events and comfort each other in person.

-Adam

Filed Under: Apex Ridge, Articles, on Product Reliability

About Adam Bahret

I am a Reliability engineer with over 20 years of experience in mechanical and electrical systems in many industries. I founded Apex Ridge Reliability as a firm to assist technology companies with the critical reliability steps in their product development programs and organizational culture.

© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy