Accendo Reliability

Your Reliability Engineering Professional Development Site


by nomtbf

How to Calculate MTBF

Considerations When You Calculate MTBF

It is deceptively easy to calculate MTBF given a count of failures and an estimate of operating hours. Just tally up the total hours the various systems operate and divide by the number of failures. Easy.

This simple calculation is the unbiased estimator of theta (MTBF), the inverse of the parameter lambda of the exponential distribution. That is, theta = 1 / lambda.
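As a minimal sketch of the tally-and-divide calculation described above, the following Python snippet uses made-up operating hours and a made-up failure count. The function name and the data are illustrative assumptions, not part of any standard library.

```python
def mtbf_point_estimate(total_operating_hours: float, failure_count: int) -> float:
    """Total operating hours divided by the number of failures."""
    if failure_count <= 0:
        # Zero failures would mean dividing by zero; that case needs its own convention.
        raise ValueError("failure_count must be at least 1")
    return total_operating_hours / failure_count


# Hypothetical data: operating hours logged by each of five systems.
unit_hours = [1200.0, 950.0, 1430.0, 800.0, 1620.0]
failures = 3  # hypothetical number of failures across those systems

theta_hat = mtbf_point_estimate(sum(unit_hours), failures)  # estimate of theta (MTBF)
lambda_hat = 1.0 / theta_hat                                # estimate of lambda

print(f"MTBF = {theta_hat:.1f} hours, failure rate = {lambda_hat:.6f} per hour")
# prints: MTBF = 2000.0 hours, failure rate = 0.000500 per hour
```

The arithmetic is trivial. The inputs, as the rest of this article argues, are where the trouble starts.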

What could go wrong with such a simple calculation?

What is a failure?

Let’s start with what we count or do not count as a failure. This directly changes the resulting MTBF value. If we only count confirmed hardware failures, and do not count intermittent, unreproducible, or software failures, are we undercounting what the customer experiences as a failure?

Over what duration do we count the failures? Should we focus only on the first month of operation, the first year, the warranty or service contract period or the entire operating life of the system? How do you calculate MTBF?

Some organizations only count failures they expect to occur. The unexpected ones are ‘special’ causes and require further study before they officially count as failures.

Another organization only counted failures that completely shut down the system. A partial loss of functionality, a degradation of capability, or the failure of a redundant element did not count as a system failure.

In my opinion, if the customer calls it a failure, it’s a failure. If a failure, by any definition, costs your organization time and money to address, acknowledge, resolve or repair, it’s a failure.
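To illustrate how much the failure definition moves the number, here is a small sketch that computes MTBF from the same operating hours under two counting rules. The event labels, the event list, and the total hours are all hypothetical.

```python
# Hypothetical field log: every event a customer reported, tagged by type.
events = [
    "confirmed_hardware", "intermittent", "software", "confirmed_hardware",
    "no_fault_found", "partial_loss", "confirmed_hardware", "software",
]
total_hours = 40_000.0  # assumed total operating hours for the fleet


def mtbf(total_hours, events, counted_types=None):
    """MTBF counting only events whose type is in counted_types (None counts all)."""
    if counted_types is None:
        n = len(events)
    else:
        n = sum(1 for e in events if e in counted_types)
    if n == 0:
        raise ValueError("no counted failures; the ratio is undefined")
    return total_hours / n


narrow = mtbf(total_hours, events, {"confirmed_hardware"})
broad = mtbf(total_hours, events)

print(f"confirmed hardware only:          {narrow:.0f} hours")
print(f"everything the customer reported: {broad:.0f} hours")
# prints 13333 hours for the narrow definition and 5000 hours for the broad one
```

Same fleet, same hours, and the reported MTBF differs by more than a factor of two purely because of what was allowed to count.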

What is operating time?

This one is tricky. If the system includes the appropriate sensors and tracking mechanisms (an hour meter, for example) and a way to gather the operating time of units both failed and still operating, then we have a pretty good way to track total operating hours. Some situations and systems make this easy.

Most do not.

Let’s say we ship 100 systems a month for 10 months. At the end of ten months the first shipments have accumulated 10 months of operating time. IF….

… They are all placed into service immediately

… They are all operated full time for the full 10 months

… They each have every failure reported, including downtime

In general, we do have to make a few assumptions to determine the operating time for shipped systems. We tend to be conservative and err on the side that would make the MTBF value a little smaller than if we had the full set of carefully tracked data. Or do we?

  • Some organizations count from the date/time of shipment, ignoring shipping and installation time.
  • Some organizations assume all systems are installed and operated 24/7.
  • Some organizations assume no news is good news and the systems with no information are still operating.

And a few organizations assume systems run indefinitely. Unless notified that a system has been decommissioned, they assume it is still running full tilt, even if it is 20 years old, i.e., there is no retirement or replacement policy.
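The shipping example above (100 systems a month for 10 months) can be worked through as a rough sketch. The hours-per-month figure, the one-month installation delay, the 40% duty cycle, and the failure count are all assumptions chosen only to show the size of the effect.

```python
HOURS_PER_MONTH = 730.0  # assumption: 8760 hours per year / 12


def fleet_hours(units_per_month, months, install_delay_months=0.0, duty_cycle=1.0):
    """Total operating hours at the end of `months`, one cohort shipped at the start of each month."""
    total = 0.0
    for shipped_month in range(months):
        age = months - shipped_month                       # months since shipment
        in_service = max(0.0, age - install_delay_months)  # months actually installed
        total += units_per_month * in_service * HOURS_PER_MONTH * duty_cycle
    return total


failures = 20  # hypothetical count of reported failures

# Installed on the ship date and run 24/7.
optimistic = fleet_hours(100, 10)
# One month from shipment to installation, then run 40% of the time.
adjusted = fleet_hours(100, 10, install_delay_months=1.0, duty_cycle=0.4)

print(f"optimistic hours: {optimistic:,.0f}  MTBF: {optimistic / failures:,.0f}")
print(f"adjusted hours:   {adjusted:,.0f}  MTBF: {adjusted / failures:,.0f}")
# optimistic: 4,015,000 hours and an MTBF of 200,750
# adjusted:   1,314,000 hours and an MTBF of 65,700
```

Under these made-up numbers, the "ship date, 24/7" assumptions roughly triple the reported MTBF. Overstating operating time pushes MTBF up, not down, so these common shortcuts are not conservative.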

How about when you calculate MTBF?

By convention, when there are no failures we assume there will be one failure in the next instant. This avoids dividing by zero, which causes fits for calculators, spreadsheets, and mathematicians.
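A minimal sketch of that convention, using a hypothetical function name: when the failure count is zero, divide by one instead.

```python
def mtbf_with_zero_failure_convention(total_hours: float, failures: int) -> float:
    """Divide total hours by the failure count, treating zero failures as one."""
    if failures < 0:
        raise ValueError("failures cannot be negative")
    return total_hours / max(failures, 1)


print(mtbf_with_zero_failure_convention(5000.0, 0))  # 5000.0 (no failures, divide by one)
print(mtbf_with_zero_failure_convention(5000.0, 2))  # 2500.0 (ordinary case)
```

Note that zero failures and exactly one failure give the same value under this convention, which is one more way the single number hides what actually happened.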

Another issue is how often the calculations are made. Do we gather data hourly, daily, weekly, monthly, annually? Some use a rolling set of data, for example only units shipped in the last year count for both operating time and failures. This result will ignore or discount the longer term wear-out failures, as the bulk of the units are young.
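The effect of a rolling window can be sketched with a small, entirely hypothetical fleet in which the older units carry most of the failures. The record layout and the 12-month cutoff are assumptions for illustration.

```python
# Hypothetical fleet records: (months since shipment, operating hours, failures).
fleet = [
    (3, 2_000.0, 0),
    (8, 5_500.0, 1),
    (11, 7_500.0, 0),
    (30, 21_000.0, 3),
    (48, 34_000.0, 5),
]


def mtbf_for(records):
    """Total hours over total failures, dividing by one when there are no failures."""
    hours = sum(h for _, h, _ in records)
    failures = sum(f for _, _, f in records)
    return hours / max(failures, 1)


# Rolling window: only units shipped in the last 12 months count.
rolling = [r for r in fleet if r[0] <= 12]

print(f"rolling 12-month MTBF: {mtbf_for(rolling):,.0f} hours")
print(f"whole-fleet MTBF:      {mtbf_for(fleet):,.0f} hours")
# rolling window: 15,000 hours; whole fleet: 7,778 hours
```

In this made-up data the rolling window nearly doubles the MTBF simply by dropping the units old enough to show wear-out.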

Some organizations do the calculations weekly in order to detect trends. If there are trends, you probably should not be using MTBF. If it’s changing, if there are early life or wear-out failure mechanisms, you should not be using MTBF.

Even though you can calculate MTBF easily, and even after working through the complexities of getting it right, the result still is not a useful metric. Instead, focus on getting better data, including time to failure information, so you can explore and report the data with other tools and methods. Treat the data appropriately and make better decisions.

Sure, better data will also improve your ability to calculate the MTBF value. If you’d like to be like some organizations, that is fine.

How have you seen MTBF calculated poorly? Share your thoughts and stories in the comments below.

Related:

How to calculate MTTF

Perils of using MTBF

Filed Under: Articles, NoMTBF


Comments

  1. Kevin Walker says

    June 6, 2016 at 1:58 PM

    We calculate our MTBF / MTTF using only the confirmed failures, since those most often represent the ones most likely to be design controllable. That said, we and our customer also monitor the MTBUR (mean time between unscheduled removals), which reflects the pain the end user feels, including installation and handling damage, the NFF count, and other causes that may need attention but aren’t strictly the functional failures. Thanks to more and better data collection going on in the commercial aero industry, better fleet utilization data is available, so we are using Weibull analysis to calculate MTTF, since the beta is a better indicator of where to go looking for cause or causes, and the shape of the plotted data can also hold some clues.

    • Fred Schenkelberg says

      June 6, 2016 at 2:57 PM

      Hi Kevin,
      thanks for the note. So, if a failure is not confirmed it is assumed out of the design’s ability to control? Doesn’t that leave a large area for intermittent failures to lurk, which often are directly able to be designed out if desired?

      And, if using Weibull analysis, why bother to calculate MTTF? Why strip your data of information by going from a Weibull CDF, for example, or % surviving so many air miles, to MTTF? It seems a waste of good data.

      Cheers,

      Fred

      • Kevin Walker says

        June 7, 2016 at 2:10 PM

        We’re on the same path, I just didn’t put the full story into first message. I can’t count failures if I don’t know what failed. In many cases, NFF’s get additional scrutiny like a run thru production ESS, a hot soak or a vibe test to try and get the failure to recur. We also recognize that sometimes troubleshooting at next level is a shotgun approach and good parts are removed so NFF is not always an intermittent or lurking condition, it’s an expected outcome when the part gets back to us.
        As far as why I calculate an MTTF from Weibull, it’s because it’s what was expected or requested, not because I like it any more than you. That’s why I also report a time to 1% failures along with it, because that’s really what they wanted to know, they just didn’t know to ask for it.
        Cheers,
        Kevin

        • Fred Schenkelberg says

          June 9, 2016 at 12:11 PM

          Thanks Kevin and you know I like the idea of reporting the way you describe.. MTBF or MTTF along with time to first percentile or similar useful information. Cool! cheers, Fred

  2. vetrivel says

    October 3, 2016 at 4:18 AM

    How do I find the MTBF value for an hour meter (Part no.: 20018)? Otherwise, please give me an equivalent formula for finding the MTBF of an hour meter.

    • Fred Schenkelberg says

      October 3, 2016 at 8:57 AM

      First off, do not use MTBF to describe the reliability of the part. You can ask the vendor, yet better to ask for reliability information instead. You can calculate MTBF by dividing the total time by the number of failures… which is typically not very useful.

      Cheers,

      Fred

  3. Zachary B says

    October 26, 2018 at 5:11 AM

    I am curious about how to calculate MTBF for a day when there are no stops. We currently use a system called plant focus which automatically calculates our MTBF and Reliability coefficient. We track this daily, rolling 30 day, and rolling year. If we have a day with no failures, then the MTBF is calculated as 0 by the software. However, this 0 will bring down our overall 30 day and yearly values. This does not make sense, as we are looking to increase our MTBF and not having a failure should not decrease this measurement.

    • Fred Schenkelberg says

      October 26, 2018 at 10:06 AM

      Hi Zachary,

      One of many reasons to never use MTBF.

      By convention, you should divide by 1 when you have no failures over some time period.

      Actually, if you want to avoid the issue entirely, and make the use of MTBF rather more useless, just use total time from some point in the distant past over which you have a count of failures. It has the same effect as a rolling average, and if you have enjoyed at least one failure then you can always avoid the divide by zero issue.

      Use reliability or operational availability directly and skip using MTBF. For repairable systems you may also want to consider using Mean Cumulative Function instead.

      Cheers,

      Fred

    • Kevin Walker says

      October 26, 2018 at 11:51 AM

      Another way to handle it would be to invert your metric from MTBF to failure rate, and track failures divided by hours. Smaller is better, and your zero is the ideal condition. It can stay zero for any length of time and the calculation and charting still will be correct. It’s a mindset change from bigger is better to wanting small, but it could work.

  4. Haidar Ali says

    May 9, 2021 at 2:04 AM

    Hi. I want to ask: the equation for MTBF is running hours / number of failures. If some equipment did not fail, what is the MTBF value? Is it infinity?

    • Fred Schenkelberg says

      July 15, 2021 at 6:31 AM

      Given the problem with dividing by zero, the convention is to divide by one instead, assuming the first failure will occur in the next instant. Cheers, Fred

© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy