Accendo Reliability

Your Reliability Engineering Professional Development Site


by nomtbf

Replace After MTTF Time To Avoid Failures – Right?

Received a short question last week. The person writing seems to already know the answer, yet asked:

If we replace an item after a duration equal to the MTTF value, we would avoid failures, right?

Well, no, most likely not, was my response. What is your response? How would you answer this question? [Read more…]
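A quick way to check the "no" for yourself: the sketch below assumes the time to failure follows a Weibull distribution (my assumption for illustration; the question did not name a distribution) and computes the fraction of units that have already failed when they reach the MTTF. The function names `weibull_mttf` and `fraction_failed_by_mttf` are hypothetical helpers, not from any library.

```python
import math

def weibull_mttf(beta, eta):
    """Mean time to failure for a Weibull with shape beta and scale eta."""
    return eta * math.gamma(1.0 + 1.0 / beta)

def fraction_failed_by_mttf(beta, eta=1.0):
    """CDF evaluated at the MTTF: fraction of units failed by that age."""
    mttf = weibull_mttf(beta, eta)
    return 1.0 - math.exp(-((mttf / eta) ** beta))

# Decreasing, constant, and increasing hazard rate cases.
for beta in (0.5, 1.0, 3.0):
    print(f"beta={beta}: {fraction_failed_by_mttf(beta):.3f} failed by the MTTF")
```

With beta = 1 (the exponential case) roughly 63% of units have failed before reaching the MTTF, so replacing at the MTTF avoids very few failures.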

Filed Under: Articles, NoMTBF

by nomtbf

The Relationship Between Reliability Goals and Confidence

Reliability Goal and Confidence

We establish reliability goals and measure reliability performance.

They are not the same thing. Goals and measures, while related, are not the same nor serve the same purpose. [Read more…]

Filed Under: Articles, NoMTBF

by nomtbf

Yet Another Confused MTBF Definition

Just when I thought we had experienced every possible MTBF definition confusion, here’s another.

This one is courtesy the thread concerning the impact to reliability when adding redundancy to a system. [Read more…]

Filed Under: Articles, NoMTBF

by nomtbf

Another Way to Spot Someone Confusing MTBF

Yet Another Way to Misunderstand MTBF

In a Q&A forum, the response to a question concerning failure rate and repair times for a redundant system demonstrated yet another person confusing MTBF with something it is not.

The responder to the question mentioned the reference to repair time implied the need for MTBF as a metric. Then went on to describe MTBF as the duration of repair time, which should not change given a redundant system over a non-redundant system. [Read more…]

Filed Under: Articles, NoMTBF

by nomtbf

Bought a House Due to Pokemon Go

Walking, Playing and Bought a House

Seriously, while out walking, listening to a podcast, and playing Pokemon Go, I found an open house to view. A week later our offer was accepted and next week we close.

I would not have been out walking that Sunday afternoon if I had not been playing Pokemon Go.

Glad there are no dangerous cliffs nearby. [Read more…]

Filed Under: Articles, NoMTBF

by nomtbf

3 Ways to Improve your Reliability Program

A Few Simple Ideas to Improve Your Reliability Program

Spending too much on reliability and not getting the results you expect? Just getting started and not sure where to focus your reliability program? Or, just looking for ways to improve your program?

There is not one way to build an effective reliability program. The variations in industries, expectations, technology, and the many constraints, shape each program. Here are three suggestions you can apply to any program at any time. These are not quick fix solutions, nor will you see immediate results, yet each will significantly improve your reliability program and help you achieve the results you and your customers expect. [Read more…]

Filed Under: Articles, NoMTBF

by nomtbf

5 Clues Using MTBF is Not Helping

Have you ever heard the claim that “We use MTBF, as it’s working just fine”?

They may be profitable and successful in the marketplace. Is MTBF serving them well?

Probably not. [Read more…]

Filed Under: Articles, NoMTBF

by nomtbf

What is Reliability?

Guest Post by Martin Shaw

In today’s complex product environment, which is becoming more and more electronic, do the designers and manufacturers really understand what IS Reliability?

It is NOT simply following standards to test in R&D with a focus only on Design Robustness: there is too much risk in prediction confidence, it deals only with the ‘intrinsic’ failure period, and it rarely has sufficient Test Strength to stimulate failures. [Read more…]

Filed Under: Articles, NoMTBF

by nomtbf

Failure Happens – It Is What Happens Next That Matters

One of the benefits of reliability engineering is that failure happens.

Everything made, manufactured, or assembled will fail at some point. It is the desire to have items last long enough that keeps us working. Since failures happen, our work includes dealing with the failure.

Not My Fault

Years ago, while preparing samples for life testing at my bench, I heard an ‘eep’ or a startled sound from a fellow engineer. It was quickly followed by an electrical pop noise and a plume of smoke.

Something on the circuit board she was exploring had failed. With a pop and smoke. She didn’t move.

At this point, my initial amused response turned to concern for her safety. She was fine, just startled as the failure was unexpected. She quickly claimed it wasn’t her fault.

It was her design, she selected and assembled the parts, and she was testing the circuit. Yet, it wasn’t her fault. She did not expect a failure to occur (a blown capacitor – which we later discovered was exposed to far too much voltage), thus it was not her fault.

We hear similar responses from suppliers of components. It must have been something in your design or environment that caused the failure, as the failure described shouldn’t happen. It’s not expected.

Well, guess what, it did happen. Now let’s sort out what happened and not immediately assign blame for whose fault it is.

The ‘not my fault’ response to a failure is not helpful. Failures are sometimes the result of a simple error and quickly remedied. Others are complex and difficult to unravel. The quicker we focus on solving the mystery of the cause of the failure, the quicker we can move on to making improvements.

Warranty

With possibly too many ‘not my fault’ responses, laws now enjoin the manufacturers of products to stand behind their products. If a failure occurs, sometimes within specific conditions, the customer may ask for a remedy from the supplier.

If failures did not happen there would be no such thing as a warranty.

A warranty is actually a legal obligation, yet has turned into a marketing tool. A long warranty implies the product is reliable and by offering a long warranty the manufacturer is stating they are shifting the risk of failure to themselves.

A repair or replacement is generally not adequate recompense for a failure, yet it provides some restitution. In most cases, it only provides peace of mind, if the item doesn’t fail.

The warranty business has become an industry in and of itself. Selling, servicing, and honoring warranties is something that others outside your organization can deal with. The downside is the lack of feedback about failure details so you can effect improvements. A manufacturer shouldn’t hide behind their warranty policy, nor ignore the warranty claim details. It is one way a customer can voice their expectations concerning product reliability. You should listen.

Repair services

My favorite outsourced repair service story involved a misguided payment structure.

If you pay a repairman based on the value of the components replaced, they will likely always replace the most expensive components. If the repair is accomplished by reseating a loose connector, nothing is replaced, and the repairman is not compensated for the diagnostic work and effective repair. If he instead immediately replaces the main circuit board, and in the process reseats most of the connectors, the repair is fast, effective, and he is handsomely rewarded.

See the problem?

When a failure occurs, it may be natural to offer a repair service as the remedy. It should be quick (not a two-week wait, as with my local cable company restoring a fallen line) and efficient for all parties involved. For the owner of the equipment, we want the functionality restored as quickly and cost effectively as possible. For the manufacturer of the equipment, we want cost effectiveness, plus the knowledge concerning the failure.

Does your repair service provide for the needs of both parties as well as the repair technician?

Fail safe

Sometimes when a failure occurs nothing happens. We might not even notice the failure occurs. Other times the product simply goes ‘cold’ or a function is lost. Nothing adverse, no pop or smoke, occurs.

We call this failing safe. It’s more complicated than my simple explanation, yet it is the desired response to a failure. The product itself should not create more damage, cause harm, or place someone in peril. It should fail safely and preferably quietly.

If the ignition key falls from the ignition switch, which may be considered a failure to retain the key within the switch, the driver should not lose control of the vehicle. This is in part a safety feature, yet is also a common expectation that the failure of a system should not create other problems.

Failure containment is related.

How does your product fail? Safely?

Maintenance

For some failures, such as the degradation of lubricants, we perform maintenance. When the brake pads or tire tread wear to a marginally safe level, we replace the brake pads or tires. If we can anticipate the failure pattern, we perform preventive maintenance.

Creating a maintainable piece of equipment is one response to failures. It allows creating complex equipment with failure prone elements. Through maintenance, we are able to restore the system to operation or avoid unexpected downtime. If failures didn’t occur, we wouldn’t need maintenance.

We have some control over the nature of the maintenance activities. For some types of failures, we can only execute corrective maintenance. For others, we can use preventive methods. The idea is to anticipate and avoid the widest range of failures through effective maintenance practices that remain cost effective.

Adding maintenance practices in response to system failures is not the duty of the owner of the equipment. It is a design function to anticipate the system failures that may occur and devise the appropriate maintenance plan to thwart unwanted failures from occurring. The two parties actually have to work together to make this work well.

Expectations

When I buy a product, I know that some proportion of products like the one I just purchased will fail prematurely. I just do not want mine to fail. My expectation is that the one I select at the store is a good one. It won’t let me down, leave me stranded, or injure me. That is my expectation.

When a failure does occur and I value the functionality the product provides I will want to restore the unit via repair or replacement, sometimes via a service contract or warranty or repair center. To a large degree, my expectation is after a failure all will go well.

As the manufacturer of products, when a failure occurs, your expectations may include learning from the failure to make improvements. Or they should.

We know we cannot anticipate nor avoid every failure that may occur. The expectation on both sides is to make robust and dependable products that provide value for all involved. When that approach fails, we fail.

Failure Happens

In response to a failure, it’s how the product, customer, and manufacturer respond that matters. A simple failure can turn into a disaster for all involved. Or the failure can provide insights leading to breakthrough innovations and new opportunities.

It’s how we respond that matters.

How do you respond to failures?

Filed Under: Articles, NoMTBF

by nomtbf

Are the Measures Failure Rate and Probability of Failure Different?

Failure rate and probability are similar. They are slightly different, too.

One of the problems with reliability engineering is so many terms and concepts are not commonly understood.

Reliability, for example, is commonly defined as dependable or trustworthy, as in you can count on him to bring the bagels. Reliability engineers, by contrast, define reliability as the probability of successful operation or function within a specific environment over a defined duration.

The same goes for failure rate and probability of failure. We often have specific data-driven or business-related goals behind the terms. Others do not. If we do not state the time period over which either term applies, that is left to the imagination of the listener, which is rarely good.

Failure Rate Definition

There are at least two failure rates that we may encounter: the instantaneous failure rate and the average failure rate. The trouble starts when you ask for, or are asked about, an item’s failure rate. Which failure rate are you both talking about?

The instantaneous failure rate is also known as the hazard rate, h(t):

$$h(t)=\frac{f(t)}{R(t)}$$

Where f(t) is the probability density function and R(t) is the reliability function, which is one minus the cumulative distribution function. The hazard rate, failure rate, or instantaneous failure rate is the failures per unit time when the time interval is very small at some point in time, t. Thus, if a unit has been operating for a year, this calculation provides the chance of failure in the next instant of time, given the unit has survived the year.

This is not useful for the calculation of the number of failures over that year, only the chance of a failure in the next moment.
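To make the ratio concrete, here is a minimal sketch that evaluates h(t) = f(t) / R(t) directly. It assumes a Weibull time to failure distribution with shape `beta` and scale `eta`; the distribution choice and the parameter values are mine, picked only for illustration.

```python
import math

def weibull_pdf(t, beta, eta):
    """f(t): probability density of failure at time t."""
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

def weibull_reliability(t, beta, eta):
    """R(t) = 1 - CDF: probability of surviving to time t."""
    return math.exp(-((t / eta) ** beta))

def hazard_rate(t, beta, eta):
    """h(t) = f(t) / R(t), the instantaneous failure rate."""
    return weibull_pdf(t, beta, eta) / weibull_reliability(t, beta, eta)

# Hypothetical wear-out item: shape 2.0, characteristic life 5 years.
for years in (1.0, 2.0, 4.0):
    print(f"h({years}) = {hazard_rate(years, 2.0, 5.0):.3f} per year")
```

For a shape parameter above 1 the hazard rate climbs with age, which is exactly the information a single failure rate number hides.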

The probability density function provides the fraction failing per unit time over an interval of time. A histogram of the count of failures per month would roughly describe a PDF, or f(t). Dividing f(t) by R(t), the fraction still surviving, at each point in time traces out the instantaneous failure rate curve.

Sometimes we are interested in the average failure rate, AFR. The AFR over a time interval, t1 to t2, is found by integrating the instantaneous failure rate over the interval and dividing by t2 – t1. When we set t1 to 0, we have

$$AFR(T)=\frac{H(T)}{T}=\frac{-\ln R(T)}{T}$$

Where H(T) is the integral of the hazard rate, h(t), from time zero to time T,
T is the time of interest, which defines a time period from zero to T,
and R(T) is the reliability function, or probability of successful operation from time zero to T.
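The AFR formula is a one-liner once R(T) is known. The sketch below uses a hypothetical reliability of 95% over 10 years; the function name `average_failure_rate` is my own.

```python
import math

def average_failure_rate(reliability_at_T, T):
    """AFR(T) = -ln R(T) / T for the interval from zero to T."""
    return -math.log(reliability_at_T) / T

# Hypothetical: 95% reliability over 10 years.
afr = average_failure_rate(0.95, 10.0)
print(f"AFR over 10 years: {afr:.5f} per year")
```

Note that the AFR says nothing about how the failures are spread across the interval. Two items with the same R(T) share the same AFR even if one fails mostly early and the other mostly late.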

A very common understanding of the rate of failure is the calculation of the count of failures over some time period divided by the number of hours of operation. This results in the fraction expected to fail on average per hour. I’m not sure which definition of failure rate above this fits, and yet find this is how most think of failure rate.

If we have 1,000 resistors that each operate for 1,000 hours, and then a failure occurs, we have 1 / (1,000 x 1,000 ) = 0.000001 failures per hour.
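The same arithmetic as a small sketch. It assumes every unit accumulates the full operating time, which is the simplification built into the resistor example; `observed_failure_rate` is a hypothetical helper name.

```python
def observed_failure_rate(failures, units, hours_per_unit):
    """Failures divided by total accumulated operating hours."""
    total_hours = units * hours_per_unit
    return failures / total_hours

rate = observed_failure_rate(failures=1, units=1_000, hours_per_unit=1_000)
print(f"{rate:.6f} failures per hour")
```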

Let’s save the discussion about the many ways to report failure rates, AFR (two methods, at least), FIT, PPM/K, etc., for another time.

Probability of Failure Definition

I thought the definition of failure rate would be straightforward until I went looking for a definition. It is with trepidation that I start this section on the probability of failure definition.

To my surprise it is actually rather simple: the definition in common use and the mathematical definition are the same. There are two equivalent ways to phrase the definition:

  1. The probability or chance that a unit drawn at random from the population will fail by time t.
  2. The proportion or fraction of all units in the population that fail by time t.

We can talk about individual items or all of them concerning the probability of failure. If we have a 1 in 100 chance of failure over a year, then that means we have about a 1% chance that the unit we’re using will fail before the end of the year. Or it means if we have 100 units placed into operation, we would expect one of them to fail by the end of the year.

The probability of failure for a segment of time is defined by the cumulative distribution function or CDF.
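The two phrasings of the definition can be checked against each other with a short sketch. It assumes a Weibull CDF, with the scale parameter solved so that the chance of failure by one year is 1%, matching the 1 in 100 example; the shape value and helper names are hypothetical.

```python
import math

def probability_of_failure(t, beta, eta):
    """Weibull CDF F(t): chance a randomly drawn unit fails by time t."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def expected_failures(n_units, p_fail):
    """Expected count of failed units out of n_units by the same time."""
    return n_units * p_fail

# Hypothetical parameters chosen so F(1 year) is 1%.
beta = 1.5
eta = 1.0 / (-math.log(0.99)) ** (1.0 / beta)
p = probability_of_failure(1.0, beta, eta)
print(f"F(1 year) = {p:.4f}")
print(f"expected failures in 100 units = {expected_failures(100, p):.2f}")
```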

When to Use Failure Rate or Probability of Failure

This depends on the situation. Are you talking about the chance of failure in the next instant or the chance of failing over a time interval? Use failure rate for the former, and probability of failure for the latter.

In either case, be clear with your audience which definition (and assumptions) you are using. If you know of other failure rate or probability of failure definitions, or if you know of a great way to keep all these definitions clearly sorted, please leave a comment below.

Filed Under: Articles, NoMTBF Tagged With: Failure, Failure Rate

by nomtbf

The Magic Math of Meeting MTBF Requirements

Recently heard from a reader of NoMTBF. She wondered about a supplier’s argument that they meet the reliability or MTBF requirements. She was right to wonder.

Estimating the reliability performance of a new design is difficult.

There are good and better practices to justify claims about future reliability performance. Likewise, there are just plain poor approaches, too. Plus there are approaches that should never be used.

The Vendor Calculation to Support Claim They Meet Reliability Objective

Let’s say we contract with a vendor to create a navigation system for our vehicle. The specification includes functional requirements. Also it includes form factor and a long list of other requirements. It also clearly states the reliability specification. Let’s say the unit should last with 95% probability over 10 years of use within our vehicle. We provide environmental and function requirements in detail.

The vendor first converts the 95% probability of success over 10 years into MTBF, claiming they are ‘more familiar’ with MTBF. They ignore the requirement for probability of success over the first month of operation. Likewise, they ignore the 5-year targeted reliability or, as they would convert it, MTBF requirement.

[Note: if you were tempted to calculate the equivalent MTBF, please don’t. It’s not useful, nor relevant, and a poor practice. Suffice it to say it would be a large and meaningless number]

RED FLAG Converting the requirement into MTBF suggests they may be making simplifying assumptions. This may permit easier use of estimation, modeling, and testing approaches.

The Vendor’s Approach to ‘Prove’ They Meet the MTBF Requirement

The vendor reported they met the reliability requirement using the following logic:

Of the 1,000 (more actually) components we selected 6 at random for accelerated life testing. We estimated the lower 60% confidence of the probability of surviving 10 years given the ALT results. Then converted the ALT results to MTBF for the part.

We then added the MIL-HDBK-217 failure rate estimate to the ALT result for each of the 6 parts.

RED FLAG This one has me wondering about the rationale for adding the failure rates from an ALT and a parts count prediction. It would make the failure rate higher. Maybe it was a means to add a bit of margin to cover the uncertainty? I’m not sure. Do you have any idea why someone would do this? Are they assuming the ALT did not actually measure anything relevant or any specific failure mechanisms, or that they used a benign stress? ALT details were not provided.

The Approach Gets Weird Here

Then we use a 217 parts count prediction along with the modified 6 component failure rates to estimate the system failure rate, and with a simple inversion estimated the MTBF. They then claimed the system design will meet the field reliability performance requirements.
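For readers who have not seen this logic spelled out, the sketch below reproduces the vendor's arithmetic, not a recommended method. It assumes every part has a constant failure rate and that any part failure fails the system (a series arrangement). All the numbers are hypothetical; the vendor's actual values were not provided.

```python
def series_failure_rate(part_rates):
    """Parts count logic: sum constant failure rates for a series system."""
    return sum(part_rates)

def mtbf_from_rate(failure_rate):
    """The 'simple inversion': MTBF = 1 / failure rate."""
    return 1.0 / failure_rate

# Hypothetical failure rates in failures per million hours.
handbook_rates = [0.02, 0.05, 0.01]   # parts count values only
alt_adjusted_rates = [0.03, 0.04]     # ALT estimate plus handbook value
system_rate = series_failure_rate(handbook_rates + alt_adjusted_rates)
print(f"system rate = {system_rate:.2f} per million hours")
print(f"MTBF = {mtbf_from_rate(system_rate):.2f} million hours")
```

The result is only as good as the constant failure rate assumption baked into every line of it.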

RED FLAG MIL-HDBK-217F, in section 3.3, states:

Hence, a reliability prediction should never be assumed to represent the expected field reliability …

If you are going to use a standard, any standard, you should read it. Read to understand when and why it is useful or not useful.

What Should the Vendor Have Done Instead?

There are a lot of ways to create a new design and meet reliability requirements.

  • The build, test, fix approach or reliability growth approach works well in many circumstances.
  • Using similar actually fielded systems failure data. It may provide a reasonable bound for an estimate of a new system. It may also limit the focus on the accelerated testing to only the novel or new or high risk areas of the new design — given much of the design is (or may be) similar to past products.
  • Using a simple reliability block diagram or fault tree analysis model to assemble the estimates, test results, engineering stress/strength analysis (all better estimation tools than parts count, in my opinion) and calculate a system reliability estimate.
  • Using a risk of failure approach with FMEA and HALT to identify the likely failure mechanisms then characterize those mechanisms to determine their time to failure distributions. If there is one or a few dominant failure mechanisms, that work would provide a reasonable estimate of the system reliability.

In all cases, focus on failure mechanisms and how the time to failure distribution changes given changes in stress, environment, and use conditions. Monte Carlo may provide a suitable means to analyze a great mixture of data to determine an estimate. Use reliability, the probability of success over a duration.
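As one possible shape for such a Monte Carlo estimate, the sketch below treats each failure mechanism as an independent Weibull time to failure and the system as failing at the first mechanism to occur. Those modeling choices, the parameter values, and the function names are assumptions for illustration only. It uses Python's standard `random.weibullvariate(alpha, beta)`, where `alpha` is the scale and `beta` the shape.

```python
import math
import random

def simulate_system_reliability(mechanisms, mission_time, trials=20_000, seed=1):
    """Estimate P(no mechanism causes failure before mission_time).

    mechanisms: list of (beta, eta) Weibull parameters, one per failure
    mechanism, treated as independent and in series (first failure wins).
    """
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        first_failure = min(rng.weibullvariate(eta, beta) for beta, eta in mechanisms)
        if first_failure > mission_time:
            survived += 1
    return survived / trials

def analytic_system_reliability(mechanisms, mission_time):
    """Closed-form check: product of the individual Weibull reliabilities."""
    return math.prod(math.exp(-((mission_time / eta) ** beta)) for beta, eta in mechanisms)

# Hypothetical mechanisms (shape, characteristic life in years).
mechanisms = [(2.5, 40.0), (1.2, 300.0)]
estimate = simulate_system_reliability(mechanisms, mission_time=10.0)
exact = analytic_system_reliability(mechanisms, mission_time=10.0)
print(f"simulated R(10) = {estimate:.3f}, analytic R(10) = {exact:.3f}")
```

In this simple case the closed form makes the simulation unnecessary. The simulation earns its keep when mechanisms depend on varying stress or use conditions and no closed form exists.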

In short, do the work to understand the design, its weaknesses, and the time to failure behavior under different use and condition scenarios, and make justifiable assumptions only when necessary.

Summary

We engage vendors to supply custom subsystems given their expertise and ability to deliver the units we need for our vehicle. We expect them to justify that they meet reliability requirements in a rational and defensible manner. While we do not want to dictate the approach to the design or the estimate of reliability performance, we certainly have to judge the acceptability of the claims that they meet the requirements.

What do you report when a customer asks if your product will meet the reliability requirements? Add to the list of possible approaches in the comments section below.

Related

How to Calculate MTBF

Questions to ask a vendor

MTBF: According to a Component Supplier

Filed Under: Articles, NoMTBF

by nomtbf

Is Your Reliability Testing Adding Value?

Why Do Reliability Testing

Reliability testing is expensive. The results are often not conclusive.

Yet we spend billions on environmental, accelerated, growth, step stress and other types of reliability tests. We bake, shake, rattle and roll prototypes and production units alike. We examine the collected data in hopes of glimpsing the future. [Read more…]

Filed Under: Articles, NoMTBF

by nomtbf

Calculating System Availability

How to Properly Calculate System Availability

Recently received a request for my opinion concerning the calculation of system availability using the classic formula

$$A=\frac{MTBF}{MTBF+MTTR}$$

The work is to create a set of goals for various suppliers and contractors to achieve. The calculation values derive from vendor data sheets and available information concerning MTBF and MTTR. The project is in the design phase, thus they do not have working systems available to measure actual availability.

How would you go about improving on this approach? [Read more…]
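As a starting point for thinking about that question, here is the classic formula as a small sketch. The MTBF and MTTR values are hypothetical, not from any vendor data sheet.

```python
def availability(mtbf, mttr):
    """Classic steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)

# Two hypothetical systems, values in hours.
rarely_fails = availability(mtbf=1_000.0, mttr=10.0)
fails_often = availability(mtbf=100.0, mttr=1.0)
print(f"{rarely_fails:.4f} vs {fails_often:.4f}")
```

The two systems report the same availability even though one fails ten times as often as the other, which is one reason a single availability goal may not tell suppliers what you actually need.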

Filed Under: Articles, NoMTBF Tagged With: Availability

by nomtbf

How to Calculate MTBF

Considerations When You Calculate MTBF

It is deceptively easy to calculate MTBF given a count of failures and an estimate of operating hours. Just tally up the total hours the various systems operate and divide by the number of failures. Easy.

This simple calculation is the unbiased estimator for the inverse of the parameter lambda of the exponential distribution; that is, it directly estimates theta (MTBF). We use theta to represent 1 / lambda.
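The tally-and-divide calculation looks like this as a sketch. The fleet hours and failure count are hypothetical, and `estimate_mtbf` is my own helper name.

```python
def estimate_mtbf(operating_hours, failures):
    """Total operating hours across all systems divided by failure count."""
    return sum(operating_hours) / failures

# Hypothetical fleet: hours accumulated by five systems, two failures total.
hours = [1_200.0, 950.0, 1_500.0, 800.0, 1_050.0]
theta = estimate_mtbf(hours, failures=2)
print(f"MTBF estimate: {theta:.0f} hours, lambda: {1 / theta:.6f} per hour")
```

Notice that the calculation never asks when the failures occurred, only how many there were.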

What could go wrong with such a simple calculation?

What is a failure?

Let’s start with what we count or do not count as a failure. This directly changes the resulting MTBF value. If we only count confirmed hardware failures, and do not count intermittent or unreproducible or software failures, are we undercounting what the customer experiences as a failure?

Over what duration do we count the failures? Should we focus only on the first month of operation, the first year, the warranty or service contract period or the entire operating life of the system? How do you calculate MTBF?

Some organizations only count failures they expect to occur. The unexpected ones are ‘special’ causes and require further study before officially counting as failures.

Another organization only counted failures that completely shut down the system. A partial loss of functionality, a degradation of capability, or the failure of a redundant element did not count as a system failure.

In my opinion if the customer calls it a failure, it’s a failure. If a failure, by any definition, costs your organization time and money to address, acknowledge, resolve or repair, it’s a failure.

What is operating time?

This one is tricky. If the system does include the appropriate sensors and tracking mechanisms (hour meter) and a way to gather that operating time of units both failed and still operating, then we have a pretty good way to track total operating hours. Some situations and systems make this easy.

Most do not.

Let’s say we ship 100 systems a month for 10 months. At the end of ten months the first shipments have accumulated 10 months of operating time. IF….

… They are all placed into service immediately

… They are all operated full time for the full 10 months

… They each have every failure reported, including down time

In general, we do have to make a few assumptions to determine the operating time for shipped systems. We tend to be conservative and err on the side that would make the MTBF value a little smaller than if we had the full set of carefully tracked data. Or do we?

  • Some organizations count from the date/time of shipment, ignoring shipping and installation time.
  • Some organizations assume all systems are installed and operated 24/7.
  • Some organizations assume no news is good news and the systems with no information are still operating.

And a few organizations assume systems run indefinitely. Even a system 20 years old, unless they are notified that it is decommissioned, is assumed to be still running full tilt, i.e., no retirement or replacement policy.
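To see how much these assumptions move the result, the sketch below totals the operating hours for the 100 systems a month for 10 months example under two sets of assumptions. The 730 hours per month figure, the one-month installation delay, the 50% duty cycle, and the count of 20 failures are all hypothetical values chosen for illustration.

```python
HOURS_PER_MONTH = 730.0  # rough average; an assumption for illustration

def fleet_operating_hours(units_per_month, months, install_delay_months=0.0, duty_cycle=1.0):
    """Total hours accumulated by units shipped at the start of each month."""
    total = 0.0
    for shipped_month in range(months):
        months_in_field = months - shipped_month - install_delay_months
        total += units_per_month * max(months_in_field, 0.0) * HOURS_PER_MONTH * duty_cycle
    return total

# Everything installed at shipment and run 24/7, versus a delayed, part-time fleet.
optimistic = fleet_operating_hours(100, 10)
realistic = fleet_operating_hours(100, 10, install_delay_months=1.0, duty_cycle=0.5)

failures = 20  # hypothetical count of reported failures
print(f"optimistic hours: {optimistic:,.0f}, MTBF: {optimistic / failures:,.0f}")
print(f"realistic hours:  {realistic:,.0f}, MTBF: {realistic / failures:,.0f}")
```

The same 20 failures produce an MTBF more than twice as large under the install-immediately, run-24/7 assumptions. That is the opposite of conservative.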

How about when you calculate MTBF?

By convention, when there are no failures we assume there will be one failure in the next instant. This avoids dividing by zero, which causes fits for calculators, spreadsheets, and mathematicians.
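A minimal sketch of that convention, with a hypothetical function name and made-up hours:

```python
def mtbf_with_zero_failure_convention(total_hours, failures):
    """Divide hours by failures, counting one assumed failure when none occurred."""
    counted = failures if failures > 0 else 1
    return total_hours / counted

print(mtbf_with_zero_failure_convention(5_000.0, 0))
print(mtbf_with_zero_failure_convention(5_000.0, 2))
```

With zero failures the reported MTBF simply equals the accumulated operating hours, so it grows with every hour of exposure whether or not anything has been learned about how the system fails.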

Another issue is how often are the calculations made? Do we gather data hourly, daily, weekly, monthly, annually? Some use a rolling set of data, for example only units shipped in the last year count for both operating time and failures. This result will ignore or discount the longer term wear out failures as the bulk of the units are young.

Some organizations do the calculations weekly in order to detect trends. If there are trends you probably should not be using MTBF. If it’s changing, if there are early life or wear out failure mechanisms, you should not be using MTBF.

Even though you can calculate MTBF easily, and even after working through the complexities of getting it right, it still does not provide a useful metric. Instead, focus on getting better data, including time to failure information, so you can explore and report the data with other tools and methods. Treat the data appropriately and make better decisions.

Sure, better data will improve the ability to calculate the MTBF value. If you’d like to be like some organizations, that is fine.

How have you seen MTBF calculated poorly? Share your thoughts and stories in the comments below.

Related:

How to calculate MTTF

Perils of using MTBF

Filed Under: Articles, NoMTBF

by nomtbf

Dare to Know podcast interview with Fred Schenkelberg

Fred talks about the NoMTBF blog and movement

Did you know I was interviewed for the Dare to Know podcast? The interview was fun, check it out here.

The Dare to Know podcast Interview

Tim Rodgers interviews Fred Schenkelberg concerning his blog, NoMTBF, and his mission to eradicate the common misuse of MTBF.

Fred Schenkelberg

Fred Schenkelberg is a reliability engineering and management consultant at FMS Reliability. He’s a lecturer at the University of Maryland, and he’s been an active contributor to both the IEEE and the ASQ Reliability Divisions.

Fred re-established Hewlett Packard’s corporate reliability program in the late 1990s, and also worked as a reliability consultant at Microsoft and a manufacturing engineer at Raychem.

In this episode, Fred Schenkelberg discusses:

  • Why MTBF is a poor reliability metric
  • Common objections to eliminating MTBF
  • Alternatives to MTBF

 

Filed Under: Articles, NoMTBF


© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy