Accendo Reliability

Your Reliability Engineering Professional Development Site

All articles listed in reverse chronological order.

by Fred Schenkelberg 1 Comment

Meditation and Design for Reliability

Is it possible for an individual to ‘do’ DFR? Is design for reliability something, like a specific technique, that is DFR?

What is DFR and how would you recognize it if it was occurring? Like meditation, nearly anyone can strike a pose that appears similar to someone in deep meditation, yet can you tell by observation if they really are meditating? Probably not. The same is true for an organization or person that declares they are doing DFR. Maybe they are or maybe not.

[Read more…]

Filed Under: Articles, Musings on Reliability and Maintenance Topics, on Product Reliability Tagged With: design

by nomtbf Leave a Comment

When to Use MTBF as a Metric?

Sean Bonner Old Reliable Coffee

I will not say ‘never’, which is probably what you expect. There is a rare set of circumstances that may benefit from the use of MTBF as a metric. Of course, this does not include being deceitful or misleading with marketing materials. There may actually be an occasion where the MTBF metric works well.

As you know, MTBF is often estimated by tallying up the total hours of operation of a set of devices or systems and dividing by the number of failures. If no failures occur we assume one failure to avoid dividing by zero (messy business dividing by zero and to be avoided). MTBF is essentially the average time to failure.
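The tally described above can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_mtbf` and the sample data are assumptions, not something from the original article.

```python
def estimate_mtbf(operating_hours, failures):
    """Point estimate of MTBF: total operating hours divided by failure count.

    When no failures occurred, one failure is assumed (as described in the
    text) to avoid dividing by zero.
    """
    total_hours = sum(operating_hours)
    return total_hours / max(failures, 1)


# Hypothetical data: ten units, 1,000 hours each.
hours = [1000.0] * 10
print(estimate_mtbf(hours, failures=2))  # 5000.0
print(estimate_mtbf(hours, failures=0))  # 10000.0
```

Note that the calculation uses only the total hours and the failure count. When each failure occurred does not enter into it.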

Expected Value as Metric

The metric we select should be measurable and should measure something in which we have an interest. We would like to detect changes, measure progress, and possibly make business decisions with our metrics. If we are interested in the expected value of the time to failure for our devices, then MTBF might just be useful.

When making a device we often hear executives, engineers and customers talk about how long they expect the product to last. An office device may have an expected life of 5 years, a solar power system 30 years, and so on. If by duration we all agree that we expect 5 years of service on average, then using the average as the metric makes sense.

Before starting the use of MTBF, just make sure everyone understands that a 5 year life implies half to two thirds of the devices will fail by the stated duration of 5 years. Yes, if the time to failure distribution is actually described by the exponential distribution (and a few other distributions) it means that about two thirds (63%) of the units are expected to fail by the MTBF value. Thus if we set the goal to 5 years MTBF we imply half or more of the units will fail by 5 years.
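As a quick check on that claim, here is a minimal sketch assuming the time to failure really is exponential. The function name and the 5 year value are illustrative only.

```python
import math


def fraction_failed_exponential(t, mtbf):
    """Exponential CDF: F(t) = 1 - exp(-t / MTBF)."""
    return 1.0 - math.exp(-t / mtbf)


mtbf_years = 5.0  # hypothetical goal of 5 years MTBF
for t in (1.0, 2.5, 5.0):
    failed = fraction_failed_exponential(t, mtbf_years)
    print(f"{t:>4} years: {failed:.1%} failed")
```

At t equal to the MTBF the expression reduces to 1 - exp(-1), or about 63.2%, regardless of the MTBF value chosen.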

Product Testing Advantages

Having a goal helps the design and development team make decisions and eventually conduct testing to prove the design meets the reliability objectives. Setting the goal at the expected value allows the fewest number of samples for testing. Testing for 99% reliability over 5 years is much tougher. We may require many samples to determine a meaningful estimate of the leading tail (i.e. first 1% or 5% of failures) of the time to failure distribution.

If the time to failure pattern fits an exponential distribution, then testing becomes simplified. We can test one unit for a long time, or many units a short time, and arrive at the same answer. The test planning can maximize our resources to efficiently prove our design meets the objective. When the chance of failure each hour is the same, every device-hour of testing provides an equal amount of information.
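One way to see why only total device-hours matter is to compute the chance of a failure-free test under the constant hazard rate assumption. The sketch below uses a hypothetical true MTBF and two hypothetical test plans; none of these numbers come from the article.

```python
import math


def prob_zero_failures(units, hours_per_unit, mtbf_hours):
    """Chance a test sees no failures, assuming a constant hazard rate.

    Under the exponential assumption only the total device-hours matter.
    """
    total_device_hours = units * hours_per_unit
    return math.exp(-total_device_hours / mtbf_hours)


mtbf_hours = 50_000.0  # hypothetical true MTBF
one_unit_long = prob_zero_failures(1, 10_000, mtbf_hours)
many_units_short = prob_zero_failures(100, 100, mtbf_hours)
print(f"1 unit x 10,000 h: {one_unit_long:.4f}")
print(f"100 units x 100 h: {many_units_short:.4f}")
```

Both plans accumulate 10,000 device-hours, so they give the same probability. That equivalence disappears as soon as the hazard rate changes with age.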

Unlike products that wear out or degrade with time, when the design and device exhibit an exponential distribution we do not need any aging studies. We can just apply use or accelerated stress and measure the hours of operation and count the failures. Also any early failures are obviously quality issues and most likely do not count toward failures that represent actual field failures. Or do they?

Metrics Should Have a Common Understanding

When the industry, organization, vendors, and engineering staff already use MTBF to discuss reliability, then management would be wise to establish a metric using MTBF. Makes sense, right? The formula to calculate MTBF is very simple. Even the name implies the meaning (no pun intended). MTBF is the mean time between (or before) failure. It’s an average, which calculators, spreadsheets, smart phones, and possibly even your watch can calculate.

While the spread of the data is often of importance when making comparisons, estimating a sample set of data’s confidence bounds, or estimating the number of failures over the warranty period, if we assume the data actually fits an exponential distribution, we find the mean equals the standard deviation. Great! One less calculation. We have what we need to move forward.
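The mean-equals-standard-deviation property is easy to check by simulation. This is a rough sketch using simulated exponential data with a hypothetical MTBF of 1,000 hours, not field data.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable
mtbf = 1000.0   # hypothetical MTBF in hours
samples = [random.expovariate(1.0 / mtbf) for _ in range(100_000)]

sample_mean = statistics.fmean(samples)
sample_std = statistics.stdev(samples)
print(f"mean  = {sample_mean:.0f} hours")
print(f"stdev = {sample_std:.0f} hours")
```

If real time to failure data shows a standard deviation far from its mean, that is one hint the exponential assumption does not hold.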

Nearly every reliability or quality textbook or guideline includes extensive discussions about MTBF, the exponential distribution, and a wide range of reliability related calculations. Our common understanding is generally supported by the plentiful references.

Ask a few folks around you when considering using MTBF. What do they define MTBF as representing? If you receive a consistent answer, you may just have a common understanding. If the understanding is also aligned with the underlying math and assumptions, even better.

When to Use MTBF Checklist

In summary all you need is:

  • A business interest in the time till half or more of the products fail
  • A design with a fixed chance of failure each hour of operation
  • A well educated team that understands the proper use of an inverse failure rate measure

I submit we are rarely interested in the time till the bulk of devices fail; rather, we are interested in the time to first failures or the time till some small percentage fail.

I suggest that very few devices or systems actually fail with a constant hazard rate. If your product does, prove it without grand waves of assumptions.

I have found that engineers, scientists, vendors, customers, and managers regularly misunderstand MTBF and how to properly use an MTBF value.

So back to the opening statement: it is possible, though not likely, that you will find an occasion to effectively use MTBF as a metric. Instead use reliability: the probability of successful operation over a stated period with stated conditions and definition of success. For example, 98% of office printers will function for 5 years without failure in an office. Pretty clear. Sure, we can fully define the function(s) and environment, and we need to do that anyway.
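To show how far apart a reliability statement and an MTBF value can be, here is a small sketch that converts between the two. It assumes a constant hazard rate, which the article cautions is rarely true. The function names are illustrative.

```python
import math


def reliability_exponential(t, mtbf):
    """R(t) = exp(-t / MTBF), valid only for a constant hazard rate."""
    return math.exp(-t / mtbf)


def mtbf_for_reliability(t, reliability):
    """MTBF needed to reach a given reliability at time t (same assumption)."""
    return -t / math.log(reliability)


required_mtbf = mtbf_for_reliability(5.0, 0.98)
print(f"MTBF needed for 98% at 5 years: {required_mtbf:.0f} years")
print(f"Reliability at 5 years with a 5 year MTBF: "
      f"{reliability_exponential(5.0, 5.0):.1%}")
```

Under this assumption, 98% reliability at 5 years corresponds to an MTBF of roughly 247 years, while a 5 year MTBF leaves only about 37% of units working at 5 years.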

Filed Under: Articles, NoMTBF

by nomtbf Leave a Comment

Reliability and Availability

Brent Moore The Old Reliable Bull Durham https://www.flickr.com/photos/brent_nashville/2163154869/in/gallery-fms95032-72157649635411636/

In English there is a lot of confusion on what reliability, availability and other ‘ilities mean in a technical way. Reliability as used in advertising and common discussions often means dependable or trustworthy. If talking about a product or system it may mean it will work as expected. [Read more…]

Filed Under: Articles, NoMTBF

by Fred Schenkelberg 5 Comments

The Law of Large Numbers and the Gambler’s Fallacy

edited by John Healy

This theorem is a fundamental element of probability theory. The law is basically that if one conducts the same experiment a large number of times the average of the results should be close to the expected value. Furthermore, the more trials conducted the closer the resulting average will be to the expected value.
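A small simulation can illustrate the idea. The sketch below rolls a fair six-sided die, whose expected value is 3.5. The die example and function name are chosen here for illustration.

```python
import random


def average_of_rolls(n, rng):
    """Average of n rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n


rng = random.Random(42)  # fixed seed so the sketch is repeatable
expected_value = 3.5
for n in (10, 1_000, 100_000):
    avg = average_of_rolls(n, rng)
    gap = abs(avg - expected_value)
    print(f"n = {n:>7}: average = {avg:.3f}, gap = {gap:.3f}")
```

The gap tends to shrink as n grows, though not necessarily on every single run. The law describes the long-run tendency, not a guarantee for any particular sequence.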

[Read more…]

Filed Under: Articles, CRE Preparation Notes, Probability and Statistics for Reliability Tagged With: Basic Probability Concepts

by Fred Schenkelberg 2 Comments

Is Environmental Testing Part of Product Reliability?

Environmental testing is the evaluation of a product or system in one or more stress conditions. Environmental as in that which surrounds and affects a product. Consider temperature. Is the product going to experience outdoor temperatures as found in Fargo, North Dakota or Belmopan, Belize?

The weather is one way to describe external stresses, yet it is so much more. Environmental testing may include fungus, insect, and animal exposure. The document MIL-STD-810G lists and describes testing methods for a wide range of environmental conditions. [Read more…]

Filed Under: Articles, Musings on Reliability and Maintenance Topics, on Product Reliability Tagged With: testing

by nomtbf Leave a Comment

Looking Forward to the MTBF Report

photolibrarian Matchbook, Jack Knarr, Reliable Cleaners, West Union, Iowa https://www.flickr.com/photos/photolibrarian/8127780278/in/gallery-fms95032-72157649635411636/

On social media the other day I ran across a comment from someone that took my breath away. They were looking forward to starting a new reliability, no, MTBF report. They were tasked with creating a measure of reliability for use across the company and they chose MTBF.

Sigh.

Where have we gone wrong?

I certainly do not blame the person. They have read about MTBF in many textbooks. They have studied reliability using MTBF and related measures, and found technical papers using the same. They may have seen industry reports and standards also.

MTBF is prevalent, and it is no wonder someone tasked with setting a metric would select MTBF. It’s easy to calculate. Just one number and bigger is better.

On the other hand

MTBF is roundly criticized across any reliability related forum or discussion group. There is progress in books, papers and standards. Yet it’s not reaching those new to reliability engineering.

This note will be short and have one request. Please tell those just getting started in reliability engineering not to consider using MTBF, not to request MTBF from vendors, and to actually do some thinking before selecting MTBF as their organization’s metric.

Better yet, challenge those using MTBF to explain in a coherent and rational manner why they are doing so. Ask them to validate their assumed constant failure rate or similar assumptions. Working together we can start a ripple that may help build the wave of knowledge to improve the state of reliability engineering.

Filed Under: Articles, NoMTBF

by Fred Schenkelberg Leave a Comment

Reading a Standard Normal Table

Editing and Contributions by John Healy

Before computers and statistical software, we relied on tables to determine values for common integration problems – the normal distribution in particular. There is no closed form solution for the integral of the normal distribution probability density function; it requires numerical methods to estimate the area under the curve. [Read more…]

Filed Under: Articles, CRE Preparation Notes, Probability and Statistics for Reliability Tagged With: Discrete and continuous probability distributions

by Fred Schenkelberg 5 Comments

When to Stop Testing

Stop testing when the testing provides no value.

If no one is going to review the results or use the information to make a decision, those are good signs that the testing provides no value. Of course, this may be difficult to recognize.

Some time ago while working with a product development team, one of the tasks assigned was to create an ongoing reliability test plan. This was just prior to the final milestone before starting production. During development, we learned quite a bit about the product design, supply chain, and manufacturing process. Each of which included a few salient risks to reliable performance.

 

[Read more…]

Filed Under: Articles, Musings on Reliability and Maintenance Topics, on Product Reliability Tagged With: testing

by nomtbf Leave a Comment

Why Doesn’t Product Testing Catch Everything?

photolibrarian West Union, Iowa, The Reliable Agency, B. Kamm, Jr., Matchbook, Farmers Casualty Company https://www.flickr.com/photos/photolibrarian/8244857538/in/gallery-fms95032-72157649635411636/

In an ideal world the designers of a product or system will have perfect knowledge of all the risks and failure mechanisms. The design is then built perfectly, without any errors or unexpected variation, and will simply function as expected for the customer.

Wouldn’t that be nice.

The assumption that we have perfect knowledge is the kicker though, along with perfect manufacturing and materials. We often do not know enough about:

  • Customer requirements
  • Operating environment
  • Frequency of use
  • Impact of design tradeoffs
  • Material variability
  • Process variability

We do know that we do not know everything we need to create a perfect product, thus we conduct experiments.

We test. [Read more…]

Filed Under: Articles, NoMTBF

by Fred Schenkelberg Leave a Comment

Central Limit Theorem

There are two basic ways to consider the central limit theorem. First consider a random variable, X, which has a mean, μ, and variance, σ². If we take a random sample of size n from f(X) and calculate the sample mean, X̄, then as n increases the distribution of the sample means, the X̄’s, approaches a normal distribution with mean, μ, and variance, σ²/n (that is, a standard deviation of σ/√n). The original data, X, may have any distribution, and when n is suitably large the distribution of the averages will approach a normal distribution. [Read more…]
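The first view of the theorem can be checked with a short simulation. This sketch draws samples from a skewed exponential distribution (mean 1, variance 1) and compares the spread of the sample means with σ²/n. The sample size and number of repetitions are arbitrary choices made here.

```python
import random
import statistics

rng = random.Random(7)   # fixed seed so the sketch is repeatable
mu, sigma2 = 1.0, 1.0    # exponential with rate 1 has mean 1 and variance 1
n = 30                   # size of each random sample
num_samples = 20_000     # number of sample means to collect

sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
variance_of_means = statistics.variance(sample_means)
print(f"mean of sample means     = {mean_of_means:.4f} (mu = {mu})")
print(f"variance of sample means = {variance_of_means:.4f} "
      f"(sigma^2 / n = {sigma2 / n:.4f})")
```

A histogram of `sample_means` would look close to a bell curve even though the underlying exponential data is strongly skewed.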

Filed Under: Articles, CRE Preparation Notes, Probability and Statistics for Reliability Tagged With: Statistical Terms

by Fred Schenkelberg 2 Comments

Reliability and Monte Carlo Determined Tolerances

Reliability and Monte Carlo Determined Tolerances

In the Monte Carlo method, one uses the idea that not all parts have the same dimensions, yet a normal distribution describing the variation of the part dimensions is not assumed.

Although the normal distribution does commonly apply, if the process includes sorting or regular adjustments or if the distribution is either clipped or skewed then the normal distribution may not be the best way to summarize the data.

As with any tolerance setting, getting it right is key for the proper functioning of a product. The Monte Carlo method allows you to consider and use the appropriate models for the variations that will exist across your components. [Read more…]
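A minimal sketch of a Monte Carlo tolerance stack follows. All part names, dimensions, distributions, and the height limit are hypothetical. They were chosen only to show non-normal inputs such as uniform, clipped, and skewed distributions.

```python
import random

rng = random.Random(2024)  # fixed seed so the sketch is repeatable


def clipped_normal(mean, std, low, high):
    """Normal draw, redrawn until it falls inside [low, high] (sorted parts)."""
    while True:
        x = rng.gauss(mean, std)
        if low <= x <= high:
            return x


def stack_height():
    """One simulated assembly of three hypothetical parts (dimensions in mm)."""
    plate_a = rng.uniform(9.95, 10.05)                # regularly adjusted process
    plate_b = clipped_normal(5.00, 0.03, 4.95, 5.05)  # inspected and sorted
    plate_c = rng.triangular(2.98, 3.06, 3.00)        # skewed, tail toward 3.06
    return plate_a + plate_b + plate_c


trials = 50_000
upper_limit = 18.12  # hypothetical maximum allowable stack height, mm
heights = [stack_height() for _ in range(trials)]
fraction_too_tall = sum(h > upper_limit for h in heights) / trials
print(f"Estimated fraction above {upper_limit} mm: {fraction_too_tall:.4f}")
```

In practice the distribution for each part would come from measured data rather than the assumed shapes used here.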

Filed Under: Articles, Musings on Reliability and Maintenance Topics, on Product Reliability Tagged With: tolerance analysis

by nomtbf Leave a Comment

Fixing Early Life Failures Can Make Your MTBF Worse

 

Figure: change in MTBF

Let’s say we have 6 months of life data on 100 units. We’re charged with looking at the data and determining the impact of fixing the problems that caused the earliest failures.

The initial look at the data includes 9 failures and 91 suspensions. Other than the nine failures, all units operated for 180 days. The MTBF is about 24k days. Having heard about Weibull plotting and using the beta value as a guide, we initially find the blue line in the plot. The beta value is less than one so we start looking for supply chain, manufacturing or installation caused failures, as we suspect early life failures dominate the time to failure pattern.

Initial Steps to Improve the Product

Given clues and evidence that some of the products failed early, we investigate and find evidence of damage to units during installation. In fact it appears the first four failures were due to installation damage. The fix will cost some money, so the director of engineering asks for an estimate of the effect of the change on the reliability of the system.

The organization uses MTBF, as does the customer. The existing MTBF of 24k days exceeds the customer’s requirement of 10k days, yet avoiding early problems may be worth the customer goodwill. The motivation is driven by continuous improvement and not out of necessity or customer complaints.

Calculation of Impact of Change on Reliability

One way to estimate the effect of a removal of a failure mechanism is to examine the data without counting the removed failure mechanism. So, if the change to the installation practice in the best case completely prevents the initial four failures observed we are left with just the 5 other failures that occurred over the 6 months.

Removing the four initial failures and calculating MTBF we estimate MTBF will change to about 300 days.

Hum?

We removed failures and the MTBF got worse?

What Could Cause this Kind of Change?

The classic calculation for MTBF is the total time divided by the number of failures. Taking a closer look at time to failure behavior of the two different failure mechanisms may reveal what is happening. The early failures have a decreasing failure rate (Weibull beta parameter less than 1) over the first two months of operation. Later, in the last couple of months of operation, 5 failures occur and they appear to have an increasing rate of failure (Weibull beta parameter greater than 1).

By removing the four early failures the Weibull distribution fit changes from the blue line to the black line (steeper slope).

Recall that the MTBF value represents the point in time when about 63% of units have failed. With only 9 total failures out of 100 units, only about 10% of units have failed, so the MTBF calculation is a projection into the future when most units have failed; it does not directly provide information about failures at 6 months or less.
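For readers who want to explore how the Weibull shape parameter affects the mean, here is a rough sketch. The characteristic life `eta` is a hypothetical value, not one fitted to the example data. Note that the 63.2% point falls exactly at the mean only when beta equals 1 (the exponential case). For other beta values it falls at the characteristic life.

```python
import math


def weibull_cdf(t, beta, eta):
    """Fraction failed by time t for a two-parameter Weibull distribution."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def weibull_mean(beta, eta):
    """Mean time to failure: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)


eta = 1000.0  # hypothetical characteristic life in days (not from the article)
for beta in (0.7, 1.0, 5.0):
    mean = weibull_mean(beta, eta)
    print(
        f"beta = {beta}: mean = {mean:.0f} days, "
        f"failed by eta = {weibull_cdf(eta, beta, eta):.1%}, "
        f"failed by mean = {weibull_cdf(mean, beta, eta):.1%}"
    )
```

With the same characteristic life, a shallow slope (beta below 1) pushes the mean out beyond eta and a steep slope pulls it slightly below. This is one reason a single mean value says little about when failures actually occur.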

In this case, when the four early failures are removed, the slope changed from about 0.7 to about 5; it rotated counterclockwise on the CDF plot.

If only using MTBF, the results of removing four failures from the data made the measured MTBF much worse and would have prevented us from improving the product. By fitting the data to a Weibull distribution we learned to investigate early life failures and, once that failure mechanism was removed, revealed a potentially serious wear out failure mechanism.

This is an artificial example, of course, yet it illustrates the degree to which an organization is blind to what is actually occurring by using only MTBF. Treat the data well and use multiple methods to understand the time to failure pattern.

Filed Under: Articles, NoMTBF

by Fred Schenkelberg 5 Comments

The Normal Distribution

The normal distribution is a continuous distribution that is useful in many statistical applications such as process capability, control charts, and confidence intervals about point estimates. On occasion, time to failure data may exhibit behavior that a normal distribution models well. [Read more…]

Filed Under: Articles, CRE Preparation Notes, Probability and Statistics for Reliability Tagged With: Discrete and continuous probability distributions

by Fred Schenkelberg 2 Comments

Reliability and Root Sum Squared Tolerances

The root sum squared (RSS) method is a statistical tolerance analysis method. In many cases, the actual individual part dimensions fall near the center of the tolerance range with very few parts with actual dimensions near the tolerance limits.

This, of course, assumes the part dimensions are tightly grouped and within the tolerance range.

Setting tolerances well, using the best available data about the part(s) variation, allows creating designs that function well given the expected part variation. This is better for reliable performance. Also, the same method can be applied when the loads and stresses are normally distributed.

Check that assumption with your data first, of course. [Read more…]
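As a simple illustration of the RSS idea, the sketch below compares a worst-case stack with the root sum squared stack. It assumes independent, normally distributed part dimensions centered in their tolerance ranges, with each tolerance representing the same number of standard deviations. The tolerance values are hypothetical.

```python
import math


def worst_case_tolerance(tolerances):
    """Worst-case stack: every part sits at its tolerance limit at once."""
    return sum(tolerances)


def rss_tolerance(tolerances):
    """Root sum squared stack: square root of the sum of squared tolerances."""
    return math.sqrt(sum(t * t for t in tolerances))


tolerances_mm = [0.05, 0.05, 0.04]  # hypothetical +/- tolerances for three parts
print(f"worst case: +/- {worst_case_tolerance(tolerances_mm):.3f} mm")
print(f"RSS:        +/- {rss_tolerance(tolerances_mm):.3f} mm")
```

The RSS result is always smaller than or equal to the worst-case result, which reflects how unlikely it is for every part to sit at its limit simultaneously.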

Filed Under: Articles, Musings on Reliability and Maintenance Topics, on Product Reliability Tagged With: tolerance analysis

by Fred Schenkelberg 5 Comments

Upcoming live online CRE Prep course

Learn more and register for the CRE Prep course here. (registration is closed at this time)

Overview

  • This online class provides an overview of the Body of Knowledge with guided self-study by the student, a review of previous CRE exam questions, and a hands-on, online approach.
  • This virtual classroom allows participants to actively participate by drawing on the virtual whiteboard, asking questions on the mic or web chat, taking polls, sharing desktops, and webcam capability.  Imagine following a lecture, highlighting or circling your problem areas on the virtual whiteboard, or even presenting tough CRE questions to be answered.  You can even e-mail questions to be answered in the next class!
  • If you are unable to make the live class or want to re-view videos from past sessions, you may do so by clicking any course date in the Course Archive.
  • This course is designed to supplement the knowledge of the individual having met the requirements for certification and is not designed to teach the entire body of knowledge.

The course is 16 two-hour sessions, a mix of lecture, discussion and Q&A focused on what you need to know to pass the ASQ CRE exam. Sure, beyond the course you will need to study. This involves practicing sample exam questions, using your references and calculator, and practicing finding answers quickly.

We discuss ways to study along with any issues or ideas you have as you prepare for the exam.

*Note: These online classes will be Archived and you will be able to access them later even if you cannot make it to the Live class session!  This means you may watch them again in case you forget something, miss a class, or can’t make it one day.  Saturdays may be scheduled with the instructor for extra sessions or make-up classes if needed.  

Mondays, July 13 – August 17, 2015 (3:00PM-5:00PM PDT)

Last Session for questions, September 21 and 28, 2015 (3:00PM-5:00PM PDT) 

This class ends prior to the semi-annual ASQ Section testing, where everyone may sign-up through their local Section to take their test.  Be sure to check out the ASQ website to get the early-bird rate before the end of this course!

Looking forward to seeing you in the course. And, of course, at any time, feel free to ask questions here, or in the CRE Preparation LinkedIn group.

Learn more and register for the CRE Prep course here.

Filed Under: Articles, CRE Prep, CRE Preparation Notes Tagged With: cre prep


© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy