Accendo Reliability

Your Reliability Engineering Professional Development Site

All articles listed in reverse chronological order.

by Tim Rodgers Leave a Comment

Are Your Suppliers Really Committed to Quality?

Suppliers always declare their commitment to the highest standards of quality as a core value, but many have trouble living up to that promise.

I can’t tell you how many times I’ve visited suppliers who proudly display their framed ISO certificates in the lobby yet suffer from persistent quality problems that lead to higher cost and schedule delays.

Here’s how you can tell if they’re really serious:

[Read more…]

Filed Under: Articles, Managing in the 2000s, on Leadership & Career

by nomtbf Leave a Comment

Are the Measures Failure Rate and Probability of Failure Different?

Failure rate and probability are similar. They are slightly different, too.

One of the problems with reliability engineering is so many terms and concepts are not commonly understood.

Reliability, for example, is commonly defined as dependable or trustworthy, as in you can count on him to bring the bagels. Reliability engineers, by contrast, define reliability as the probability of successful operation/function within a specific environment over a defined duration.

The same goes for failure rate and probability of failure. We often have specific data-driven or business-related goals behind the terms. Others do not.
If we do not state the time period over which either term applies, that is left to the imagination of the listener, which is rarely good.

Failure Rate Definition

There are at least two failure rates that we may encounter: the instantaneous failure rate and the average failure rate. The trouble starts when you ask for, or are asked about, an item’s failure rate. Which failure rate are you both talking about?

The instantaneous failure rate is also known as the hazard rate, h(t):

$$h(t)=\frac{f(t)}{R(t)}$$

Where f(t) is the probability density function and R(t) is the reliability function, which is one minus the cumulative distribution function. The hazard rate, failure rate, or instantaneous failure rate is the failures per unit time when the time interval is very small at some point in time, t. Thus, if a unit has been operating for a year, this calculation would provide the chance of failure in the next instant of time.

This is not useful for the calculation of the number of failures over that year, only the chance of a failure in the next moment.
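To make the ratio concrete, here is a minimal sketch that evaluates h(t) = f(t)/R(t) for a Weibull distribution. The Weibull choice and the parameter values (beta = 2, eta = 5 years) are assumptions picked only for illustration; nothing above ties the definition to a particular distribution.

```python
import math

def weibull_pdf(t, beta, eta):
    """Probability density f(t) of a two-parameter Weibull distribution."""
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

def weibull_reliability(t, beta, eta):
    """Reliability R(t) = 1 - F(t)."""
    return math.exp(-((t / eta) ** beta))

def hazard_rate(t, beta, eta):
    """Instantaneous failure rate h(t) = f(t) / R(t)."""
    return weibull_pdf(t, beta, eta) / weibull_reliability(t, beta, eta)

# Hypothetical wear-out item: shape beta = 2, scale eta = 5 years.
beta, eta = 2.0, 5.0
for t in (0.5, 1.0, 2.0):
    print(f"t = {t} yr: h(t) = {hazard_rate(t, beta, eta):.3f} failures per year")
```

With a shape parameter above 1 the hazard rate climbs with age, so the value at one year says nothing about how many failures accumulated during that year.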

The probability density function provides the fraction failing over an interval of time. A histogram of the count of failures per month, for example, would roughly describe a PDF, or f(t). Dividing f(t) by R(t) at each point in time then traces out the instantaneous failure rate curve.

Sometimes we are interested in the average failure rate, AFR. The AFR over a time interval, t1 to t2, is found by integrating the instantaneous failure rate over the interval and dividing by t2 – t1. When we set t1 to 0, we have

$$AFR(T)=\frac{H(T)}{T}=\frac{-\ln R(T)}{T}$$

Where H(T) is the integral of the hazard rate, h(t), from time zero to time T,
T is the time of interest, which defines a time period from zero to T,
and R(T) is the reliability function, or probability of successful operation from time zero to T.
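As a rough illustration of how the average and instantaneous rates can differ, the sketch below computes AFR(T) = -ln R(T) / T and compares it with h(T). The Weibull reliability function and its parameters (beta = 2, eta = 5 years) are hypothetical, chosen only so the two numbers come out visibly different.

```python
import math

def weibull_reliability(t, beta, eta):
    """Reliability R(t) for a two-parameter Weibull distribution."""
    return math.exp(-((t / eta) ** beta))

def weibull_hazard(t, beta, eta):
    """Closed-form Weibull hazard rate h(t)."""
    return (beta / eta) * (t / eta) ** (beta - 1)

def average_failure_rate(T, reliability):
    """AFR(T) = H(T) / T = -ln R(T) / T over the interval from 0 to T."""
    return -math.log(reliability(T)) / T

# Hypothetical parameters, in years.
beta, eta = 2.0, 5.0
T = 1.0

afr = average_failure_rate(T, lambda t: weibull_reliability(t, beta, eta))
h_at_T = weibull_hazard(T, beta, eta)

print(f"AFR over 0 to {T} yr: {afr:.3f} per year")
print(f"h(t) at t = {T} yr:  {h_at_T:.3f} per year")
```

For these assumed values the average over the first year is half the instantaneous rate at the end of the year, which is exactly why it matters to say which failure rate you mean.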

A very common understanding of the rate of failure is the count of failures over some time period divided by the number of hours of operation. This results in the fraction expected to fail on average per hour. I’m not sure which of the failure rate definitions above this fits, and yet I find this is how most people think of failure rate.

If we have 1,000 resistors that each operate for 1,000 hours, and then a failure occurs, we have 1 / (1,000 x 1,000 ) = 0.000001 failures per hour.
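The resistor arithmetic can be sketched in a few lines. The last step, turning the rate into a probability of failure, is not part of the calculation above: it assumes a constant hazard rate (an exponential time-to-failure distribution), which the single observed failure does not establish.

```python
import math

failures = 1
units = 1_000
hours_per_unit = 1_000

# Total accumulated operating time across all resistors.
unit_hours = units * hours_per_unit

# Failures divided by hours of operation.
rate_per_hour = failures / unit_hours

# ASSUMPTION: constant hazard rate, so R(t) = exp(-rate * t).
# Probability that one resistor fails within its 1,000 hours under that assumption.
prob_fail_by_1000h = 1.0 - math.exp(-rate_per_hour * hours_per_unit)

print(f"failure rate: {rate_per_hour:.6f} failures per hour")
print(f"probability of failure by {hours_per_unit} h (constant-rate assumption): "
      f"{prob_fail_by_1000h:.6f}")
```

Only under that constant-rate assumption do the instantaneous and average failure rates coincide with this simple ratio.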

Let’s save the discussion about the many ways to report failure rates, AFR (two methods, at least), FIT, PPM/K, etc., for another time.

Probability of Failure Definition

I thought the definition of failure rate would be straightforward until I went looking for a definition. It is with trepidation that I start this section on the probability of failure definition.

To my surprise, it is actually rather simple: the definition in common use and the mathematical definition are the same. There are two equivalent ways to phrase the definition:

  1. The probability or chance that a unit drawn at random from the population will fail by time t.
  2. The proportion or fraction of all units in the population that fail by time t.

We can talk about individual items or all of them concerning the probability of failure. If we have a 1 in 100 chance of failure over a year, then that means we have about a 1% chance that the unit we’re using will fail before the end of the year. Or it means if we have 100 units placed into operation, we would expect one of them to fail by the end of the year.

The probability of failure for a segment of time is defined by the cumulative distribution function or CDF.
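Here is a small sketch of both phrasings using a CDF. The Weibull form and its parameters (beta = 2, eta = 10 years) are assumptions, picked so that the one-year probability of failure lands near the 1 in 100 used above.

```python
import math

def weibull_cdf(t, beta, eta):
    """Probability of failure by time t, F(t) = 1 - R(t)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

# Hypothetical parameters, in years.
beta, eta = 2.0, 10.0

# Phrasing 1: chance that a unit drawn at random fails by the end of year one.
p_one_year = weibull_cdf(1.0, beta, eta)

# Phrasing 2: expected number of failures among 100 units by the end of year one.
expected_failures = 100 * p_one_year

# Probability of failure within a segment of time (here, during year two).
p_during_year_two = weibull_cdf(2.0, beta, eta) - weibull_cdf(1.0, beta, eta)

print(f"P(fail by 1 yr)       = {p_one_year:.4f}")
print(f"Expected failures/100 = {expected_failures:.2f}")
print(f"P(fail during year 2) = {p_during_year_two:.4f}")
```

Every number here is tied to a stated interval, which is the habit the definitions above are meant to encourage.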

When to Use Failure Rate or Probability of Failure

This depends on the situation. Are you talking about the chance of failure in the next instant or the chance of failing over a time interval? Use failure rate for the former, and probability of failure for the latter.

In either case, be clear with your audience which definition (and assumptions) you are using. If you know of other failure rate or probability of failure definitions, or if you know of a great way to keep all these definitions clearly sorted, please leave a comment below.

Filed Under: Articles, NoMTBF Tagged With: Failure, Failure Rate

by James Reyes-Picknell Leave a Comment

Rapid Proactive Maintenance Program – PM Program – Part 3

In the first installment of this series, we described the basics behind proactive maintenance and some of the considerations users need to make.

The second installment described RCM programs – the “gold standard” if you like for program development. This third installment describes what you can do if you realize you need a program but have nothing. It would also work if you’ve got a PM program but you are unhappy with the results you are getting. Chances are that something is missing or not being done often enough.

We’ve often encountered maintenance programs that are lacking. They need a stronger proactive component and they need it quickly to get things under control.

This guideline is intended to help to get things under control. [Read more…]

Filed Under: Articles, Conscious Asset, on Maintenance Reliability

by Mike Sondalini Leave a Comment

Shaft Sealing with a Packed Gland

What you will learn from this article:

  • How shaft packing works.
  • What to consider when selecting and using shaft packing.
  • Good installation practices.
  • Proper commissioning of shaft packing.

[Read more…]

Filed Under: Articles, on Maintenance Reliability, Plant Maintenance

by Greg Hutchins Leave a Comment

US Federal Enterprise Risk Management Requirements

Guest Post by Greg Hutchins (first posted on CERM ® RISK INSIGHTS – reposted here with permission)

Last year, we reported that the White House Office of Management and Budget (OMB – executive office) is requiring US departments to design and implement Enterprise Risk Management (ERM). The requirements are part of the OMB Circular A 11 Section 270 – Performance and Strategic Reviews.

US Departments are:

expected to manage risks and challenges related to delivering the organization’s mission. ERM is a strategic discipline that can help agencies to properly identify and manage risks to performance, especially those risks related to achieving strategic objectives.

[Read more…]

Filed Under: Articles, CERM® Risk Insights, on Risk & Safety Tagged With: risk management

by Fred Schenkelberg 2 Comments

Establishing Part Specific Reliability Specifications

Unless you are working with raw materials directly, you rely on your suppliers to provide reliable parts.

Do your suppliers know your reliability objectives for the parts they supply?

If you didn’t tell them, they probably do not know. If you did tell them, did you make the reliability specification clear and understandable?

As with any specification, clear communication is essential. Guessing or assuming both parties know and have the same reliability goals is, well, not a good practice. The ability of a supplier to build and deliver parts that meet all your specifications has to include a clear and understandable reliability specification.

There is a range of common reliability specifications in use, and some are better than others. Let’s start with a brief review of reliability specification types.

Then we will briefly outline how you establish the reliability specifications for each supplied component. [Read more…]

Filed Under: Articles, CRE Preparation Notes, Reliability in Design and Development Tagged With: Establishing specifications

by Fred Schenkelberg Leave a Comment

First Step in Analyzing Repairable Systems Data

Using the right plot enables your team to know what is working or needs improvement.

Part 4 of 7

Your facility has data and maybe too much data. Using simple plotting may be the key to unlocking how well your maintenance program is performing.

Building on the concept of reliability growth modeling that James Kovacevic described, a convenient way to quickly visualize your repairable system failure data is with a mean cumulative function (MCF) plot. [Read more…]

Filed Under: Articles, Maintenance and Reliability, on Maintenance Reliability Tagged With: data analysis

by Tim Rodgers Leave a Comment

Why Should a Supplier Work Harder For You?

A recent LinkedIn discussion addressed the question of the best strategy for dealing with poor supplier performance.

A lot of the respondents seemed to advocate a punitive approach, either threatening the loss of future business if performance doesn’t improve, or combing through the terms & conditions of the contract for enforcement language.

I’ve always thought that there’s a lot of similarity between managing suppliers and managing subordinates, and I wonder if some of these same people threaten their teams with punitive actions when individual performance doesn’t meet expectations. [Read more…]

Filed Under: Articles, Managing in the 2000s, on Leadership & Career Tagged With: supplier

by Fred Schenkelberg Leave a Comment

Improve Decision Making with Statistics

We make decisions all the time. Often our decision making is with little more than a gut feeling.

When faced with a major decision we often look to data to help us decide. Is the product reliable enough as designed? Which field returns indicate we should stop production?

Some decisions may help us earn or lose thousands if not millions of dollars.

Deciding to delay a product launch by six months means we have no revenue for the duration. The delay may also permit us to address a design flaw that would cause half the products to fail within a few months.

The latter may cause loss of market share, erosion of brand loyalty, not to mention the cost of warranty claims. [Read more…]

Filed Under: Articles, Musings on Reliability and Maintenance Topics, on Product Reliability Tagged With: statistics

by nomtbf Leave a Comment

The Magic Math of Meeting MTBF Requirements

I recently heard from a reader of NoMTBF. She wondered about a supplier’s argument that they meet the reliability or MTBF requirements. She was right to wonder.

Estimating the reliability performance of a new design is difficult.

There are good and better practices to justify claims about future reliability performance. Likewise, there are just plain poor approaches, too. Plus there are approaches that should never be used.

The Vendor Calculation to Support Claim They Meet Reliability Objective

Let’s say we contract with a vendor to create a navigation system for our vehicle. The specification includes functional requirements. Also it includes form factor and a long list of other requirements. It also clearly states the reliability specification. Let’s say the unit should last with 95% probability over 10 years of use within our vehicle. We provide environmental and function requirements in detail.

The vendor first converts the 95% probability of success over 10 years into MTBF, claiming they are ‘more familiar’ with MTBF. They ignore the requirements for probability of first month of operation success. Likewise they ignore the 5 year targeted reliability, or as they would convert, MTBF requirements.

[Note: if you were tempted to calculate the equivalent MTBF, please don’t. It’s not useful, nor relevant, and a poor practice. Suffice it to say it would be a large and meaningless number]

RED FLAG Converting the requirement into MTBF suggests they may be making simplifying assumptions. This may permit easier use of estimation, modeling, and testing approaches.

The Vendor’s Approach to ‘Prove’ They Meet the MTBF Requirement

The vendor reported they met the reliability requirement using the following logic:

Of the 1,000 (more actually) components we selected 6 at random for accelerated life testing. We estimated the lower 60% confidence of the probability of surviving 10 years given the ALT results. Then converted the ALT results to MTBF for the part.

We then added the Mil Hdbk 217 failure rate estimate to the ALT result for each of the 6 parts.

RED FLAG This one has me wondering about the rationale for adding the failure rates from an ALT and a parts count prediction. It would make the failure rate higher. Maybe it was a means to add a bit of margin to cover the uncertainty? I’m not sure. Do you have any idea why someone would do this? Are they assuming the ALT did not actually measure anything relevant or any specific failure mechanisms, or that they used a benign stress? ALT details were not provided.

The Approach Gets Weird Here

Then we use a 217 parts count prediction along with the modified 6 component failure rates to estimate the system failure rate, and with a simple inversion estimated the MTBF. They then claimed the system design will meet the field reliability performance requirements.

RED FLAG Mil HDBK 217 F in section 3.3 states:

Hence, a reliability prediction should never be assumed to represent the expected field reliability …

If you are going to use a standard, any standard, you should read it. Read it to understand when and why it is useful or not useful.

What Should the Vendor Have Done Instead?

There are a lot of ways to create a new design and meet reliability requirements.

  • The build, test, fix approach or reliability growth approach works well in many circumstances.
  • Using similar actually fielded systems failure data. It may provide a reasonable bound for an estimate of a new system. It may also limit the focus on the accelerated testing to only the novel or new or high risk areas of the new design — given much of the design is (or may be) similar to past products.
  • Using a simple reliability block diagram or fault tree analysis model to assemble the estimates, test results, engineering stress/strength analysis (all better estimation tools than parts count, in my opinion) and calculate a system reliability estimate.
  • Using a risk of failure approach with FMEA and HALT to identify the likely failure mechanisms then characterize those mechanisms to determine their time to failure distributions. If there is one or a few dominant failure mechanisms, that work would provide a reasonable estimate of the system reliability.
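As one illustration of the reliability block diagram idea in the list above, here is a minimal sketch that combines block reliabilities for a simple series system. The block names and their 10-year reliability values are invented for the example; in practice each would come from test results, field data, or stress/strength analysis.

```python
import math

def series_reliability(block_reliabilities):
    """All blocks must work: system reliability is the product."""
    return math.prod(block_reliabilities)

def parallel_reliability(block_reliabilities):
    """Any one block is enough: one minus the product of unreliabilities."""
    return 1.0 - math.prod(1.0 - r for r in block_reliabilities)

# Hypothetical 10-year reliabilities for a navigation unit's major blocks.
blocks = {
    "display": 0.99,
    "gps receiver": 0.985,
    "power supply": 0.98,
}

system_reliability = series_reliability(blocks.values())
requirement = 0.95  # 95% probability of success over 10 years

print(f"Series system reliability at 10 years: {system_reliability:.4f}")
print(f"Meets {requirement:.0%} requirement: {system_reliability >= requirement}")
```

The model keeps everything in terms of reliability over the stated duration, so there is no detour through MTBF.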

In all cases focus on failure mechanisms and how the time to failure distribution changes given changes in stress / environment / use conditions. Monte Carlo may provide a suitable means to analyze a great mixture of data to determine an estimate. Use reliability, the probability of success over a duration.
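A Monte Carlo estimate along these lines might look like the sketch below. It assumes three independent failure mechanisms, each with a Weibull time to failure distribution, and treats the unit as failed at the first mechanism to occur. The mechanisms and all parameter values are hypothetical placeholders, not results from any test.

```python
import math
import random

# Hypothetical failure mechanisms: (shape beta, scale eta in years).
mechanisms = [
    (0.8, 2000.0),  # early-life mechanism, decreasing hazard
    (1.0, 500.0),   # random mechanism, constant hazard
    (3.0, 40.0),    # wear-out mechanism, increasing hazard
]

def simulate_system_reliability(mechanisms, mission_time, trials, seed=1):
    """Fraction of simulated units whose first failure occurs after mission_time."""
    rng = random.Random(seed)
    survivors = 0
    for _ in range(trials):
        # random.weibullvariate takes (scale, shape) in that order.
        first_failure = min(rng.weibullvariate(eta, beta) for beta, eta in mechanisms)
        if first_failure > mission_time:
            survivors += 1
    return survivors / trials

def analytic_system_reliability(mechanisms, mission_time):
    """Closed-form check: product of the individual Weibull reliabilities."""
    return math.prod(math.exp(-((mission_time / eta) ** beta)) for beta, eta in mechanisms)

mc_estimate = simulate_system_reliability(mechanisms, mission_time=10.0, trials=100_000)
exact = analytic_system_reliability(mechanisms, mission_time=10.0)

print(f"Monte Carlo R(10 yr): {mc_estimate:.4f}")
print(f"Analytic    R(10 yr): {exact:.4f}")
```

For independent Weibull mechanisms the closed form makes simulation unnecessary; the simulation earns its keep when the inputs are a mixture of distributions, test data, and use conditions that no simple formula covers.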

In short, do the work to understand the design, its weaknesses, and the time to failure behavior under different use/condition scenarios, and make justifiable assumptions only when necessary.

Summary

We engage vendors to supply custom subsystems given their expertise and ability to deliver the units we need for our vehicle. We expect them to justify that they meet reliability requirements in a rational and defensible manner. While we do not want to dictate the approach to the design or the estimate of reliability performance, we certainly have to judge the acceptability of the claims that they meet the requirements.

What do you report when a customer asks if your product will meet the reliability requirements? Add to the list of possible approaches in the comments section below.

Related

How to Calculate MTBF

Questions to ask a vendor

MTBF: According to a Component Supplier

Filed Under: Articles, NoMTBF

by James Reyes-Picknell Leave a Comment

If You Want a Proactive Maintenance Program That Really Works, Then Reliability Centered Maintenance (RCM) Is The Way

This article is Part Two in my three part series about “PM” programs.

Reliability Centered Maintenance (RCM) is the world’s leading method for identifying maintenance and other activities required to sustain reliable performance of physical assets. If you want a proactive maintenance program that really works, then Reliability Centered Maintenance is the most thorough approach you can take to get there. [Read more…]

Filed Under: Articles, Conscious Asset, on Maintenance Reliability Tagged With: RCM

by Mike Sondalini Leave a Comment

A Brief Introduction to Process Chemical Corrosion

Chemical corrosion can destroy the containment materials in contact with a process.

Means exist to mitigate and even prevent chemical corrosion. This article focuses on several such methods.

Acceptable corrosion

At times chemical corrosion is acceptable and one need only allow for it by using thicker materials. An example is the storage of sulphuric acid in mild steel tanks at ambient conditions and concentrations [Read more…]

Filed Under: Articles, on Maintenance Reliability, Plant Maintenance Tagged With: corrosion

by Greg Hutchins 6 Comments

ISO 31000 Principles of Risk Management

Guest Post by Greg Hutchins (first posted on CERM ® RISK INSIGHTS – reposted here with permission)

ISO 31000 is organized around 11 risk management principles. A management principle refers to a fundamental idea, rule, or truth about a subject. ISO 31000 risk principles serve as the guideline, method, logic, design, and implementation for the risk management framework and its process. [Read more…]

Filed Under: Articles, CERM® Risk Insights, on Risk & Safety Tagged With: ISO 31000, risk management

by Fred Schenkelberg Leave a Comment

Take Action to Deal with Part Obsolescence

Even products with relatively quick design cycles and short stays in the market deal with part obsolescence.

Long design periods along with long durations in service or in production simply increase the chance that one or more parts will become obsolete.

Designing systems with part obsolescence in mind helps. Working with suppliers to select parts with many sources, with long-term plans to produce, and with long term commitments, all may help. Even then, companies change priorities, go out of business, or simply discontinue the part you need. [Read more…]

Filed Under: Articles, CRE Preparation Notes, Reliability in Design and Development Tagged With: Parts obsolescence management

by James Kovacevic 4 Comments

Quantify the Improvements (or Gaps) In Your Reliability

Using a Crow-AMSAA [Reliability Growth Analysis (RGA)] to Quantify Your Reliability Improvements (or Losses)

Part 3 of 7

Imagine being able to predict the next time a failure will occur for a piece of equipment without a huge amount of work. Wouldn’t it be nice to know the approximate point in time that a failure will occur on a critical piece of equipment? It is possible, but I am not talking about using MTBF, as it is not a good measure (if you need to understand why, please visit http://www.NoMTBF.com). What I am talking about is a Crow-AMSAA analysis. [Read more…]

Filed Under: Articles, Maintenance and Reliability, on Maintenance Reliability

© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy