Accendo Reliability

Your Reliability Engineering Professional Development Site

by Greg Hutchins

Why Reliability Needs Risk Management to Succeed

Guest Post by John Ayers (first posted on CERM ® RISK INSIGHTS – reposted here with permission)

Most of my career was spent with the Department of Defense (DOD) industry. The many programs I worked on included a fairly difficult reliability requirement. I was taught that reliability is designed into a system. I learned that verifying a reliability requirement was done by analysis. But for the system reliability to succeed, you need to consider the manufacturing and installation of the system. This is when risk management comes into play to ensure system reliability requirements succeed. This paper explains why.

Reliability Requirement

A reliability requirement for a DOD development project is typically defined as availability. Availability is the probability that a system will work as required during the period of a mission. The mission could be the 18-hour span of an aircraft flight. The mission period could also be the 3- to 15-month span of a military deployment. Availability includes non-operational periods associated with reliability, maintenance, and logistics.
There are three qualifications that need to be met for a system to be available:

  1. Functioning system not out of service for repairs or inspections
  2. Functioning under normal conditions and operates in an ideal setting at an expected rate
  3. Functioning when needed and operational at any time…

Reliability Design Analysis

The starting point of a reliability analysis is to create a model. The model includes every major component in the system as well as their reliability terms. The major terms are:

  1. Mean Time Between Failures (MTBF) is the average operating time between failures of a repairable product. Its reciprocal, the failure rate, is often quoted as the number of failures per million hours.
  2. Mean Time To Repair (MTTR) is the time needed to repair a failed hardware module. In an operational system, repair generally means replacing a failed hardware part.
  3. Mean Time To Failure (MTTF) is a basic measure of reliability for non-repairable systems. It is the mean time expected until the first failure of a piece of equipment.
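
As a rough illustration of how these terms combine, the sketch below computes a failure rate from an MTBF and the commonly used steady-state (inherent) availability, MTBF / (MTBF + MTTR). The article does not give a formula, so this is an assumption on my part. It also ignores the maintenance and logistics delays that the availability definition above includes. The function names and the numbers are hypothetical.

```python
def failure_rate_per_million_hours(mtbf_hours: float) -> float:
    """Convert an MTBF in hours to a failure rate in failures per million hours."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1_000_000 / mtbf_hours


def inherent_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability assuming only failure and repair time matter.

    Assumption: no logistics or administrative delay is counted, so this is
    an upper bound on the availability a fielded system would see.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical module: fails on average every 2,000 hours, 4 hours to swap.
    print(failure_rate_per_million_hours(2000.0))
    print(inherent_availability(2000.0, 4.0))
```

A shorter repair time or a longer MTBF both push the result toward 1, which is why the model iterations described next adjust those terms.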

The model is run numerous times. Each iteration involves making changes to the model, such as adding redundancy, eliminating single points of failure, and altering reliability terms, until the output of the model shows the availability requirement is met. The reliability requirements for the major components flow out of the model.
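
The iteration described above can be sketched with a very small model. This is not the tool used on any real program. It is a minimal example that assumes the components are in series, fail independently, and that redundant copies are in simple parallel. The component names, availability values, and the 0.97 requirement are all hypothetical.

```python
from math import prod


def redundant_availability(availability: float, copies: int) -> float:
    """Availability of `copies` identical units in parallel (any one suffices)."""
    return 1.0 - (1.0 - availability) ** copies


def system_availability(components: dict, copies: dict) -> float:
    """Series system: every (possibly redundant) component must be available."""
    return prod(
        redundant_availability(a, copies[name]) for name, a in components.items()
    )


def add_redundancy_until_met(components: dict, required: float, max_copies: int = 4):
    """Add one redundant copy at a time to the weakest block until the
    requirement is met or no block can take another copy."""
    copies = {name: 1 for name in components}
    while system_availability(components, copies) < required:
        candidates = [n for n in components if copies[n] < max_copies]
        if not candidates:
            break  # requirement cannot be met within the redundancy limit
        weakest = min(
            candidates,
            key=lambda n: redundant_availability(components[n], copies[n]),
        )
        copies[weakest] += 1
    return copies, system_availability(components, copies)


# Hypothetical component availabilities, for illustration only.
components = {"antenna": 0.99, "power_supply": 0.95, "processor": 0.98}
copies, achieved = add_redundancy_until_met(components, required=0.97)

if __name__ == "__main__":
    print(copies, round(achieved, 5))
```

The number of copies each component ends up with is one way the component-level requirements "flow out" of the model. A real analysis would also weigh cost, weight, and common-cause failures, which this sketch leaves out.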

At this point, there is an analysis that shows the availability requirement has been met. But this is just a paper verification of meeting the requirement. We still have to manufacture, assemble, install, and test the system, and things can go wrong, as my example shows.

Example 

I was the lead for a red team review (independent review) of a preliminary design review (PDR) for a Radome. A Radome is a fabric dome that covers an antenna or radar system to protect it from the environment. The Radome had to survive a 200-mile-per-hour wind, a very extreme requirement.

The PDR did not go well. The availability analysis did not show the requirement was met. It was based on using the best known fabric for the Radome. A new fabric invention would be needed to meet the requirement. The red team failed the review, which meant the design team had to go back to the drawing board. The red team was disbanded. About a year and a half later, I heard that the Radome, which had been installed, ruptured during acceptance testing. The investigation revealed two causes of the rupture: the Radome was over-tested (too much pressure) and it was not fabricated properly. In other words, the failure was due to mistakes made during the manufacturing and installation phases. I never found out how they met the availability requirement. My guess is they reduced the requirement.

How Risk Management Could Have Helped

If a risk assessment had been conducted for the manufacturing phase of the project, it most likely would have identified a number of risks associated with the fabrication of the Radome. The risks would have been analyzed, handled, and monitored and controlled. The same can be said for the installation phase. Assuming the mitigation plans prevented the fabrication and over-testing mistakes, the reliability analysis could have been verified.
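
To make the identify, analyze, handle, and monitor steps a little more concrete, here is a minimal risk register sketch. It ranks risks by a simple exposure score (probability times impact), which is one common convention and not necessarily the method used on the Radome program. Every entry, probability, and impact rating below is hypothetical and chosen only to mirror the kinds of mistakes described in the example.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    phase: str          # project phase where the risk arises
    description: str    # what could go wrong
    probability: float  # estimated likelihood, 0.0 to 1.0
    impact: int         # consequence rating, 1 (minor) to 5 (severe)
    mitigation: str     # planned handling action

    @property
    def exposure(self) -> float:
        """Simple exposure score used to rank risks."""
        return self.probability * self.impact


def rank_risks(risks: list) -> list:
    """Return risks ordered from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


# Hypothetical register entries, for illustration only.
register = [
    Risk("installation", "Fabric damaged during handling on site", 0.1, 3,
         "Handling procedure and inspection before inflation"),
    Risk("manufacturing", "Fabric panels not fabricated to specification", 0.3, 5,
         "In-process inspection and seam strength sampling"),
    Risk("test", "Acceptance test applies pressure above the design limit", 0.2, 5,
         "Independent review of test procedure and pressure relief limit"),
]

if __name__ == "__main__":
    for risk in rank_risks(register):
        print(f"{risk.exposure:.1f}  [{risk.phase}] {risk.description}")
```

The ranked list is where monitoring and control start: the highest-exposure items get mitigation plans first and are reviewed as each phase proceeds.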

Lessons Learned

I will never know if the availability requirement was met when tested. The analysis showed it was met, but due to manufacturing and installation mistakes, it was not proven. I think the main lesson learned is to always conduct a risk assessment for all phases of the project (the Radome, in this case). This includes doing one for the design assumptions as well. Many times, a bad design assumption causes project failure.

Summary

Reliability needs risk management to be successful. I think the example shows that. I have many more examples like it for a later time.

Bio:

Currently John is an author, writer and consultant. He authored a book entitled ‘Project Risk Management’. He has written numerous risk papers and articles. He writes a risk column for CERM.

John earned a BS in Mechanical Engineering and MS in Engineering Management from Northeastern University. He has extensive experience with commercial and DOD companies. He is a member of PMI (Project Management Institute). John has managed numerous large, highly technical development programs worth in excess of $100M. He has extensive domestic and foreign subcontract management experience. John has held a number of positions over his career including: Director of Programs; Director of Operations; Program Manager; Project Engineer; Engineering Manager; and Design Engineer. He has experience with: design; manufacturing; test; integration; subcontract management; contracts; project management; risk management; and quality control. John is a certified Six Sigma specialist, and certified to level 2 EVM (earned value management). https://projectriskmanagement.info/

If you want to be a successful project manager, you may want to review the framework and cornerstones in my book. The book is innovative and includes unique knowledge, explanations and examples of the four cornerstones of project risk management. It explains how the four cornerstones are integrated together to effectively manage the known and unknown risks on your project.

Filed Under: Articles, CERM® Risk Insights, on Risk & Safety

About Greg Hutchins

Greg Hutchins PE CERM is the evangelist of Future of Quality: Risk®. He has been involved in quality since 1985 when he set up the first quality program in North America based on Mil Q 9858 for the natural gas industry. Mil Q became ISO 9001 in 1987.

He is the author of more than 30 books. ISO 31000: ERM is the best-selling and highest-rated ISO risk book on Amazon (4.8 stars). Value Added Auditing (4th edition) is the first ISO risk-based auditing book.
