Accendo Reliability

Your Reliability Engineering Professional Development Site

by Fred Schenkelberg 2 Comments

Program Elements Part 2

This is the second part of a two-part series in which I outline the basic elements of creating and supporting a reliability program.

Test to Failure to Discover Design Weaknesses

As physical hardware becomes available, use step stress testing and Highly Accelerated Life Testing (HALT) to find the weak links in the design. There is no room here for success testing; one must test to failure. Focus on those items that are NUD: New to the organization, Unique to this product, and Difficult to design and/or manufacture.

Start at the lowest level subassembly that is conveniently testable and continue later as higher levels of integration become available. The lower levels require more electrical, mechanical, and software fixturing to test while under stress, but the stress levels can go farther. More fully integrated products are more easily tested and require less fixturing, but the stress level is limited by the weakest subassembly.

Each failure should be investigated to understand the root cause, no matter what kind of stress or stress level caused it. Until root cause understanding exists, one cannot make any estimates of the relevance of the failure mode. Indeed, some things do not have to be fixed, but it is often easier to fix the issue than to establish whether it can be safely ignored. Repeat HALT with each round of prototypes.
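
The step stress idea above can be sketched in a few lines of Python. This is only an illustration, not a prescribed procedure: the function names, the `unit_fails` callback (a stand-in for actually running the unit at a given stress level), and the numbers in the usage example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StepStressResult:
    failure_level: Optional[int]      # stress level at first failure, None if the unit survived
    last_passed_level: Optional[int]  # highest level survived, None if it failed at the start


def run_step_stress(start: int, step: int, max_level: int,
                    unit_fails: Callable[[int], bool]) -> StepStressResult:
    """Raise the stress in fixed steps until the unit fails or the chamber limit is reached."""
    level = start
    last_passed = None
    while level <= max_level:
        if unit_fails(level):
            return StepStressResult(level, last_passed)
        last_passed = level
        level += step
    return StepStressResult(None, last_passed)


def design_margin(failure_level: int, spec_limit: int) -> int:
    """Margin between the observed failure level and the specification limit."""
    return failure_level - spec_limit


# Hypothetical unit that fails once the temperature reaches 105 degrees C.
result = run_step_stress(start=50, step=10, max_level=150,
                         unit_fails=lambda temp_c: temp_c >= 105)
margin = design_margin(result.failure_level, spec_limit=70)
```

The margin number on its own says little; as noted above, each failure still needs a root cause investigation before its relevance can be judged.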

Use Manufacturing Screening to Ensure Early Life Success

Some components are weakened by anomalies in their manufacturing process or by damage in shipping, storage, and handling. These defects are latent (hidden): the parts will test well in manufacturing but fail early in the product’s life (within the first 90 days), typically because they contain stress concentrators.

After corrective actions from HALT and step stress testing have established good design margin, manufacturing screening will be able to cause weak components to fail without removing significant fatigue life from the good components. In this way latent defects can be eliminated before shipping the device. Keep in mind that without first having a rugged design, manufacturing screening may decrease life and increase warranty costs.

Run-in, Burn-in, Environmental Stress Screening (ESS), and Highly Accelerated Stress Screening (HASS) are increasingly sophisticated methods of precipitating and detecting these hidden defects. After precipitation by stress, detection screens with appropriate testing must be used to locate the (now) patent, or visible, flaw. Proof of Screen is run to ensure the trial regimen is tough enough to precipitate defects, and Safety of Screen is done to ensure enough fatigue life is left.
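
A rough way to put numbers on Proof of Screen and Safety of Screen is sketched below. It rests on assumptions that are not stated in the article: that fatigue damage accumulates linearly from pass to pass (a Miner's-rule style simplification), and that Proof of Screen is run on units with known, seeded defects. The function names and figures are hypothetical.

```python
def max_life_fraction_per_pass(passes_survived: int) -> float:
    """Upper bound on the fatigue life one screen pass consumes.

    Assumes linear damage accumulation: if good units survive N repeated
    passes of the screen, a single pass uses at most 1/N of their life.
    """
    if passes_survived < 1:
        raise ValueError("need at least one survived pass")
    return 1.0 / passes_survived


def precipitation_rate(seeded_defects: int, defects_found: int) -> float:
    """Fraction of known (seeded) defects the screen precipitated and detected."""
    if seeded_defects < 1:
        raise ValueError("need at least one seeded defect")
    if not 0 <= defects_found <= seeded_defects:
        raise ValueError("defects_found must be between 0 and seeded_defects")
    return defects_found / seeded_defects


# Hypothetical results: good units survived 20 repeated passes,
# and the screen caught 9 of 10 seeded defects.
life_used = max_life_fraction_per_pass(20)   # at most 5% of life per pass
catch_rate = precipitation_rate(10, 9)       # 90% of seeded defects found
```

What counts as an acceptable life fraction or catch rate depends on the product and is a judgment call; the sketch only shows the arithmetic.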

Validate the Design after Design Verification and Transfer to Manufacturing

Using specimens from the actual manufacturing process, subject the product to the suite of required environmental and regulatory tests. These are “success tests,” as the objective is to pass these qualification tests. This assures the baseline product as transferred to manufacturing will meet customer needs.

Ongoing Reliability Test (ORT)

Many changes will enter the production process: at top-level assembly, at subassembly suppliers, in the components, and during transportation and storage. Minor changes accumulate and often reliability will invisibly slip away as daily operations focus on functionality and yield. If design margin is lost, manufacturing screening that was benign to the product before may start to consume enough fatigue life that end-of-life failures start to show up in warranty.

Periodic testing to failure using step-stress testing or HALT is a way to measure the design margin and find weaknesses that may have slipped in. ORT may also include periodic cycle testing to monitor wear-out phenomena. As earlier in discovery testing, ORT can be done on whole products or focused on key components and subassemblies. Mechanical, electrical, and software fixturing from earlier discovery testing may be re-used with appropriate improvements for routine convenience. Ongoing Reliability Tests should provide early warning well beyond specifications and should not degenerate into acceptance tests.
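
As a small illustration of using ORT for early warning rather than acceptance, the sketch below compares the failure level found in each ORT period against the margin measured during design. The function name, the 80% threshold, and the data are hypothetical; they are assumptions for the example, not recommendations.

```python
from typing import List, Tuple


def margin_erosion_alerts(baseline_failure_level: float,
                          spec_limit: float,
                          ort_failure_levels: List[float],
                          min_margin_fraction: float = 0.8) -> List[Tuple[int, float]]:
    """Return (period, margin) for each ORT period whose margin fell below
    the chosen fraction of the baseline design margin."""
    baseline_margin = baseline_failure_level - spec_limit
    alerts = []
    for period, level in enumerate(ort_failure_levels, start=1):
        margin = level - spec_limit
        if margin < min_margin_fraction * baseline_margin:
            alerts.append((period, margin))
    return alerts


# Hypothetical data: the design failed at 110 against a spec limit of 70,
# and four later ORT rounds found failures at steadily lower levels.
alerts = margin_erosion_alerts(110, 70, [110, 105, 100, 95])
# alerts == [(3, 30), (4, 25)]
```

Every unit in this example still fails well above the specification limit, so an acceptance test would pass all four periods; the trend in margin is what gives the early warning.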

—

That covers the elements – anything missing?

Filed Under: Articles, CRE Preparation Notes, Reliability Management Tagged With: Elements of a Reliability Program

About Fred Schenkelberg

I am the reliability expert at FMS Reliability, a reliability engineering and management consulting firm I founded in 2004. I left Hewlett Packard (HP)’s Reliability Team, where I helped create a culture of reliability across the corporation, to assist other organizations.

Comments

  1. R N A Kumar Kuncham says

    October 12, 2017 at 4:27 AM

    Please help me by letting know about what should be the sample size and interval for Ongoing reliability test based on production volume?

    • Fred Schenkelberg says

      October 12, 2017 at 8:15 AM

      Hi Kumar, check out the article on ORT https://lucas-accendo-site-speed.sprod01.rmkr.net/introduction-ongoing-reliability-testing/ which explains the many factors to consider when setting a sample size… it rarely has anything to do with production volume, btw. cheers, Fred

© 2025 FMS Reliability · Privacy Policy · Terms of Service · Cookies Policy