Accendo Reliability

Your Reliability Engineering Professional Development Site


by Larry George

Help! They Lost the Data


What can we do without reliability function estimates? FMEA? FTA? RCA? RCM? Argue about MTBFs and availability? Weibull? Keep a low profile? Run Admirals’ tests? Look for a new, well-funded project far from the deliverable stage? 

Ask for field data; there should be enough to estimate reliability and make reliability-based decisions, even if some data are missing. Field data might even be population data!

Data Saga

I wanted to estimate the reliability and failure rate functions, for reasonable ages t, for all the hematology business unit’s products and parts. Textbooks say, estimate reliability functions from random samples of ages-at-failures {T(1), T(2),…,T(r)} and survivors’ lives. As usual, I didn’t have ages-at-failures. 

I could estimate the failure rate function from ships and parts’ failure counts, which GAAP requires, using the methods in “How Can You Estimate Reliability Functions Without Life Data?”, https://lucas-accendo-site-speed.sprod01.rmkr.net/?s=tribus/, and https://sites.google.com/site/fieldreliability/home/prodspec/.

In February I submitted a request to the MIS department for the hematology business unit’s products’ and parts’ installed base and failure counts. The MIS department “prioritized” my request. 

In July Eric from MIS said he’d start working on my request. He asked if it would be OK to give the numbers failed at each age, in months 1-24. That’s grouped age-at-failure data. I thought, “Why work so hard? I could use the Kaplan-Meier nonparametric reliability estimate on ages-at-failures, at least up to age 24 months.” I thanked Eric, grateful for anything.

In October, Frank from MIS offered data by “PRODCODE”, “SER#”, “PN”, “DESC”, “TRANSDATE”, and “FLAGS”. In addition to failure counts at ages 1-24, Frank offered total failures at all ages greater than 24 months, grouped into the 25th month. “PRODCODE” and “TRANSDATE” indicated that many products had been in service longer than 24 months, with some parts’ failures, usually for the first time. (Automotive aftermarket stores save parts’ sales data for two years, without parts’ ages-at-failures. They’re renewals or replacement parts https://lucas-accendo-site-speed.sprod01.rmkr.net/renewal-process-estimation-without-life-data/#more-443057/.)

Reliability Estimation from Grouped Life Data is Easy 

The installed base and failure data for months 1-24 go into a “Nevada” table for grouped failure data, https://lucas-accendo-site-speed.sprod01.rmkr.net/nevada-charts-gather-data/. I used the Kaplan-Meier nonparametric reliability estimator for ages 1-24, and Greenwood’s formula for variances (covariances are approximately zero!). I could forecast replacement requirements, recommend parts’ stock levels, do diagnostics, and make credible reliability predictions for new products from similar, old parts’ reliability estimates, for all 2537 of the hematology business unit’s parts.
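The estimator described above can be sketched in a few lines of Python. This is a minimal sketch, not the spreadsheet I used: the function names, the layout of the Nevada table (one row per monthly ship cohort, one column per calendar month), and the numbers in the example are all hypothetical. It also assumes failed units leave the risk set (no renewals) and that the only censoring is that younger cohorts have not yet reached older ages.

```python
import math


def nevada_to_age_counts(ships, failures):
    """Collapse a Nevada table into at-risk and failure counts by age.

    ships[i]       -- units shipped in period i (hypothetical layout)
    failures[i][j] -- failures in calendar period j from cohort i
                      (entries with j < i are ignored)
    Returns (at_risk, failed); index a-1 holds the count for age a.
    """
    periods = len(ships)
    at_risk = [0] * periods
    failed = [0] * periods
    for i, shipped in enumerate(ships):
        alive = shipped
        for j in range(i, periods):
            age_index = j - i            # age 1 is index 0
            at_risk[age_index] += alive
            failed[age_index] += failures[i][j]
            alive -= failures[i][j]      # assumes no replacement
    return at_risk, failed


def kaplan_meier(at_risk, failed):
    """Kaplan-Meier reliability and Greenwood variance by age."""
    reliability, variance = [], []
    r = 1.0
    greenwood_sum = 0.0
    for n, d in zip(at_risk, failed):
        if n == 0:
            break                        # nothing observed at this age
        r *= 1.0 - d / n
        if n > d:
            greenwood_sum += d / (n * (n - d))
        # If d == n, Greenwood's term is undefined; r is 0, so the
        # variance is reported as 0 here.
        reliability.append(r)
        variance.append(r * r * greenwood_sum)
    return reliability, variance


# Hypothetical three-month example: 100 ships per month.
ships = [100, 100, 100]
failures = [[2, 3, 1],
            [0, 1, 2],
            [0, 0, 4]]
at_risk, failed = nevada_to_age_counts(ships, failures)
rel, var = kaplan_meier(at_risk, failed)
for age, (r, v) in enumerate(zip(rel, var), start=1):
    print(f"age {age}: R = {r:.4f}, std err = {math.sqrt(v):.4f}")
```

The actuarial failure rate at each age is simply failed/at_risk, so the same two lists give the failure rate function estimate for ages 1-24.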

What should I do with the failures grouped into month 25, from products or parts older than 24 months? Who cares? Me! Why? That’s additional information! I wanted to detect premature wearout, which indicates a possible design defect. (Failure rate function increases.) I also wanted to detect retirement so I wouldn’t get stuck with obsolescent spares. (Failure rate function decreases.)

To forecast replacement requirements, I needed to estimate or extrapolate the failure rate function for ages greater than 24 months, because some products and their parts have ages greater than 24 months. 

Failure Rate Function Extrapolations

When I have had no information about older failures, I have extrapolated failure rate function estimates by regression. But Frank told me how many failures occurred after age 24 months, just not when. Why not extrapolate by maximizing likelihood, 

PRODUCT[(1–R(t))^r(t) * R(t)^(n(t)–r(t)); t=1,2,…,oldest],

where R(t) is the reliability function, r(t) is the number of failures at age t, and n(t) is the installed base of age t, including ages beyond 24 months? That’s what the Kaplan-Meier estimator does, except that all I know is n(t), t=1,2,…,oldest, and r(?), the sum of all failures at ages greater than 24 months.

How to model failures older than 24 months? Constant failure rate? Linear? Other? The choice should depend on how the failure rate function looks before age 24 months, the number of failures older than 24 months, and your experience. Wait, you say! Couldn’t the failure counts older than 24 months change the earlier reliability estimates, ages 1-24? Nope, maximizing log likelihood maximizes a sum by maximizing each summand. I checked reliability estimates; no difference. That’s enough proofiness for me. 
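For readers who want a little more than proofiness, here is a sketch of the separability argument. It assumes the standard discrete-time form of the likelihood written in terms of conditional failure probabilities; the symbols h_t (probability of failure at age t given survival to age t-1), d_t (failures at age t), and n_t (number at risk at age t) are introduced here for illustration and are not the notation used elsewhere in this article.

```latex
\ln L(h_1, h_2, \ldots)
  = \sum_{t} \Bigl[ d_t \ln h_t + (n_t - d_t) \ln (1 - h_t) \Bigr],
\qquad
R(t) = \prod_{s \le t} (1 - h_s).

\frac{\partial \ln L}{\partial h_t}
  = \frac{d_t}{h_t} - \frac{n_t - d_t}{1 - h_t} = 0
\quad\Longrightarrow\quad
\hat{h}_t = \frac{d_t}{n_t}.
```

Each summand involves only its own h_t, so under this form any model or constraint imposed on ages beyond 24 months touches only the terms with t > 24 and leaves the estimates for ages 1-24 where they were.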

Constant Failure Rate? For older parts, make expected deaths older than 24 months equal the observed and reported sum r(?) of failures at ages greater than 24 months, by choice of a constant (actuarial) failure rate “a(25)” estimate = failures per month/number exposed; i.e.,

a(25) = r(?)/SUM[(t–24)*(N(t)–a(25)*E[N(t)]); t=25, 26,…,oldest],

where N(t) is the ships in month t=25,26,…,oldest, and E[N(t)] is the average ships per month. Expected failures are

 SUM[N(s)*p(s)]*PRODUCT[(1–SUM[p(t)])/R(24)],

where the sum and the product run from s and t = 25 to the age of the oldest product, N(s) is the number shipped s months ago, p(s) is the probability age at failure is s months, and R(24) = P[life > 24] = 1 – SUM[p(t); t = 0, 1,…,24]. Set Expected failures equal to observed with a(t) = a(25) for all ages t > 24, where a(t) = p(t)/R(t), the conditional probability of failure in the next month given survival to age t.

Table 1 Example: Constant failure rate for parts in a product 32 months old: The E[deaths] column is the actuarial failure rate a(25) times the numbers of survivors, and the survivors column is Ships N(t) minus E[failures] r(t). The last column a(25) is r(?) divided by the sum of the t*sum[N(t)] column.

Table 1
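The idea of matching expected to observed failures beyond 24 months can be illustrated with a short Python sketch. It is a simplified stand-in for the spreadsheet calculation, under loudly stated assumptions: each cohort shipped s > 24 months ago has N(s)*R(24) expected survivors at age 24, each survivor then fails with a constant monthly probability, failed units are not replaced, and the rate is found by bisection. The function names and the numbers are hypothetical.

```python
def expected_failures_after(ships_by_age, r24, rate, cutoff=24):
    """Expected failures at ages beyond `cutoff` for a constant rate.

    ships_by_age -- dict {age in months: units shipped that long ago},
                    ages greater than cutoff only (hypothetical layout)
    r24          -- reliability estimate at the cutoff age, R(24)
    rate         -- constant conditional failure probability per month
    """
    total = 0.0
    for age, shipped in ships_by_age.items():
        months_exposed = age - cutoff
        # Assumed model: survivors to the cutoff fail at a constant
        # monthly rate, with no replacement of failed units.
        total += shipped * r24 * (1.0 - (1.0 - rate) ** months_exposed)
    return total


def solve_constant_rate(ships_by_age, r24, observed, cutoff=24, tol=1e-12):
    """Bisection for the rate that makes expected equal observed."""
    low, high = 0.0, 1.0
    if observed > expected_failures_after(ships_by_age, r24, high, cutoff):
        raise ValueError("observed failures exceed expected survivors")
    for _ in range(200):
        mid = (low + high) / 2.0
        if expected_failures_after(ships_by_age, r24, mid, cutoff) < observed:
            low = mid                    # rate too small
        else:
            high = mid                   # rate large enough
        if high - low < tol:
            break
    return (low + high) / 2.0


# Hypothetical example: two old cohorts, 3 failures reported beyond 24 months.
ships_by_age = {25: 100, 26: 100}
rate = solve_constant_rate(ships_by_age, r24=0.9, observed=3.0)
print(f"constant monthly failure rate beyond 24 months: {rate:.5f}")
```

Bisection works here because expected failures increase monotonically with the rate. A linear or other failure rate model would replace the constant-rate term with its own survival calculation.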

Other Failure Rate Models: Maximum likelihood chooses fractional failures at ages beyond 24 months, constrained to sum to the reported failure count after 24 months r(?), to make nonparametric estimates of the reliability and failure rate functions for ages up to the oldest unit in the installed base.

I used Excel Solver to maximize likelihood; Excel blew up for the “Unconstrained” alternative, so I manually entered 1 failure in month 30 or “Limited” the failure rate to prevent a #NUM! error. The maximum likelihood (lnL in Table 2) was achieved by the “Unconstrained” alternative with one failure in month 30. The failure rates indicate there was wearout, because the “Limited” and “Linear” alternatives also showed increasing failure rates.

Table 2 Example. Data are from some US postal service machines. There was 1 failure in months 25-30. Alternative failure rate models are: unconstrained, constant, limited, and linear. The alternatives postulate fractional failures at ages 25-30, and Solver maximizes log-likelihood (lnL) for reliability and failure rate function estimates. The constrained maximum likelihood failure rate estimates are in the last four columns.

Table 2

Free offer

These examples are not the only problem I’ve seen with grouped data. A sterile glove company’s [Terumo] customers batch failures and send them back whenever they feel like it. Imagine grouped failure counts with reporting delays so that the most recent counts are obviously under-reported [ReliaSoft]. Imagine sell-through time, the time from reported sale until first use [hematology business unit].   

If you have a problem with grouped failure counts, send pstlarry@yahoo.com your installed base by age and grouped ages at replacements, and I’ll send back the Kaplan-Meier estimate of reliability function, Greenwood’s estimator of its variance, estimate of the failure rate function, and alternative maximum likelihood estimators for the older, grouped data. 


About Larry George

UCLA engineer and MBA, UC Berkeley Ph.D. in Industrial Engineering and Operations Research with minor in statistics. I taught for 11+ years, worked for Lawrence Livermore Lab for 11 years, and have worked in the real world solving problems ever since for anyone who asks. Employed by or contracted to Apple Computer, Applied Materials, Abbott Diagnostics, EPRI, Triad Systems (now http://www.epicor.com), and many others. Now working on survival analysis, epidemiology, and their applications: epidemics, randomized clinical trials, risk-based inspection, and DoE for risk equity.

© 2025 FMS Reliability