by Larry George

Uncertainty in Population Estimates?


Dick Mensing said, “Larry, you can’t give an estimate without some measure of its uncertainty!” For seismic risk analysis of nuclear power plants, we had plenty of multivariate earthquake stress data but paltry strength-at-failure data on safety-system components. So we surveyed “experts” for their opinions on the parameters of strength-at-failure distributions and on the correlations between pairs of components’ strengths at failure.

If you make estimates from population field reliability data, do the estimates have uncertainty? If all the data were population lifetimes or ages at failure, estimates would have no sampling uncertainty, only perhaps measurement error. Estimates from population field reliability data have uncertainty because typically some population members haven’t failed. If field reliability data are from renewal or replacement processes, some replacements haven’t failed and earlier renewal or replacement counts may be unknown. Regardless, estimates from population data are better than estimates from a sample, even if the population data are only ships and returns counts!

Walter Shewhart’s Rule #1 

“Original data should be presented in a way that will preserve the evidence in the original data for all the predictions assumed to be useful.” Nonparametric field reliability estimates preserve all the evidence in field reliability data: lifetimes, censored or not, or ships (installed base, production, sales, etc.) and returns (complaints, repairs, replacement, failures, even spare parts’ sales) counts.

Censored lifetime data are statistically sufficient to make nonparametric maximum-likelihood reliability or failure rate function estimates (Kaplan-Meier, Nelson-Aalen). Periodic ships and returns counts are also statistically sufficient [George and Agrawal, Oscarsson and Hallberg]. Ships and returns counts are population data, and they are contained in data required by GAAP (Generally Accepted Accounting Principles). You may have to work to convert periodic revenue and price data into ships and returns counts (ships = sales revenue/price; returns could be spare parts’ sales revenue/price). Use gozinto theory and bills-of-materials to convert product installed base by age into parts’ installed base by age [https://lucas-accendo-site-speed.sprod01.rmkr.net/gozinto-theory-parts-installed-base/#more-417514].
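For illustration, here is a minimal Python sketch of that gozinto conversion. The bill-of-materials matrix and the product installed base counts are made-up numbers, not from the article:

```python
import numpy as np

# Hypothetical gozinto (bill-of-materials) matrix: how many of each part go into one product.
# Rows = products, columns = parts.
gozinto = np.array([
    [2, 1, 0],   # product A uses 2 of part 1 and 1 of part 2
    [1, 0, 3],   # product B uses 1 of part 1 and 3 of part 3
])

# Hypothetical product installed base by age (units still in service).
# Rows = products, columns = age in months.
product_installed_base = np.array([
    [1623, 3723, 1319],
    [ 900, 1100,  800],
])

# Parts' installed base by age = gozinto-transpose times product installed base by age.
parts_installed_base = gozinto.T @ product_installed_base
print(parts_installed_base)   # rows = parts, columns = age in months
```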

Do the best you can with the data you have. Fred Schenkelberg wrote, “I said it before, only gather the data you need,” [https://insights.cermacademy.com/398-environmental-and-use-manual-fred-schenkelberg/]. 

Unknown Knowns? Did You Check Reliability Model Assumptions?

Did you assume a constant failure rate and exponential reliability, a Weibull or normal distribution, statistical independence, stationarity? What if you assume a reliability function family and the data don’t fit it? Are your assumptions justified by physics or other structural evidence? Nonparametric estimates require no distribution assumptions and preserve all the evidence in field reliability data.
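If you do fit a parametric model, at least check it against the data. A minimal Python sketch, assuming complete (uncensored) ages at failure and using scipy; the failure ages are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical complete ages at failure (months)
ages = np.array([3.1, 5.7, 6.2, 8.4, 9.9, 12.5, 13.0, 15.8, 21.4, 24.0])

# Maximum likelihood Weibull fit with the location parameter fixed at zero
shape, loc, scale = stats.weibull_min.fit(ages, floc=0)

# Kolmogorov-Smirnov check of the fitted model; the p-value is optimistic
# because the parameters were estimated from the same data
ks = stats.kstest(ages, stats.weibull_min(shape, loc, scale).cdf)
print(shape, scale, ks.pvalue)
```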

Known Unknowns? Nonparametric Estimates Require Fewer Assumptions!

Nonparametric estimators from grouped, censored lifetime data require no unwarranted assumptions except independent and identically distributed ages at failure or censoring times (Kaplan-Meier, Nelson-Aalen). Those nonparametric estimators have known variance whether from sample or population data (Greenwood).

Nonparametric estimators from population ships and returns counts require an assumption: that ships counts are a nonstationary Poisson process [George]. The model for ships and returns counts is an M(t)/G/infinity self-service system. M(t) represents ships as nonstationary Poisson inputs, and G stands for the service time distribution (1 − the reliability function). The output of an M(t)/G/infinity system is also a Poisson process [Mirasol]. That nonstationary Poisson input assumption of independent, Poisson increments is untestable; it’s impossible to test hypotheses with samples of size 1 (Poisson period ships). Nonparametric reliability or failure rate function estimates from population periodic ships and returns counts also have known variance and covariance.
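A minimal Python sketch of how that model turns period ships counts and a candidate reliability function into Poisson-distributed expected returns. The age-indexing convention and the candidate R values are my assumptions, not the article’s:

```python
import numpy as np

# Period ships from Table 1 and hypothetical candidate reliability values R(1)..R(5)
ships = np.array([1623, 3723, 1319, 3600, 3298], dtype=float)
R = np.array([0.9997, 0.9965, 0.9930, 0.9890, 0.9850])

def expected_returns(ships, R):
    """E[returns in period t] = SUM over cohorts s <= t of ships[s] * p(t - s),
    where p(a) = R(a) - R(a+1) with R(0) = 1 is the probability of failing in
    the (a+1)-th period of service; the returns counts are Poisson with these means."""
    p = -np.diff(np.concatenate(([1.0], R)))     # p(0), p(1), ..., p(len(R)-1)
    T = len(ships)
    mu = np.zeros(T)
    for t in range(T):
        for s in range(t + 1):
            mu[t] += ships[s] * p[t - s]
    return mu

mu = expected_returns(ships, R)

# Poisson log-likelihood of the observed returns counts (bottom row of Table 1),
# up to an additive constant; maximizing it over R gives the nonparametric MLE.
observed = np.array([0, 1, 5, 12, 23], dtype=float)
print(mu, np.sum(observed * np.log(mu) - mu))
```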

Which would you prefer: lifetime data (grouped as in Table 1) or population ships and returns counts from data required by GAAP? It costs $$$ and time to track individual units by name or serial number to get lifetime data, even a sample. 

Table 1. [http://www.weibull.com/hotwire/issue119/relbasics119.htm] Nevada table of monthly ships (2nd column) and grouped returns counts for input to the Kaplan-Meier reliability function estimator. (It looks like Nevada turned on its side.)

Month    Ships   Mar-10   Apr-10   May-10   Jun-10   Jul-10
Mar-10   1623    0        1        3        5        7
Apr-10   3723             0        2        7        11
May-10   1319                      0        0        3
Jun-10   3600                               0        2
Jul-10   3298                                        0
Etc.
Returns          0        1        5        12       23

The ships column and the returns counts in the bottom row of Table 1 contain statistically sufficient data to make nonparametric reliability function estimates. Figure 1 shows the Kaplan-Meier maximum likelihood estimate from the grouped returns counts and the maximum likelihood reliability estimate from ships and returns counts. They don’t agree, because there is more information in the grouped returns counts in the body of Table 1 than in the bottom row. Is the extra information worth the cost of identifying every failure, survivor, or return by serial number, month of shipment, and month of failure?

Figure 1. Nonparametric reliability function estimates: Kaplan-Meier from the grouped returns counts in the table body (KMAll) and the maximum likelihood (MLEAll) from ships and returns counts.
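A minimal Python sketch of the Kaplan-Meier calculation from grouped Nevada-table counts. The risk-set and age conventions are my assumptions, so it need not reproduce the article’s numbers exactly:

```python
# Ships per cohort and returns[c][a] = failures from cohort c at age a,
# taken from the visible part of Table 1 (age 0 = the shipment month)
ships = [1623, 3723, 1319, 3600, 3298]
returns = [
    [0, 1, 3, 5, 7],
    [0, 2, 7, 11],
    [0, 0, 3],
    [0, 2],
    [0],
]

def kaplan_meier(ships, returns):
    """Kaplan-Meier reliability from grouped (Nevada-table) failure counts;
    cohorts not yet observed at an age are treated as right-censored."""
    max_age = max(len(r) for r in returns)
    R, surv = [], 1.0
    for a in range(max_age):
        at_risk = failed = 0
        for shipped, r in zip(ships, returns):
            if a < len(r):                       # cohort observed at this age
                at_risk += shipped - sum(r[:a])  # shipped minus earlier failures
                failed += r[a]
        surv *= 1.0 - failed / at_risk
        R.append(surv)
    return R

print(kaplan_meier(ships, returns))
```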

Estimator variance (or standard deviation) is a measure of precision. Greenwood’s variance estimates are for the Kaplan-Meier reliability estimator [Wellner (Greenwood)]. They are equivalent to the Cramer-Rao lower bounds on the estimator variance-covariance matrix: the inverse of the Fisher information matrix, which is minus the expected value of the second-derivative matrix of the log-likelihood function [https://en.wikipedia.org/wiki/Cram%C3%A9r%E2%80%93Rao_bound]. Maximum likelihood estimators achieve the Cramer-Rao lower bound asymptotically, under some conditions.
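For reference, a minimal sketch of Greenwood’s variance estimate; the failure and risk-set counts in the example are hypothetical:

```python
import numpy as np

def greenwood_variance(d, n):
    """Greenwood's estimate Var[R_hat(t)] ~= R_hat(t)^2 * SUM over ages j <= t
    of d(j) / (n(j) * (n(j) - d(j))), where d(j) = failures at age j and
    n(j) = units at risk at age j; returns the variance at each age."""
    d = np.asarray(d, dtype=float)
    n = np.asarray(n, dtype=float)
    R = np.cumprod(1.0 - d / n)                      # Kaplan-Meier reliability
    return R**2 * np.cumsum(d / (n * (n - d)))

# Hypothetical counts at ages 1..4
print(greenwood_variance([0, 5, 12, 20], [13563, 10265, 6542, 4921]))
```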

The Cramer-Rao formula is used to estimate lower bounds on the variance-covariance matrix of nonparametric reliability estimates from ships and returns counts in Table 2. Please refer to “Random-Tandem Queues…” chapter 5 for more comparisons [George 2019]. I also need to estimate the variance-covariance matrix of actuarial failure rates, because the variance of an actuarial forecast, SUM[a(s)*n(t-s), s=1,2,…,t], depends on the covariances of the actuarial rate estimates: VAR[actuarial forecast] = SUM[VAR[a(s)]*n(t-s)^2, s=1,2,…,t] + 2*SUM[COVAR[a(s),a(u)]*n(t-s)*n(t-u), s<u=1,2,…,t]. The Cramer-Rao covariances of the Kaplan-Meier reliability function estimates are asymptotically zero, but not the covariances of actuarial rate estimates from grouped life data (Kaplan-Meier) or from ships and returns counts, because the actuarial rates are a(s)=p(s)/(1-R(s)), s=1,2,…,k, where p(s) and R(s) are estimates of the probability density and reliability functions.
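That forecast variance is just a quadratic form in the installed-base counts; a minimal sketch, with a hypothetical covariance matrix of the actuarial rate estimates:

```python
import numpy as np

def forecast_variance(cov_a, n_by_age):
    """VAR[SUM a(s)*n(t-s)] = n' * Cov(a) * n, where cov_a is the
    variance-covariance matrix of the actuarial rate estimates a(1..t)
    and n_by_age lists n(t-1), n(t-2), ..., n(0) matched to ages 1..t."""
    n = np.asarray(n_by_age, dtype=float)
    return float(n @ np.asarray(cov_a, dtype=float) @ n)

# Hypothetical covariance matrix for a(1..3) and installed base by age
cov_a = [[1e-8, 2e-9, 0.0], [2e-9, 2e-8, 3e-9], [0.0, 3e-9, 4e-8]]
print(forecast_variance(cov_a, [3298, 3600, 1319]))
```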

Table 2 contains the reliability estimates in figure 1 and their standard deviations. For ships and returns counts, I assumed Poisson ships at a constant rate equal to the average of ships in the first three periods. (That’s also the maximum likelihood estimator.) I will compute more of the Cramer-Rao Fisher matrixes if you want. (Mathematica computes the Fisher information matrix for probability estimates g(1), g(2), and g(3), and for reliability estimates R(1)=1-g(1), R(2)=1-g(1)-g(2), and R(3)=1-g(1)-g(2)-g(3). So the reliability estimates’ variances are Var(R(1))=Var(g(1)), Var(R(2))=Var(g(1))+Var(g(2))+2Cov(g(1),g(2)), etc. The covariances were small.)

The reliability estimates from grouped life data (Kaplan-Meier) and the maximum likelihood estimates from ships and returns counts are pretty close (figure 1). The (Greenwood) standard deviations of the Kaplan-Meier estimators are less than the standard deviations of the maximum likelihood estimators from ships and returns counts, because grouped lifetime data contain more information than ships and returns counts. Is the additional information worth its cost? What if you don’t have grouped lifetime data in a Nevada table? GAAP requires information containing population ships and returns counts.

Table 2. Alternative Kaplan-Meier reliability estimates and their standard deviations from ships and grouped lifetimes vs. maximum likelihood reliability estimates from ships and returns counts. Fisher matrix standard deviations beyond age 3 are TBD.

Age, months   Kaplan-Meier reliability   Greenwood std. deviation   Max. likelihood reliability   Fisher matrix std. deviation
1             1                          0                          1                             0.000266
2             0.9996                     0.00012                    0.9997                        0.000376
3             0.9977                     0.00033                    0.9965                        0.001592
4             0.9946                     0.00054                    0.9965                        TBD
5             0.9902                     0.00076                    0.9930                        TBD
6             0.9847                     0.00099                    0.9780                        TBD
7             0.9779                     0.00128                    0.9780                        TBD
8             0.9701                     0.00166                    0.9780                        TBD
9             0.9611                     0.00209                    0.9780                        TBD
Entropy       0.2473                                                0.1748

Some people compare the entropy (Shannon information) in the alternative estimates [Bjorkman, Du et al.]. Entropy measures the information contained in a reliability function. Myron Tribus [https://en.wikipedia.org/wiki/Myron_Tribus/] advocated maximum-entropy reliability estimation after teaching thermodynamics and entropy to UCLA engineers like me. “The principle of MaxEnt states that the most probable distribution function is the one that maximizes the entropy given testable information,” [Du et al.]. (I believe the authors mean “testable information” to represent assumptions about the failure rate functions such as constant, bathtub, increasing, etc.) Maximum entropy is “a methodology to determine test value by estimating the amount of uncertainty reduction a particular test is expected to provide using Shannon’s Information Entropy as a basis for the estimate” [Bjorkman]. Entropy also measures the information in alternative data: grouped returns counts (Nevada table), sample or population, vs. population ships and returns counts.

The entropies at the bottom of table 2 are the sums of −g(j)ln(g(j)), where g(j) is the discrete probability density function corresponding to the Kaplan-Meier or maximum likelihood reliability function estimates in table 2. There is more entropy (information) in the Kaplan-Meier estimate than in the maximum likelihood estimate. How much is the additional information worth, and how does the additional information compare with variance reductions?
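A minimal sketch of that entropy calculation. Whether the probability remaining at the oldest age gets its own cell is a convention the article doesn’t state, so this need not reproduce Table 2’s entropies exactly:

```python
import numpy as np

def discrete_entropy(R):
    """Entropy -SUM g(j)*ln(g(j)) of the discrete density g(j) = R(j-1) - R(j)
    implied by a reliability estimate, with R(0) = 1; zero cells contribute nothing."""
    R = np.concatenate(([1.0], np.asarray(R, dtype=float)))
    g = -np.diff(R)
    g = g[g > 0]
    return float(-np.sum(g * np.log(g)))

# e.g., the Kaplan-Meier column of Table 2
print(discrete_entropy([1, 0.9996, 0.9977, 0.9946, 0.9902, 0.9847, 0.9779, 0.9701, 0.9611]))
```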

Empirical Comparison of Estimators and Their Uncertainty

Consider comparing estimators using broom chart input data. Broom chart input data are the data from all periods as they accumulated: first period, first two periods, …, all but the latest period, all periods. Comparing broom chart estimators’ standard deviations and entropies shows the effects of change with time and shows whether the reliabilities of later cohorts differ from earlier ones. The standard deviations from broom chart input data sets are not the same as the (square roots of) Cramer-Rao bounds, but they show the evolution of information as it accumulated. Figures 2 and 3 show the reliability function broom charts from the Nevada table of grouped failure counts and from ships and returns counts.

Figure 2. Broom chart of Kaplan-Meier reliability estimates from successively earlier data sets; i.e. smaller Nevada tables with fewer time periods.
Figure 3. Broom chart of nonparametric maximum likelihood reliability estimates from successively earlier ships and returns counts data sets. Longer lines are from larger, more complete data sets, including more recent data.
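A minimal sketch of how the broom-chart data sets behind Figures 2 and 3 are formed from the Table 1 counts; re-estimate reliability from each truncated data set with any of the estimators sketched above and overlay the curves:

```python
# Nevada-table counts from Table 1: ships per cohort and returns[c][a] = failures
# from cohort c at age a (age 0 = shipment month)
ships = [1623, 3723, 1319, 3600, 3298]
returns = [[0, 1, 3, 5, 7], [0, 2, 7, 11], [0, 0, 3], [0, 2], [0]]

# Broom-chart data set for the first k periods: cohorts 1..k, with cohort c
# (0-indexed) keeping only the returns observed in its first k - c periods
for k in range(2, len(ships) + 1):
    sub_ships = ships[:k]
    sub_returns = [returns[c][:k - c] for c in range(k)]
    print(k, sub_ships, sub_returns)   # feed each (sub_ships, sub_returns) to an estimator
```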

 

Table 3. Standard deviations of reliability estimates from broom chart data sets. Estimates disagree, with more entropy (information) in the maximum likelihood estimates for some of the older ages. 

Age, months   Kaplan-Meier   Max. likelihood
1             0              0
2             7.7E-05        1.2E-16
3             6E-05          0.00012
4             3E-05          1.2E-16
5             8.4E-05        0
6             0.00011        0.00354
7             0.00012        0.00634
8             0.00504        0.00015
9             0.00836        0

Table 4. Broom chart reliability entropy estimates from broom chart data sets. Entropy decreases as data sets become smaller and earlier. The MLE entropy estimates don’t decrease as much as the K-M entropy estimates.

              All      All-last   All-2    Etc.                                 First 2
K-M entropy   0.2954   0.2869     0.2246   0.1415   0.10134   0.0665   0.0384   0.0182
MLE entropy   0.1748   0.10334    0.1042   0.1546   0.1231    0.0404   0.0205   0.0221

Uncertainty Quantification Includes Extrapolation

Dick Mensing suggested that I make nonparametric reliability and actuarial failure rate estimates by least squares from ships and returns counts; i.e., minimize the sums of squared differences between observed returns and actuarial hindcasts of returns, SUM[a(s)*n(t-s), s=1,2,…,t], where a(s) is the actuarial failure rate at age s and n(t-s) is the installed base of age t-s. See “Random-Tandem Queues…” for a comparison of maximum likelihood and least squares estimators without life data [George 2019]. Least-squares estimates have asymptotic properties similar to Cramer-Rao bounds on variance: Gauss-Markov theorem, martingale central limit theorem, etc.
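A minimal Python sketch of that least-squares idea for first failures only (no renewals), using the Table 1 ships and returns counts and nonnegative least squares in place of whatever constraints the article’s workbook uses:

```python
import numpy as np
from scipy.optimize import nnls

# Periodic ships and returns counts from Table 1 (no life data)
ships   = np.array([1623, 3723, 1319, 3600, 3298], dtype=float)
returns = np.array([   0,    1,    5,   12,   23], dtype=float)

# Hindcast of returns in period t is SUM[a(s)*n(t-s)]: row t of the design
# matrix holds the installed-base counts by age, newest cohort first
T = len(ships)
X = np.zeros((T, T))
for t in range(T):
    for age in range(t + 1):
        X[t, age] = ships[t - age]

# Nonnegative least squares for the actuarial failure rates a(1..T)
a_hat, residual_norm = nnls(X, returns)
print(a_hat, residual_norm)
```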

So I made a nonparametric least squares reliability estimate from the same ships and returns counts data as in table 1. The reliability estimate (R(t) lse) is close to the Kaplan-Meier estimate. Its entropy, 0.258, is more than the 0.247 entropy of the Kaplan-Meier estimate and more than the entropy of the maximum likelihood estimate from ships and returns counts. Figure 4 shows all three estimates. I use least-squares estimators for renewal processes from periodic ships and replacements, without knowledge of prior replacement counts [https://lucas-accendo-site-speed.sprod01.rmkr.net/renewal-process-estimation-without-life-data/, George 2002].

Figure 4. Nonparametric reliability function estimates.

I use regression to extrapolate least-squares actuarial rate estimates, because regression also uses least squares. The uncertainties in regression extrapolations are known: error bands, tolerance limits, prediction limits [https://lucas-accendo-site-speed.sprod01.rmkr.net/podcast/sor/sor-807-confidence-and-tolerance-intervals/#more-500434].

Uncertainty in Extrapolations

What if you need to extrapolate estimates to ages beyond the oldest failure in your sample or population? You need to extrapolate actuarial failure rates to quantify the uncertainty in actuarial failure forecasts, for stocking spares and planning maintenance to meet service level or backorder requirements and to plan for retirement or obsolescence.

A few years ago, there were publications about estimating actuarial rates for really old people [https://slidetodoc.com/mortality-trajectories-at-very-old-ages-actuarial-implications-2/]. It’s not easy to find samples, because really old people are scarce. Population ships and returns counts help. When I worked in the automotive aftermarket, parts’ dealers were delighted to have age-based parts’ forecasts for old Volkswagens still in circulation; they were based on population reliability statistics from vehicle installed base by age and monthly parts’ sales.

Reliability and failure rate function estimators quantify uncertainty up to the age(s) of the oldest failures in the sample or population. What is the uncertainty in extrapolation? What is the uncertainty in functions of the estimators, such as actuarial forecasts SUM[a(s)*n(t-s), s=1,2,…,t+1, t+2,…], beyond the age of the oldest failure?

Extrapolate actuarial rates with Excel LINEST() regression and the FORECAST.ETS.CONFINT() confidence interval. LINEST() produces a measure, “R-squared,” “The coefficient of determination. Compares estimated and actual y-values.” It is the squared correlation between observed and predicted actuarial rates. I computed it for all subsets of estimated actuarial rates (all, all minus the first, …, most recent), and the largest R-squared was for all. FORECAST.ETS.CONFINT() gave a 95% upper confidence limit (UCL) for the one-period extrapolation a(11), and the actuarial forecast UCL on returns in period 11 was 178, assuming ships of 3766 in period 11.

Table 5. Least-squares estimates and 95% UCL extrapolation of actuarial rate a(11) and forecast of expected returns in future period 11

Period   Ships   Returns   a(s)      E[Returns]
1        1623    0         8.1E-05   0.13
2        3723    1         0.00027   0.74
3        1319    5         0.0021    4.52
4        3600    12        0.00314   13.57
5        3298    23        0.00394   22.09
6        1333    38        0.00625   37.51
7        1584    55        0.00533   55.8367
8        4508    72        0.00943   71.52
9        4463    96        0.00575   96.05
10       3176    126       0.0192    125.97
11       3766    ?         0.00619   180
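Outside Excel, the same one-period-ahead extrapolation can be sketched with ordinary least squares. This uses the a(s) column of Table 5 and a 95% prediction interval from statsmodels rather than the exponential-smoothing interval FORECAST.ETS.CONFINT() computes, so the numbers will differ:

```python
import numpy as np
import statsmodels.api as sm

# Least-squares actuarial rate estimates a(1)..a(10) from Table 5, regressed on age
ages = np.arange(1, 11)
a = np.array([8.1e-05, 0.00027, 0.0021, 0.00314, 0.00394,
              0.00625, 0.00533, 0.00943, 0.00575, 0.0192])

fit = sm.OLS(a, sm.add_constant(ages)).fit()
print(fit.rsquared)                      # analogous to LINEST()'s R-squared

# Extrapolate a(11) with a 95% prediction interval
x_new = np.column_stack([np.ones(1), [11.0]])
frame = fit.get_prediction(x_new).summary_frame(alpha=0.05)
print(frame[["mean", "obs_ci_lower", "obs_ci_upper"]])
```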

If you want to forecast returns more than one period ahead, then you need simultaneous confidence bands on the forecast actuarial rates [Hall and Wellner] and future ships. SAS PROC LIFETEST computes confidence bands on Kaplan-Meier reliability function estimates but not on the maximum likelihood estimates from population ships and returns counts, because SAS doesn’t know how to estimate reliability from ships and returns counts. 

Recommendations?

Preserve all relevant information in field reliability data. Use it. Test assumptions. Quantify uncertainty as well as you can. There are lots of ways to express uncertainty. I prefer data-based and physics-based statistical uncertainty. However, some field reliability uncertainty is human-based, so take human factors into account, quantitatively if possible. If you use subjective opinion surveys, use Dempster-Shafer theory [https://en.wikipedia.org/wiki/Dempster%E2%80%93Shafer_theory] or expert questionnaires, and salt the surveys with known answers [George and Mensing].

If you would like the workbook used for preparing this article or if you want the Mathematica workbooks for computing the Fisher information matrixes and standard deviations for maximum likelihood estimators and actuarial forecasts from ships and returns counts, ask pstlarry@yahoo.com. 

References

Eileen A. Bjorkman “Test and Evaluation Resource Allocation Using Uncertainty Reduction as a Measure of Test Value,” PhD thesis, George Washington Univ., August 2012

Yi-Mu Du, Yu-Han Ma, Yuan-Fa Wei, and Xuefei Guan “Maximum Entropy Approach to Reliability,” Physical Review E101(1), DOI:10.1103/PhysRevE.101.012106, January 2020

L. L. George and A. Agrawal “Estimation of a Hidden Service Time Distribution for an M/G/Infinity Service System,” Nav. Res. Log. Quart., vol. 20, no. 3, pp. 549-555, 1973

L. L. George “Renewal Estimation Without Renewal Counts,” INFORMS conference, San Jose, CA, Nov. 2002

L. L. George “Random-Tandem Queues and Reliability Estimation, Without Life Data,” https://drive.google.com/file/d/1Ua8qJ5XeTMbsXKQ2CskGL-MMgslO9OWu/view, Dec. 2019

L. L. George and Richard W. Mensing “Using Subjective Percentiles and Test Data for Estimating Fragility Functions,” Proceedings, Department of Energy Statistical Symposium, Berkeley, CA, Oct. 1980

W. J. Hall and Jon A. Wellner “Confidence Bands for a Survival Curve from Censored Data,” Biometrika, vol. 67, no. 1, pp. 133-143, 1980, https://doi.org/10.2307/2335326

Noel M. Mirasol, “The Output of an M/G/Infinity Queuing System Is Poisson,” Operations Research, vol. 11, no. 2, pp. 282-284, Mar.-Apr. 1963

Patric Oscarsson and Örjan Hallberg “ERIVIEW 2000 – A Tool for the Analysis of Field Statistics,” Ericsson Telecom AB, http://home.swipnet.se/~w-78067/ERI2000.PDF, 2000

Bernard Ostle and Richard W. Mensing, Statistics in Research, Iowa State Univ. Press, 1975

Jon A. Wellner “Notes on Greenwood’s Variance Estimator for the Kaplan-Meier Estimator,” https://sites.stat.washington.edu/jaw/COURSES/580s/582/HO/Greenwood.pdf/, 2010


About Larry George

UCLA engineer and MBA, UC Berkeley Ph.D. in Industrial Engineering and Operations Research with minor in statistics. I taught for 11+ years, worked for Lawrence Livermore Lab for 11 years, and have worked in the real world solving problems ever since for anyone who asks. Employed by or contracted to Apple Computer, Applied Materials, Abbott Diagnostics, EPRI, Triad Systems (now http://www.epicor.com), and many others. Now working on survival analysis, epidemiology, and their applications: epidemics, randomized clinical trials, risk-based inspection, and DoE for risk equity.


