Reliability Prediction Models: Use and Evaluation
Part II: Reliability Prediction Standards
Note: This is the second part of a three-part article. It explores the various reliability
prediction standards that are available, including MIL-HDBK-217, Telcordia, NSWC-98/LE1, PRISM, RDF 2000,
CNET 93, HRD5, and GJB/z 299B. Part I
identifies the major factors in reliability prediction models that contribute to predicting component failure.
Part III provides guidelines for
making a sound judgment when deciding which standard to apply to a particular analysis.
Reliability Prediction Standards and Model Examples
MIL-HDBK-217
The earliest standard to appear was MIL-HDBK-217, Reliability Prediction of Electronic
Equipment, which was developed by the United States Department of Defense (DOD) in the 1960s. Since
then, MIL-HDBK-217 has been updated several times, with the most recent being Revision F Notice 2, released
in February 1995. Although the DOD has discontinued updates of MIL-HDBK-217, this standard is still widely
used in military and commercial applications throughout the world.
MIL-HDBK-217 Parts Count
MIL-HDBK-217 parts count prediction defines the overall equipment failure rate as:

$$\lambda_{EQUIP} = \sum_{i=1}^{n} N_i \left(\lambda_g \pi_Q\right)_i$$

where:

$\lambda_{EQUIP}$ = Equipment failure rate
$n$ = Number of generic part categories
$N_i$ = Quantity of the $i$th generic part
$\lambda_g$ = Failure rate of the $i$th generic part (table lookup)
$\pi_Q$ = Quality factor of the $i$th generic part
If the equipment consists of parts operating in more than one environment, this equation is
applied to each portion of the equipment in a distinct environment. The overall equipment failure rate is
then obtained by summing the $\lambda_{EQUIP}$ values for each environment.
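As a rough illustration of the parts count calculation, the following Python sketch sums quantity × generic failure rate × quality factor over the part categories in each environment and then adds the per-environment totals. The function names and all numeric values are hypothetical; real generic failure rates and quality factors come from the handbook tables.

```python
def parts_count_failure_rate(parts):
    """Return the sum of N_i * lambda_g_i * pi_Q_i for one environment.

    parts: iterable of (quantity, generic_failure_rate, quality_factor),
    with failure rates in failures per million hours.
    """
    return sum(n * lam_g * pi_q for n, lam_g, pi_q in parts)


def equipment_failure_rate(parts_by_environment):
    """Apply the parts count sum per environment and add the results."""
    return sum(parts_count_failure_rate(parts)
               for parts in parts_by_environment.values())


# Hypothetical values for illustration; not taken from the handbook tables.
example = {
    "ground_benign": [(10, 0.002, 1.0), (4, 0.010, 2.0)],
    "ground_mobile": [(2, 0.050, 1.0)],
}
total = equipment_failure_rate(example)
print(f"Equipment failure rate: {total:.3f} failures per million hours")
```

Keeping each environment as its own list mirrors the rule that the equation is applied separately to each portion of the equipment.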
MIL-HDBK-217 Parts Stress
For a MIL-HDBK-217 parts stress analysis, the models are much more detailed and varied across
part types. For example, the model for microcircuits, gate-logic arrays, and microprocessors is:

$$\lambda_p = \left(C_1 \pi_T + C_2 \pi_E\right) \pi_Q \pi_L$$

The model for a low-frequency diode is:

$$\lambda_p = \lambda_b \, \pi_T \, \pi_S \, \pi_C \, \pi_Q \, \pi_E$$

where:

$\lambda_p$ = Part failure rate
$C_1$ = Die complexity failure rate
$C_2$ = Package failure rate
$\pi_T$ = Temperature factor
$\pi_E$ = Environment factor
$\pi_Q$ = Quality factor
$\pi_L$ = Learning factor
$\lambda_b$ = Base failure rate
$\pi_S$ = Electrical stress factor
$\pi_C$ = Contact construction factor
A great deal must be known for each component in a design to determine the above factors.
Depending on component type, the model may include other multiplicative factors as well. Assembly and system
failure rates are simply the sum of the part-level and assembly-level failure rates, respectively.
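The two parts stress models named above can be sketched in Python as follows. The sketch assumes the model forms λp = (C1·πT + C2·πE)·πQ·πL for microcircuits and λp = λb·πT·πS·πC·πQ·πE for low-frequency diodes. The function names and all input values are hypothetical; in practice each factor is looked up or computed from the handbook for the specific part.

```python
def microcircuit_failure_rate(c1, c2, pi_t, pi_e, pi_q, pi_l):
    """Assumed form: lambda_p = (C1 * pi_T + C2 * pi_E) * pi_Q * pi_L."""
    return (c1 * pi_t + c2 * pi_e) * pi_q * pi_l


def low_frequency_diode_failure_rate(lambda_b, pi_t, pi_s, pi_c, pi_q, pi_e):
    """Assumed form: lambda_p = lambda_b * pi_T * pi_S * pi_C * pi_Q * pi_E."""
    return lambda_b * pi_t * pi_s * pi_c * pi_q * pi_e


def assembly_failure_rate(part_failure_rates):
    """An assembly failure rate is the sum of its part failure rates."""
    return sum(part_failure_rates)


# Hypothetical factor values, chosen only to show the arithmetic.
micro = microcircuit_failure_rate(c1=0.02, c2=0.005, pi_t=1.5, pi_e=4.0,
                                  pi_q=2.0, pi_l=1.0)
diode = low_frequency_diode_failure_rate(lambda_b=0.004, pi_t=2.0, pi_s=0.5,
                                         pi_c=1.0, pi_q=5.0, pi_e=6.0)
assembly = assembly_failure_rate([micro, diode])
print(f"microcircuit={micro:.3f}, diode={diode:.3f}, assembly={assembly:.3f}")
```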
Telcordia
The Telcordia standard, Reliability Prediction Procedure for Electronic Equipment, was
developed by Telcordia Technologies Inc. It originated from the Bellcore standard developed by AT&T Bell
Laboratories. Focusing on equipment for the telecommunications industry, Bell Labs modified the MIL-HDBK-217
equations to better fit their field data. The most recent version is Telcordia I, issued May 2001.
Failure Rates for Devices
The basis of the Telcordia math models for devices is referred to as the Black Box Technique.
This parts count method defines a black box steady-state failure rate, $\lambda_{BB}$, for different device types as:

$$\lambda_{BB} = \lambda_G \, \pi_Q \, \pi_S \, \pi_T$$

where:

$\lambda_G$ = Generic steady-state failure rate for the particular device
$\pi_Q$ = Quality factor
$\pi_S$ = Electrical stress factor
$\pi_T$ = Temperature factor
The above inputs, contained in the Telcordia standard, were obtained from statistical data
collected over a number of years. For instances where temperature and electrical stress are unknown,
Telcordia recommends using a value of 1 for $\pi_S$ and $\pi_T$, which assumes electrical stress to be 50% of the rated value and temperature to
be 40 °C. The steady-state device failure rate, $\lambda_{SS}$, considers an adjustment to the black box failure rate depending upon the
availability of laboratory and field data and device burn-in. For the simplest case where no data is
available, $\lambda_{SS} = \lambda_{BB}$.
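A minimal Python sketch of the black box calculation follows, assuming the product form of generic failure rate, quality factor, electrical stress factor, and temperature factor. The stress and temperature factors default to 1, matching the recommendation for unknown conditions. The function name and input values are hypothetical.

```python
def black_box_failure_rate(lambda_g, pi_q, pi_s=1.0, pi_t=1.0):
    """Black box steady-state failure rate: lambda_G * pi_Q * pi_S * pi_T.

    pi_s and pi_t default to 1.0, the value recommended when electrical
    stress and temperature are unknown (50% stress, 40 degrees C).
    """
    return lambda_g * pi_q * pi_s * pi_t


# Hypothetical inputs; real values come from the Telcordia tables.
unknown_conditions = black_box_failure_rate(lambda_g=10.0, pi_q=2.0)
known_conditions = black_box_failure_rate(lambda_g=10.0, pi_q=2.0,
                                          pi_s=1.3, pi_t=1.5)

# With no laboratory data, field data, or burn-in, the steady-state device
# failure rate is taken to equal the black box value.
lambda_ss = known_conditions
print(unknown_conditions, round(known_conditions, 3))
```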
Failure Rates for Units
Units consist of numerous devices and, in keeping with the prediction paradigm, the failure
rate for a unit is generally the sum of the failure rates of all the devices it contains. In the Telcordia standard,
a parts count steady-state failure rate for units, $\lambda_{PC}$, is defined by:

$$\lambda_{PC} = \pi_E \sum_{i=1}^{n} N_i \, \lambda_{SS_i}$$

where:

$\lambda_{SS_i}$ = The steady-state device failure rate of device $i$
$\pi_E$ = The unit's environmental factor
$N_i$ = The quantity of device type $i$
$n$ = The number of device types in the unit
As with devices, the unit's steady-state failure rate, $\lambda_{SS}$, considers an adjustment to this parts count failure rate depending
on the availability of laboratory and field data and device burn-in. For the simplest case where no data
is available, $\lambda_{SS} = \lambda_{PC}$.
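The unit-level parts count can be sketched in Python as below, assuming the unit failure rate is the environmental factor times the quantity-weighted sum of device steady-state failure rates. The function name and the numbers are hypothetical.

```python
def unit_parts_count_failure_rate(devices, pi_e):
    """Return pi_E * sum(N_i * lambda_SS_i) over the device types in a unit.

    devices: iterable of (quantity, steady_state_device_failure_rate).
    """
    return pi_e * sum(n * lam_ss for n, lam_ss in devices)


# Hypothetical unit: three devices of one type and one of another.
devices = [(3, 20.0), (1, 39.0)]
lambda_pc = unit_parts_count_failure_rate(devices, pi_e=2.0)

# Simplest case (no lab data, field data, or burn-in): lambda_SS = lambda_PC.
lambda_ss_unit = lambda_pc
print(lambda_ss_unit)
```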
Failure Rate for the System
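The system-level failure rate, $\lambda_{SYS}$, is simply the sum of all failure
rates of the units contained in a system:

$$\lambda_{SYS} = \sum_{j} \lambda_{SS_j}$$

where $\lambda_{SS_j}$ is the steady-state failure rate of unit $j$ and the sum runs over all units in the system.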
First-Year Multipliers
The first-year multiplier, $\pi_{FY}$, is defined in Telcordia at the device,
unit, and system levels as the ratio of the failure rate in the first year of operation to the steady-state
failure rate. Although it is not included in the steady-state failure rate calculations, the first-year
multiplier can be used to estimate the failure rate of these items during the first year of operation, or the
infant mortality period. For devices, the first-year multiplier calculation depends on burn-in time, device
stress, and burn-in temperature at the device, unit, and system levels. First-year multipliers for units
and the system are calculated as weighted averages of the first-year multipliers at the device and unit
levels, respectively.
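Because the first-year multiplier is defined as a ratio, the first-year failure rate is the multiplier times the steady-state failure rate. The Python sketch below shows that arithmetic and one way a unit-level multiplier could be formed as a weighted average of device multipliers. Weighting each device by its quantity times its steady-state failure rate is an assumption made here so that the unit ratio stays consistent with the definition; the standard itself should be consulted for the exact procedure. All names and numbers are hypothetical.

```python
def first_year_failure_rate(pi_fy, lambda_ss):
    """First-year failure rate implied by the ratio definition of pi_FY."""
    return pi_fy * lambda_ss


def unit_first_year_multiplier(devices):
    """Weighted average of device first-year multipliers.

    devices: list of (quantity, steady_state_rate, first_year_multiplier).
    Assumption: each device is weighted by quantity * steady-state rate.
    """
    weights = [n * lam_ss for n, lam_ss, _ in devices]
    weighted_sum = sum(w * pi_fy for w, (_, _, pi_fy) in zip(weights, devices))
    return weighted_sum / sum(weights)


# Hypothetical devices in one unit.
devices = [(2, 10.0, 4.0), (1, 20.0, 1.0)]
unit_pi_fy = unit_first_year_multiplier(devices)
unit_lambda_ss = sum(n * lam_ss for n, lam_ss, _ in devices)
unit_first_year = first_year_failure_rate(unit_pi_fy, unit_lambda_ss)
print(unit_pi_fy, unit_first_year)
```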
PRISM
Developed in the 1990s by the Reliability Analysis Center (RAC) under contract with the U.S.
Air Force, the PRISM methodology offers some different approaches to reliability predictions. PRISM considers
the inherent failure mechanisms of the physical components comprising a system as well as other factors
affecting the reliability at a system or assembly level. These factors include processing variables,
empirical data from a predecessor system, and field or test data.
PRISM RACRates Models
In PRISM, the math models describing component failure rates are known as RACRates models.
These models are the foundation of the PRISM analysis. PRISM provides RACRates models for capacitors,
diodes, integrated circuits, resistors, thyristors, transistors, and software. The model for each of these
component types is different. For example, the part failure rate, $\lambda_P$, for capacitors in
failures per million calendar hours is:

$$\lambda_P = \pi_G \, \pi_C \left(\lambda_{OB} \, \pi_{DCO} \, \pi_{TO} \, \pi_S \, \pi_{SR} + \lambda_{EB} \, \pi_{DCN} \, \pi_{TE} + \lambda_{TCB} \, \pi_{CR} \, \pi_{DT}\right) + \lambda_{SJ} + \lambda_{EOS}$$

where:

$\lambda_{OB}$ = Operating base failure rate
$\lambda_{EB}$ = Non-operating (or environmental) base failure rate
$\lambda_{TCB}$ = Temperature cycling base failure rate
$\lambda_{SJ}$ = Failure rate due to solder joints
$\lambda_{EOS}$ = Failure rate due to electrical overstress
$\pi_G$ = Reliability growth factor
$\pi_C$ = Capacitance value factor
$\pi_{DCO}$ = Operating duty cycle factor
$\pi_{TO}$ = Operating temperature acceleration factor
$\pi_S$ = Operating electrical stress factor
$\pi_{SR}$ = Operating series resistance factor
$\pi_{DCN}$ = Non-operating duty cycle factor
$\pi_{TE}$ = Non-operating temperature environmental factor
$\pi_{CR}$ = Temperature cycling rate acceleration factor
$\pi_{DT}$ = Temperature cycling ΔT acceleration factor
For components not having RACRates models, PRISM provides Non-electronic Parts Reliability
and Electronic Parts Reliability Databooks (NPRD-95/EPRD-97). A multitude of part types can be found in
these databooks with failure rates for various environments. However, the additional features explained
below are not applicable to NPRD/EPRD parts.
PRISM Process Grades
The general PRISM failure rate of a system, $\lambda_{PSYS}$, is:

$$\lambda_{PSYS} = \pi_{PG} \sum_{i} \lambda_{P_i} + \lambda_{SW}$$

where:

$\pi_{PG}$ = Process grade
$\lambda_{P_i}$ = RACRates failure rate of the $i$th component
$\lambda_{SW}$ = RACRates failure rate of software

The process grade, $\pi_{PG}$, is a multiplicative scoring factor generated from a series of
questions that specifically apply to the system under analysis. The general principle behind process grades
is to adjust the predicted failure rate by accounting for other factors which can contribute to reliability
such as the manufacturing process and design engineer experience, which can be difficult to quantify. Two
process grade models are available: the Inherent model and the Logistics model. The Inherent model considers
only those processes that relate to system design and build. The Logistics model additionally takes into
account external conditions affecting reliability in the field. Generally, process grades may consider part
quality, infant mortality, environmental conditions, design techniques, reliability growth, manufacturing
practices, system management, induced failure from external stresses, defects that cannot be duplicated, and
wearout.
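The effect of the process grade can be sketched in Python as follows. The sketch assumes the process grade multiplies the summed component failure rates and that the software failure rate is added afterwards; that arrangement, the function name, and the numbers are illustrative assumptions rather than the definitive PRISM procedure.

```python
def prism_system_failure_rate(process_grade, component_rates, software_rate=0.0):
    """Assumed form: pi_PG * sum(lambda_P_i) + lambda_SW."""
    return process_grade * sum(component_rates) + software_rate


# Hypothetical RACRates component failure rates (failures per million hours).
components = [0.5, 1.5, 2.0]

# A process grade below 1 lowers the prediction; above 1 raises it.
good_process = prism_system_failure_rate(0.8, components, software_rate=0.3)
poor_process = prism_system_failure_rate(1.5, components, software_rate=0.3)
print(round(good_process, 3), round(poor_process, 3))
```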
Predecessor Analysis
Predecessor analysis is used to capture the evolutionary nature of design. It allows you to
use empirical data collected for a predecessor product that is similar to an assembly or the entire system
under scrutiny. Predecessor analysis produces a combined failure rate based on the predicted failure rate
of the current and predecessor products, and the field failure rate of the predecessor product.
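One plausible way to combine the three inputs named above is to scale the new product's prediction by how the predecessor's field experience compared with its own prediction. The Python sketch below assumes that ratio form; it is an illustration of the idea, not necessarily the exact PRISM calculation, and the names and values are hypothetical.

```python
def predecessor_adjusted_rate(predicted_new, predicted_predecessor,
                              field_predecessor):
    """Scale the new prediction by the predecessor's field-to-predicted ratio.

    Assumption: lambda_new = predicted_new * (field_pred / predicted_pred).
    """
    if predicted_predecessor <= 0:
        raise ValueError("predecessor prediction must be positive")
    return predicted_new * (field_predecessor / predicted_predecessor)


# Hypothetical case: the predecessor performed better in the field than
# predicted, so the new product's prediction is adjusted downward.
adjusted = predecessor_adjusted_rate(predicted_new=120.0,
                                     predicted_predecessor=100.0,
                                     field_predecessor=80.0)
print(round(adjusted, 3))
```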
Bayesian Analysis
Bayesian analysis is used in PRISM as a method to integrate test or field data into the
assembly-level or system-level calculations. The data is used to optimize the reliability prediction
accounting for variables from a real system that cannot be included in models. The empirical data is
weighted depending on the size of the dataset; thus, the larger the dataset, the more it contributes
to the failure rate.
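The weighting behaviour described above can be illustrated with a standard gamma-Poisson conjugate update, in which the prediction acts as a prior worth a chosen number of pseudo-hours. This is a generic Bayesian sketch under that assumption, not the specific PRISM algorithm; the prior weight, names, and data are hypothetical.

```python
def bayesian_failure_rate(predicted_rate, prior_hours, failures, observed_hours):
    """Posterior mean failure rate from a gamma prior and Poisson failures.

    The prior is treated as `prior_hours` of pseudo-observation at the
    predicted rate; hours are in millions to match rates per million hours.
    """
    prior_failures = predicted_rate * prior_hours
    return (prior_failures + failures) / (prior_hours + observed_hours)


predicted = 10.0  # hypothetical prediction, failures per million hours

# Both datasets show an observed rate of 2 failures per million hours.
small = bayesian_failure_rate(predicted, 1.0, failures=2, observed_hours=1.0)
large = bayesian_failure_rate(predicted, 1.0, failures=20, observed_hours=10.0)
print(round(small, 3), round(large, 3))
```

With the larger dataset the result moves much closer to the observed rate, which is the behaviour the text describes.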
RDF 2000
RDF 2000 is a French telecommunications standard that was developed by the Union Technique de
l'Électricité (UTE). The previous version of RDF 2000 is referred to as CNET 93. The most recent release
was in July 2000. RDF 2000 provides a unique approach to failure rate predictions in that it does not
provide a parts count prediction. Rather, component failure rate is defined in terms of an empirical expression
containing a base failure rate multiplied by factors influenced by mission profiles. These mission profiles
contain information about operational cycling and thermal variations during various "working" phases. For
example, the component failure rate, $\lambda$, for Class I ceramic capacitors with $y$ operational cycles and
$j$ thermal variations is:

$$\lambda = 0.05 \times \left\{ \frac{\sum_{i=1}^{y} (\pi_t)_i \, \tau_i}{\tau_{on} + \tau_{off}} + 3.3 \times 10^{-3} \times \sum_{i=1}^{j} (\pi_n)_i \, (\Delta T_i)^{0.68} \right\} \times 10^{-9} \text{ /h}$$

where:

$(\pi_t)_i$ = $i$th temperature factor related to the $i$th junction temperature of the capacitor mission profile. This factor is determined by an Arrhenius equation with a specified activation energy. It is a function of ambient temperature during the $i$th working phase.
$\tau_i$ = $i$th working time ratio of the capacitor for the $i$th junction temperature of the mission profile. This is the ratio of time per year that the capacitor is "on" at a particular junction temperature.
$\tau_{on}$ = Total working time ratio of the capacitor
$\tau_{off}$ = Time ratio for the capacitor being in storage (dormant) mode
$(\pi_n)_i$ = $i$th influence factor related to the annual number of thermal cycles seen by the package, with the amplitude $\Delta T_i$. This factor is an empirical expression based on the number of cycles per year, $n_i$, with thermal variation having the given thermal amplitude.
$\Delta T_i$ = $i$th thermal amplitude variation of the mission profile.
Similar models exist for many component types in the RDF 2000 standard. The failure rate of the
system is determined by summing all of the component failure rates.
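To make the mission-profile idea concrete, the Python sketch below computes an Arrhenius temperature factor for each working phase and combines the phases weighted by their working time ratios. It covers only the temperature part of such a model, and it assumes the weighted sum is normalized by the total of the on and off time ratios. The activation energy, reference temperature, function names, and profile values are assumptions chosen for illustration, not values quoted from the standard.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_factor(temp_c, activation_energy_ev=0.1, ref_temp_c=30.0):
    """Arrhenius acceleration factor relative to a reference temperature.

    activation_energy_ev and ref_temp_c are illustrative defaults.
    """
    t = temp_c + 273.15
    t_ref = ref_temp_c + 273.15
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_ref - 1.0 / t)
    return math.exp(exponent)


def time_weighted_temperature_term(phases, tau_off):
    """Combine per-phase temperature factors using working time ratios.

    phases: list of (ambient_temp_c, working_time_ratio).
    Assumed form: sum(pi_t_i * tau_i) / (tau_on + tau_off).
    """
    tau_on = sum(tau for _, tau in phases)
    weighted = sum(arrhenius_factor(temp_c) * tau for temp_c, tau in phases)
    return weighted / (tau_on + tau_off)


# Hypothetical mission profile: half the year at 30 C, a quarter at 60 C,
# and the remaining quarter dormant.
profile = [(30.0, 0.5), (60.0, 0.25)]
term = time_weighted_temperature_term(profile, tau_off=0.25)
print(round(term, 4))
```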
Other Standards
NSWC-98/LE1: The Handbook of Reliability Prediction Procedures for Mechanical
Equipment, developed by the Naval Surface Warfare Center, contains reliability models for mechanical
devices such as springs, bearings, seals, motors, and brakes. The most recent version of this standard
was released in September 1998.
HRD5: The Handbook of Reliability Data, published by British Telecom of the
United Kingdom, is based on the former RDF 2000 standard, CNET 93. The most recent version was released in
1994.
GJB/z 299B: The Reliability Calculation Model for Electronic Equipment
is a Chinese standard translated into English by Beijing Yuntong Forever Sci.-Tech. Co. Ltd in May 2001.
This standard was developed for the Chinese military, and it is very similar to MIL-HDBK-217. It includes
both parts count and parts stress prediction methodologies.
Relex Reliability Prediction
Relex Reliability Prediction can use any of the standards described in this document to assess
the reliability of your system design. Relex's superior integration enables you to mix calculation models
within a single project, allowing you to select the model most appropriate to each part or component in your
design. A comprehensive matrix of the model coverage for the four most widely used standards can be found at
www.relex.com/resources/art/art_predmodels.asp.
With Relex Reliability Prediction, the additional analysis capabilities available in one model can typically
be applied to any model you use. For example, Telcordia calculation methods for taking field and test data
into account in the prediction can also be used with MIL-HDBK-217. For additional product information, refer to
www.relex.com/products/relpredsoft.asp.