Customer Case Study
Increasing Business Success in Today's Service Provider Market

Minimizing Service Outages Maximizes Profitability and Ensures Success

One of the most challenging elements in today's service provider market is to achieve the profit margin goals established as business objectives. To handle the ever-increasing pressure to achieve higher profit margins, service providers must look at increasing revenues and reducing costs in order to maximize their profits.

Ensuring that revenues remain high means that service outages must be kept to a minimum, since service outages directly translate into revenue losses. This can easily be seen in the following example.

A service provider bills an average of $0.05 per call per minute. Assume that the provider suffers a service outage on a switch with 160 T1 lines during a period when switch usage would have been 75%. A single T1 circuit carries 24 voice channels, and 2 channels are required for a single voice call (one incoming, one outgoing). In this case, the revenue loss per minute is equal to:

       Loss/minute = (160 T1 lines) * (24 channels/line) / (2 channels/call) * 0.75 * $0.05

When that switch is out of service, revenue loss occurs at the rate of $72.00 per minute. Table A demonstrates how quickly this loss grows.

Service Outage    Revenue Loss
5 minutes         $360.00
1 hour            $4,320.00
2 hours           $8,640.00
4 hours           $17,280.00
Table A: Revenue Losses Due to Service Outages
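
The arithmetic behind Table A can be reproduced with a short script. The following is a minimal Python sketch and not part of the original analysis. The function name revenue_loss_per_minute and its parameter names are illustrative assumptions. The figures (160 T1 lines, 24 channels per line, 2 channels per call, 75% usage, $0.05 per call per minute) come from the example above.

```python
# Hypothetical helper: the function and parameter names are illustrative only.
def revenue_loss_per_minute(t1_lines, utilization, rate_per_call_minute,
                            channels_per_line=24, channels_per_call=2):
    """Revenue lost per minute of outage, following the example's formula."""
    # Number of calls the switch would have been carrying during the outage.
    concurrent_calls = (t1_lines * channels_per_line / channels_per_call) * utilization
    return concurrent_calls * rate_per_call_minute


loss = revenue_loss_per_minute(t1_lines=160, utilization=0.75,
                               rate_per_call_minute=0.05)

# Scale the per-minute loss to the outage durations listed in Table A.
for label, minutes in [("5 minutes", 5), ("1 hour", 60),
                       ("2 hours", 120), ("4 hours", 240)]:
    print(f"{label:<10} ${loss * minutes:,.2f}")
```

Changing the utilization or billing rate shows how sensitive the loss is to traffic assumptions. The other inputs are fixed properties of the T1 example.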

Beyond the loss of revenue, other serious consequences of service outages may include increased customer service costs, loss of customers, contractual penalties, and litigation fees. All of these result in further losses. Intangibles are also at stake, such as a poor reputation in the marketplace and the loss of future opportunities.

Service outages can be mitigated or prevented by selecting high-quality equipment. In the telecommunications industry, such equipment is described by terms such as Carrier Grade, Carrier Class, or Telco Grade, and these designations are generally accepted as assurance that the products are designed to meet the quality objectives needed to maintain profitability. To understand how equipment achieves the Carrier Grade designation, the reliability and availability of the product must be analyzed.

Reliability

One of the best ways to express reliability quantitatively is to study equipment failure data and derive figures for MTBF (Mean Time Between Failures) (see Note 1). Telcordia's Reliability Prediction Procedure TR-332 is a well-recognized method of predicting equipment MTBF figures for new product designs (see Note 2).

Using TR-332, MTBF values were calculated for the major components of the EdgeIQ Intelligent Media Gateway manufactured by Versatel Networks, a leader in telecommunications equipment. Table B shows the MTBF values for each of the product components.

Component                               MTBF
Main CPU SBC (Single Board Computer)    150,000 hrs
Main CPU SBC I/O                        500,000 hrs
Power supply unit                       300,000 hrs
Fans                                    300,000 hrs
Backplane                               1,000,000 hrs
VoIP I/F card                           111,894 hrs
T1/E1 I/F card                          85,692 hrs
Table B: MTBF Values of Versatel Networks' EdgeIQ Intelligent Media Gateway Components

An MTBF value of 43,800 hours, or 5 years, does not indicate that the component will operate continually for 5 years and then fail at hour 43,800. The MTBF is a statistical value used to determine the probability of failure. From the MTBF, the probability that the component will operate without failure for a given time period is given by (see Note 3):

       R(t) = e^{-t/MTBF}

              Where
                     t is the period of time of interest
                     MTBF is the calculated MTBF
                     R(t) is the reliability function

An example calculation follows:

Given that the MTBF for the Main CPU SBC is 150,000 hours, what is the probability that this card will operate without failure for 5 years (43,800 hours)?

Using the equation above:

       R(t) = e^{-43,800/150,000} = 74.68%

The reliability, or probability of successful operation, varies depending on the time period in question. Table C shows the reliability at different times:

Time         Reliability
2.5 years    86.42%
5 years      74.68%
10 years     55.77%
Table C: Reliability Values for Differing Time Intervals
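
As a cross-check, the values in Table C can be recomputed from the reliability function. The sketch below is illustrative only. The function name reliability and the constant HOURS_PER_YEAR are assumptions introduced here. The constant is set to 8,760 hours to match the document's equivalence of 5 years and 43,800 hours.

```python
import math

# Assumption: one year = 365 days * 24 hours, so that 5 years = 43,800 hours.
HOURS_PER_YEAR = 8760


def reliability(t_hours, mtbf_hours):
    """Probability of operating without failure for t_hours.

    Assumes exponentially distributed times to failure (see Note 3).
    """
    return math.exp(-t_hours / mtbf_hours)


main_cpu_mtbf = 150_000  # Main CPU SBC, from Table B

for years in (2.5, 5, 10):
    r = reliability(years * HOURS_PER_YEAR, main_cpu_mtbf)
    print(f"{years:>4} years  {r:.2%}")
```

Passing a period equal to the MTBF itself, for example reliability(150_000, 150_000), returns e^{-1}, or roughly 36.79%, whatever the MTBF value is.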

Another value to consider is the probability that a component is operational after a period of time equal to the MTBF. This is a constant value that can be computed using the reliability equation:

       R(t) = e^{-MTBF/MTBF} = e^{-1} = 36.79%

This indicates that the probability that the Main CPU SBC, or any other component, will operate without failure for a period equal to its MTBF is 36.79%.

Logically speaking, for an individual component, a higher MTBF value equates to better reliability. However, the influence of any one component's MTBF on overall system reliability varies. The overall MTBF of a system of n components, all of which must operate for the system to operate, is given by the equation:

MTBF_{sys} = (1/MTBF_1 + 1/MTBF_2 + ... + 1/MTBF_n)^{-1}

       Where
              MTBF_{sys} is the total system MTBF
              MTBF_n is the MTBF of component n

In general terms, the overall MTBF of the system is lowered each time a component is added to the system. However, the impact of additional components can have more significance in some situations.

Consider the following example:

From Table B, the SBC is made up of the Main CPU board and the I/O board. The total MTBF is:

MTBF_{SBC} = (1/MTBF_{Main CPU} + 1/MTBF_{I/O Board})^{-1}

Or

MTBF_{SBC} = (1/150,000 + 1/500,000)^{-1} = 115,385 hours

The probability that the SBC will operate without failure for 5 years is 68.41%, versus 74.68% for the Main CPU by itself. This demonstrates that combining multiple components can have a considerable effect on overall reliability. To ensure that the overall system reliability is acceptable, manufacturers strive toward higher and higher MTBF values on individual system components.
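
The SBC figures above can be reproduced in a few lines. This is a minimal sketch under the same assumptions as the text (every component is required for operation, and failure times are exponentially distributed). The helper name series_mtbf is hypothetical.

```python
import math


def series_mtbf(*component_mtbfs):
    """Combined MTBF when every listed component is needed for operation.

    Failure rates (1/MTBF) add, and the system MTBF is the reciprocal of the sum.
    """
    return 1.0 / sum(1.0 / mtbf for mtbf in component_mtbfs)


# Main CPU board and I/O board values from Table B.
sbc_mtbf = series_mtbf(150_000, 500_000)

# Probability that the combined SBC runs for 5 years (43,800 hours) without failure.
r_5yr = math.exp(-43_800 / sbc_mtbf)

print(f"SBC MTBF:            {sbc_mtbf:,.0f} hours")
print(f"5-year reliability:  {r_5yr:.2%}")
```

Because the function accepts any number of values, passing more components shows how each addition lowers the combined MTBF.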

Another common strategy for increasing overall system MTBF is redundancy. Redundancy means adding one or more duplicate components that perform the same functions as the primary component, so that if the primary component fails, one of the redundant components takes over. For two identical components with exponentially distributed failure times and no repair, a standard result is that the MTBF of a system with a primary component and a single redundant backup is 50% higher than that of a single component. This arrangement is commonly referred to as a hot/standby or '1+1' configuration. (It is also referred to as a 1-out-of-2 configuration, which means that of the 2 components, only 1 is needed for operation.)

For example, in a redundant SBC configuration where each SBC has an MTBF of 115,385 hours, the redundant configuration would have an MTBF of 173,078 hours. This is a common strategy used for power supplies, fans, controllers, and many other items with high uptime requirements.
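
The 50% figure can be checked numerically. The sketch below rests on assumptions that the text does not state in full: two identical, independent units, both powered and subject to failure, exponentially distributed failure times, no repair, and perfect switchover. Under those assumptions the pair fails only when both units have failed, and the MTBF equals the integral of the reliability function over all time. The function names are illustrative.

```python
import math


def redundant_reliability(t, mtbf):
    """R(t) for a 1-out-of-2 pair: the pair is down only if both units have failed."""
    unit_unreliability = 1.0 - math.exp(-t / mtbf)
    return 1.0 - unit_unreliability ** 2


def mtbf_by_integration(reliability_fn, mtbf, steps=200_000, horizon_factor=40):
    """Approximate the integral of R(t) from 0 to infinity with the trapezoid rule.

    The integral is truncated at horizon_factor * mtbf, where R(t) is negligible.
    """
    horizon = horizon_factor * mtbf
    dt = horizon / steps
    total = 0.5 * (reliability_fn(0.0, mtbf) + reliability_fn(horizon, mtbf))
    for i in range(1, steps):
        total += reliability_fn(i * dt, mtbf)
    return total * dt


sbc_mtbf = 115_385  # single SBC, from the earlier example

closed_form = 1.5 * sbc_mtbf
numerical = mtbf_by_integration(redundant_reliability, sbc_mtbf)

print(f"Closed form (1.5 x MTBF): {closed_form:,.1f} hours")
print(f"Numerical integration:    {numerical:,.1f} hours")
```

Both values should agree closely with the 173,078 hours quoted above. If repair of a failed unit were modeled, the effective figure would be higher, which is why availability is treated separately.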

A redundant configuration with one SBC failure would be expected to continue operation for an extended period of time before a failure of the redundant SBC would render it inoperable. However, in reality, repair of the failed SBC component would be likely to occur before the second SBC failed. In situations like this, repair of failed components must also be taken into account when looking at overall system uptime. In these cases, the measurement of availability is often used.

Availability

While reliability is the probability that a system operates without failure over a given period, availability can be thought of as the percentage of time a system is available for use. Availability therefore takes into account failures and repairs of the system that contribute to non-operational time. Availability is usually expressed as a percentage, given by the equation:

Availability % = MTBF_{sys} / (MTBF_{sys} + MTTR_{sys}) * 100

       Where
              MTBF_{sys} is the system Mean Time Between Failures, as previously described
              MTTR_{sys} is the system Mean Time To Repair

MTTR is a measure of serviceability that represents the mean downtime required to perform repairs and maintenance. This number should reflect all activities that affect the mean time required to restore service operation, such as dispatching service personnel, waiting for replacement parts, fault isolation, and replacing faulty components.

From the equation above, it is apparent that as the MTTR approaches 0, availability approaches 100%, regardless of the system MTBF. This is why engineers focus on technologies such as redundancy and hot-swap repairs (where a repair can be done "on-line" while the system is still operational) to maintain a high level of system availability.

A commonly used term associated with systems of high availability or uptime is "five nines." Five nines refers to systems with an availability of 99.999%, which is often considered the minimum acceptable availability for telecom systems. It translates to 5.26 minutes of downtime per year.
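
The relationship between MTBF, MTTR, availability, and yearly downtime can be explored with a short script. This is a sketch only. The function names are illustrative, and the MTTR values in the loop are hypothetical inputs chosen to show the trend, not measurements of any product.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is available for use."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail):
    """Expected downtime per year for a given availability fraction."""
    return (1.0 - avail) * MINUTES_PER_YEAR


# Five nines corresponds to a little over five minutes of downtime per year.
print(f"99.999% -> {downtime_minutes_per_year(0.99999):.2f} minutes/year")

# Hypothetical MTTR values (hours) applied to the 115,385-hour SBC MTBF.
for mttr in (4.0, 1.0, 0.25):
    a = availability(115_385, mttr)
    print(f"MTTR {mttr:>5} h -> availability {a:.5%}, "
          f"{downtime_minutes_per_year(a):.2f} minutes/year")
```

Shortening the repair time raises availability even though the MTBF is unchanged, which is the effect the equation above describes.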

Here are some examples of how Versatel Networks' EdgeIQ Intelligent Media Gateway can help meet the five nines objective:

  • No single point of failure within an EdgeIQ Intelligent Media Gateway.
  • No impact on traffic when switching between redundant components.
  • Minimal use of shared resources.
  • All replaceable parts are hot-swappable.
  • Automatic component diagnostics or Built In Test Equipment (BITE).
  • Real-time diagnostic reporting.

Operation and Maintenance

Along with evaluating the important measurable values that impact your overall business goal of increased profitability, there are other factors to consider. For example, the ease of reporting failures to service personnel and tracking repair progress are critical for minimizing service response time, and therefore will ultimately impact system availability. In a similar way, accessible repair crews are also a factor that impacts availability. It is also important to track information so that personnel can become more knowledgeable at problem identification and resolution.

For a real-world example of ensuring smooth operation and maintenance, the following list details some of the attributes of the Versatel Networks EdgeIQ Intelligent Media Gateway that improve operations and maintenance to ensure high availability:

  • Easy installation of components.
  • Use of COTS (Commercial Off-The-Shelf) components instead of proprietary designs to benefit from vendor improvement in CPU price/performance/reliability.
  • Fully featured OAMP (Operations, Administration, Maintenance, and Provisioning) such as remote access, multi-user concurrent access, and an application program interface (API).
  • Extensive fault detection systems including the use of heart beat mechanisms, built-in tests, and diagnostics.
  • Flexible fault reporting.
  • Performance reporting.
  • Emergency technical support 24/7.

Physical

Lastly, physical system requirements are important in the telecommunications industry and must be taken into account to ensure that equipment operates reliably and safely in a central office environment, without adverse effect on network operation. To this end, service providers often rely on two key Telcordia documents that establish the physical/environmental and electromagnetic/electrical safety requirements that network equipment must meet:

  • GR-63-CORE "NEBS- Network Equipment Building System Generic Requirement"
  • GR-1089-CORE "Electromagnetic Compatibility & Electrical Safety - Generic Criteria for Network Telecommunications Equipment"

Meeting these standards must be a diligent effort. NEBS compliance must be taken into account from day one of design, and must be an established objective throughout the product lifecycle. Additionally, OEM components must be selected based on NEBS compliance. Testing beyond the NEBS requirements often will help to ensure that NEBS compliance is achieved.

For details on how Versatel Networks achieves NEBS compliance, please refer to Table D at the end of this document.

Summary

Service providers in the process of selecting a platform on which to build new and existing services should consider some key questions:

  • What elements of the system are redundant?
  • Are all critical operating elements covered by redundancy?
  • Are all critical operating elements repairable without powering down the system?
  • What are my options in selecting a redundant versus non-redundant system?
  • What are my options in converting a non-redundant system into a redundant system?
  • Are there service provisioning or scheduled maintenance tasks that require service outages in order to perform them?
  • Do software upgrades require a service outage?
  • What is the service impact when adding more features to the system?

Answers to these questions and the resulting decisions will impact the reliability and availability of equipment. By factoring in these important measurements, service providers will help to ensure key business objectives are met.

By selecting Carrier Grade, NEBS-compliant equipment, service providers can be assured that service outages are minimized. Versatel Networks' EdgeIQ Intelligent Media Gateway platform is an example of equipment that meets these goals and ultimately maximizes profitability and ensures success.

Notes

  1. MTBF is the average time expected between failures of a repairable component over some specified time period. MTTF is the average time expected to the failure of a non-repairable component. However, to avoid confusion, MTBF is often used for both cases. This document adopts this convention.
  2. Telcordia was formerly known as Bellcore. Both terms are commonly used.
  3. This equation assumes an exponential distribution for the times of failures.

Component    Certifications
IQ4000       NEBS: Level 3 tested per Telcordia SR-3580
             EMC: FCC Part 15 Class A, EN55022 Class A, AS3548 Class A, VCCI Class A, CNS13438
             Safety: UL60950, CAN/CSA C22.2 No. 60950-00, EN60950, CSA C/US and CE Marks, AS3260
IQ1500       NEBS: Designed to NEBS Level 3 per Telcordia SR-3580
             ETSI EN 300-019-2-1
             ETSI EN 300-019-2-2
             ETSI EN 300-019-2-3
             EN 55022 Class B
             EN 61000-6-2 (EN55024)
             EN 60950 / (c)UL 60950
T1 Card      FCC Part 68 and FCC 47 Part 15
             CS03
             IEC 60950
             GR-1089 Class A and Class B
E1 Card      European Regulatory Certification CE 2002
             Radio and Telecom Terminal Equipment Directive 99/5/EEC
             Low Voltage Directive 73/23/EEC
               IEC 60950 1999 3rd edition
               EN 60950 2000 3rd edition
             Electromagnetic Compatibility Directive 89/336/EEC
               Immunity: EN 55024 1998
               Emission: EN 55022 1998 Class B, EN61000-3-2 1995, EN61000-3-3 1995
             Amending Directive 93/68/EEC
VoIP Card    FCC 47 Part 15 Class B (emission)
             Low Voltage Directive 73/23/EEC
               IEC 60950 1999 3rd edition
               EN 60950 2000 3rd edition
             EN 55022 for CE mark (Emission)
             EN 55024 for CE mark (Immunity)
             CSA C22.2 no 950 for Canada & US (safety)
Table D: Versatel Networks' NEBS Compliance Listing

About Relex Software Corporation

Relex Software Corporation is a world leader in reliability analysis software. Its products are used by thousands of engineers in a variety of businesses around the globe. In business since 1986, Relex Software Corporation produces a superior line of high-quality software tools for reliability analysis. Long recognized for their user-friendly, state-of-the-art features, the modular tools in the Relex Reliability Software Suite include an intuitive graphical user interface, support for scientific graphical charts, an enhanced CAD interface, visual system modeling with redundancy support, completely customizable output reports, extensive parts libraries, and a comprehensive online help system. For more information on Relex Software Corporation, an ISO-9001 and TickIT 2000 certified company, call 724.836.8800 or visit www.relexsoftware.com.

© Copyright 2008 Relex Software Corporation