5 Tips for an Efficient FMEDA/Quantitative FMEA

The FMEDA (Failure Modes, Effects and Diagnostic Analysis) is a proven method for analyzing fail-safe hardware circuits and is required by all current standards for functional safety. In some industries, the method is also called quantitative FMEA (Failure Modes and Effects Analysis).

Below are the most important tips for using this tool effectively and efficiently, based on our project experience in the industrial (IEC 61508, ISO 13849), automotive (ISO 26262) and aerospace (DO-254, ARP4761) sectors.

What is the Difference between FMEDA and FMEA?

Failure Modes and Effects Analysis (FMEA)

An FMEA proceeds inductively, i.e. from the cause to the effect. The question asked for each subsystem/component is: What safety-relevant "effects" can result from a failure?

For example, if this component changes its value over time (i.e. it ages), how does this affect the function? Or, if a single bit in a state machine flips, how does that affect the function?

There are two variants of the FMEA: a purely qualitative one and a quantitative one. In the latter, failure probabilities for the different failure modes of the components (short-circuit, open-circuit, drift, stuck-at, ...) are added to the analysis.

Failure Modes, Effects and Diagnostic Analysis (FMEDA)

In the industrial and automotive environment, a special case of the quantitative FMEA, the FMEDA (Failure Modes, Effects and Diagnostic Analysis), is usually performed for the electronic part.

In an FMEDA, the reduction in failure rates achieved by diagnostic mechanisms (e.g. reading back output signals) is credited directly in the analysis.
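As a rough illustration of how diagnostics are credited, the following minimal sketch splits the failure rate of one component across its failure modes and applies a diagnostic coverage (DC) to the dangerous ones. The class and function names, the 10 FIT failure rate, the 60/40 failure mode split and the 90 % coverage are all assumptions made up for this example. In a real analysis, these values come from the failure rate handbooks and standards referenced in the report.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str          # e.g. "open-circuit"
    fraction: float    # share of the component failure rate (0..1)
    dangerous: bool    # True if this mode can violate the safety goal
    dc: float = 0.0    # diagnostic coverage credited for this mode (0..1)


def classify(fit: float, mode: FailureMode) -> dict:
    """Split one failure mode into safe, dangerous detected (dd) and
    dangerous undetected (du) failure rates, all in FIT."""
    rate = fit * mode.fraction
    if not mode.dangerous:
        return {"safe": rate, "dd": 0.0, "du": 0.0}
    return {"safe": 0.0, "dd": rate * mode.dc, "du": rate * (1.0 - mode.dc)}


# Hypothetical example: a 10 FIT resistor in an output stage.
r1_fit = 10.0
r1_modes = [
    # Assumed to be detected by reading back the output signal.
    FailureMode("open-circuit", 0.6, dangerous=True, dc=0.9),
    # Assumed to lead to the safe state.
    FailureMode("short-circuit", 0.4, dangerous=False),
]

totals = {"safe": 0.0, "dd": 0.0, "du": 0.0}
for mode in r1_modes:
    for key, value in classify(r1_fit, mode).items():
        totals[key] += value

print(totals)
```

Only the dangerous undetected share remains as residual risk. Everything else has either no safety-relevant effect or is caught by the diagnostic mechanism.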

1. Start with the Basic Concept and do Not Get Lost in the Details

Safety analyses evaluate the current state of the architecture/design. They support iterative optimization until sufficient functional safety is achieved. Do not miss the window of opportunity at the beginning of the project lifecycle, when fundamental changes to the design can still be made without much effort.

This means that the FMEDA starts long before the first schematics are ready. The first analyses start at a higher level of abstraction, for example based on the first sketches of a system block diagram. Already at this stage, safety mechanisms can be designed that cover failures of entire blocks. Here, the standards offer appropriate assistance with proven architectures (e.g. diagnostic path or redundancy) and safety mechanisms (e.g. cross-comparison, CRC, ...). This can be done informally during the design. For more complex projects, it is better to perform a structured analysis of the block functions at an early stage (e.g. as a system FMEA).

However, it is advisable to postpone completing the detailed fault analysis at the level of individual components (as required by the standards). If this analysis is only created once the schematics are finished, the effort of updating it after every design change is saved.

We see the FMEA both as a design tool (left side of the V-Model: design/synthesis of the function itself and its safety mechanisms) and as a verification tool (right side of the V-Model: quantitative evaluation of the reliability metrics). The FMEA also shows how important it is to structure the system properly and to select concepts that are as simple and clear as possible (KISS: keep it simple, stupid).

2. Safety Arises from Discussion and Cooperation

Even before any structured documentation is produced, an important aspect of a safety analysis is to bring together all the experts involved. Some examples are:

  • Cross-domain expert teams: In complex systems, FMEDA workshops with experts from all involved disciplines prove their worth in competently evaluating how individual types of failures propagate throughout the system.
  • Developers of the safety functions and the diagnostic functions / safety mechanisms: Safety mechanisms are often implemented in software for the detection of hardware component faults. Here close cooperation of the hardware and software developer is required to specify and design the safety mechanisms and evaluate the achieved diagnostic coverage.
  • Author and reviewer of the FMEDA: The four-eyes principle is a proven method for verifying the analysis (e.g. checking that no faults have been forgotten, or detecting simple copy-paste errors). However, the communication effort required to clarify review findings should not be underestimated.
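
Some of the simple errors mentioned in the last point can be flagged automatically before the review meeting, so that the reviewer can concentrate on the technical content. The sketch below is one possible check, assuming a hypothetical table layout of (component, failure mode, fraction, diagnostic coverage) rows. It is a supplement to the review, not a replacement for it.

```python
from collections import defaultdict


def check_rows(rows):
    """Return a list of findings for an FMEDA table given as
    (component, failure_mode, fraction, dc) tuples."""
    findings = []
    totals = defaultdict(float)
    seen = set()
    for component, mode, fraction, dc in rows:
        # The same failure mode listed twice is a typical copy-paste error.
        if (component, mode) in seen:
            findings.append(f"{component}: duplicate failure mode '{mode}'")
        seen.add((component, mode))
        if not 0.0 <= dc <= 1.0:
            findings.append(f"{component}/{mode}: DC {dc} outside 0..1")
        totals[component] += fraction
    # Fractions that do not add up to 100 % hint at a forgotten failure mode.
    for component, total in totals.items():
        if abs(total - 1.0) > 1e-6:
            findings.append(
                f"{component}: failure mode fractions sum to {total:.2f}, expected 1.00"
            )
    return findings


# Hypothetical table with two deliberate mistakes.
rows = [
    ("R1", "open", 0.6, 0.9),
    ("R1", "short", 0.3, 0.0),   # 10 % of the failure modes are missing
    ("C1", "short", 0.5, 0.6),
    ("C1", "short", 0.5, 0.6),   # copied row, mode name not updated
]
findings = check_rows(rows)
for finding in findings:
    print(finding)
```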

A safety analysis is intended to create a deeper understanding of the circuit, including aspects that are often not considered in the normal design process. This understanding can only emerge in discussions involving all participants and is ultimately much more important than the tools or the achievement of metrics. We see the FMEA as a tool to structure the communication and create a common basis for discussion.

3. Metrics are Superficial, but nevertheless Helpful

The standards define target metrics (PFH, SFF, DC, MTTFD, SPFM, PMHF, etc.) because they provide a simple and clear definition of the acceptable residual risk. However, each metric measures only those aspects that are captured in its formula, so metrics can easily be gamed. They can therefore never replace expert engineering judgement and common sense. In addition, the quality of the base data (e.g. component failure rates) is often limited.
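
A small sketch illustrates how a metric can improve while the actual risk stays the same. It uses the safe failure fraction (SFF) in the form commonly given for IEC 61508, i.e. the share of safe and dangerous detected failures in the total failure rate. All failure rates are invented for this example.

```python
def sff(safe: float, dd: float, du: float) -> float:
    """Safe failure fraction: (safe + dangerous detected) / total."""
    return (safe + dd) / (safe + dd + du)


# Hypothetical baseline design, failure rates in FIT.
baseline = {"safe": 40.0, "dd": 50.0, "du": 10.0}

# Same design, but with extra parts in the analysis that can only fail safely.
padded = {"safe": 140.0, "dd": 50.0, "du": 10.0}

sff_baseline = sff(**baseline)
sff_padded = sff(**padded)

print(f"baseline: SFF = {sff_baseline:.2f}, dangerous undetected = {baseline['du']} FIT")
print(f"padded:   SFF = {sff_padded:.2f}, dangerous undetected = {padded['du']} FIT")
```

The SFF rises from 0.90 to 0.95, although the dangerous undetected failure rate, which is what actually matters for safety, has not changed at all.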

However, during the design process, metrics help to identify and prioritize weaknesses of the design. The Pareto principle also applies to functional safety: the measures should focus on the most important failure cases, since this is where the highest risk reduction can be achieved. Nevertheless, rare failure cases and corner cases must be considered as well, which means that a large proportion of the effort is spent on them.
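
The prioritization can be sketched as a simple ranking of the failure modes by their dangerous undetected failure rate. The function name, the 80 % threshold and the example values are assumptions for illustration only.

```python
from typing import Dict, List


def top_contributors(du_rates: Dict[str, float], threshold: float = 0.8) -> List[str]:
    """Return the largest contributors that together account for at least
    `threshold` of the total dangerous undetected failure rate."""
    total = sum(du_rates.values())
    if total <= 0.0:
        return []
    ranked = sorted(du_rates.items(), key=lambda item: item[1], reverse=True)
    selected, running = [], 0.0
    for name, rate in ranked:
        selected.append(name)
        running += rate
        if running / total >= threshold:
            break
    return selected


# Hypothetical dangerous undetected failure rates in FIT.
du_rates = {
    "R3 drift": 1.0,
    "U1 output stuck-high": 6.0,
    "C2 short-circuit": 0.5,
    "Q1 short-circuit": 2.5,
}
focus = top_contributors(du_rates)
print(focus)
```

In this example, two of the four failure modes account for 85 % of the dangerous undetected failure rate, so additional safety mechanisms are most effective there. The remaining failure modes still have to be analyzed.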

Our experience has shown that with the detailed FMEDA the metrics are usually achieved without problems, without optimization in the decimal places, provided that the FMEDA was started early (as recommended above).

Metrics that are met exactly, without any margin, often indicate that the analysis has been manipulated, so an auditor will take a closer look. The argumentation also becomes difficult if no safety mechanisms exist for central parts of the safety function.

If a metric is narrowly missed, it is worthwhile to evaluate this in the overall system context. There is often room for maneuver, so the shortfall can nevertheless be accepted.

4. Make Clear What is Analyzed (and What is Not)

The FMEDA is necessary but not sufficient to ensure the safety of a hardware design. An FMEDA analyzes the behavior of the circuit in the event of a single component failure in the field, i.e. random hardware failures. It does not cover, for example:

  • Errors in the design of the circuit (systematic hardware errors, e.g. calculation errors in the design of a resistor)
  • Errors in software development (systematic software errors, e.g. overflow of a variable in fixed-point arithmetic)
  • Common cause failures (e.g. temperature, supply, clock, ...)
  • Multiple failures of components in the field
  • Errors during production

During the analysis, it is recommended to collect in a "parking lot" those failures that are brought up but do not belong to the scope of the FMEDA. These can then be dealt with in the corresponding analyses and design activities.

The analysis is always performed in relation to a specific safety goal. This defines which unsafe states must be avoided. It must also be specified which safe states should be reached in case of a failure. Also note that different safety goals are conceivable for the same circuit depending on the application context.

At the start of the analysis, the safety requirement and the scope of the faults analyzed should be defined in the report. These prerequisites have to be known to all participants, so that diverging assumptions are avoided. If an existing safety analysis is reused in a new project, it must first be checked whether these underlying assumptions still apply.

5. The Report is the Analysis

The FMEDA comprises more than the table in which each failure mode of each component is evaluated. For later reproducibility, a report should be created that covers the following points:

  • Reference (incl. version) of the analyzed documents (such as schematics and specification of the safety function)
  • References to the standards used (e.g. for the determination of the component failure rate)
  • Summary of the approach
  • Team that created the analysis
  • Summary of the safety achieved
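
The points above can also be kept as a small structured record next to the FMEDA table, so that an incomplete report is noticed early. The following is only a sketch. The class name, the field names and the example entries are assumptions and do not stem from any standard or tool.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class FmedaReportHeader:
    analyzed_documents: List[str] = field(default_factory=list)  # incl. version
    standards: List[str] = field(default_factory=list)           # e.g. failure rate source
    approach_summary: str = ""
    team: List[str] = field(default_factory=list)
    safety_summary: str = ""


def missing_items(header: FmedaReportHeader) -> List[str]:
    """Return the names of all report items that are still empty."""
    return [f.name for f in fields(header) if not getattr(header, f.name)]


# Hypothetical draft report with only two of the five items filled in.
draft = FmedaReportHeader(
    analyzed_documents=["Schematic output stage, rev. C"],
    team=["author", "reviewer"],
)
open_items = missing_items(draft)
print(open_items)
```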

See also the corresponding checkpoints in the Software Safety Analysis blog post.

Consider the target audience: The FMEDA table is a tool for the expert team. It is used for evaluation, for calculating metrics and for verifying that nothing has been forgotten. The report addresses project management (e.g. safety managers), auditors and possible clients/integrators who require proof and evidence that safety risks have been sufficiently mitigated.

What is your Experience?

What has your experience been in your project? Please let me know in the comments...

We are here to support you in your project:

  • We advise your project team from the definition of the safety concept to the creation of the FMEDA
  • As FMEA/FMEDA moderators, we lead and document workshops with your experts
  • We create or verify the FMEA/FMEDA for you, based on your schematics and safety requirements
  • We advise you on the definition of your FMEA/FMEDA process and the creation of the corresponding templates

Benefit from our SolceptClinic and send me your specific questions about FMEDA. And the best thing is: this first time-boxed consultation of 30 minutes costs you nothing.

Luzian Hürlimann (original article), the page is maintained by Andreas Stucki

Do you have additional questions? Do you have a different opinion? If so, email me or share your thoughts in a comment below!

Would you like to benefit from our knowledge? Visit our contact page!
