November 18, 2018, 2:24 pm

Managing Human Error in Maintenance

Numerous research studies have shown that over 50% of all equipment fails prematurely after maintenance work has been performed on it. In the most embarrassing cases, the maintenance work performed was intended to prevent the very failures that occurred. Building on the latest academic research, and based on practical experience, this paper outlines the key things that maintenance managers can do to reduce or eliminate the impact of human error in maintenance.

The key points that will be covered include:

  • Human error is inevitable - we ignore it at our peril
  • The role of an optimum PM program in minimising the impact of human error
  • Maintenance Quality Management - essential elements for managing maintenance error

Introduction

In their ground-breaking work that led to the establishment of the technique that we now know as Reliability Centred Maintenance, Nowlan and Heap(i) analysed the failures of hundreds of mechanical, structural and electrical aircraft components, and found that these failures followed six distinct patterns, as illustrated below.

[Figure: the six failure patterns]

The interesting finding, in the context of this paper, is that more than two-thirds of all components demonstrated early-life failure. It has been estimated that maintenance errors ranked second only to controlled flight into terrain accidents in causing onboard aircraft fatalities between 1982 and 1991 (despite the application of RCM techniques in the airline industry during this period).(ii)

A study of coal-fired power stations indicated that 56% of forced outages occur less than a week after a planned or maintenance shutdown.(iii)

Other studies confirm these findings but, until recently, little research had investigated the reasons for them. Several plausible theories have been proposed; explanations that I have heard include:

  • “Human Error” – the repair/replace task was not successfully completed due to a lack of knowledge or skill on the part of the person performing the repair.
  • “System Error” – the equipment was returned to service after a high-risk maintenance task without the repair having been properly inspected/tested.
  • “Design Error” – the capability of the component being replaced is too close to the performance expected of it, and therefore lower capability (quality) parts fail during periods of high performance demand. The remaining higher capability (quality) parts are capable of withstanding all performance demands placed on them. This could be envisaged in the following graph:

[Figure: Design Error]

  • “Parts Error” – the incorrect part or an inferior quality part has been supplied.
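The “Design Error” explanation above can be illustrated with a small simulation. This is only a sketch under assumed figures: the capability and demand distributions, the number of parts and periods, and the function name are all hypothetical and do not come from any of the studies cited. Each part is given a fixed capability, and in every period it fails if the demand it sees exceeds that capability.

```python
import random


def simulate_design_error(n_parts=1000, n_periods=50, seed=1):
    """Count failures per period when part capability overlaps with demand."""
    rng = random.Random(seed)
    # Hypothetical figures, in arbitrary units: part capability varies
    # around 110, while demand in any period varies around 100.
    capabilities = [rng.gauss(110, 10) for _ in range(n_parts)]
    surviving = list(capabilities)
    failures_per_period = []
    for _ in range(n_periods):
        still_running = []
        for capability in surviving:
            demand = rng.gauss(100, 5)  # demand seen by this part this period
            if demand <= capability:
                still_running.append(capability)
        failures_per_period.append(len(surviving) - len(still_running))
        surviving = still_running
    return capabilities, surviving, failures_per_period


capabilities, surviving, failures = simulate_design_error()
early_failures = sum(failures[:5])   # failures in the first five periods
late_failures = sum(failures[-5:])   # failures in the last five periods
mean_all = sum(capabilities) / len(capabilities)
mean_surviving = sum(surviving) / len(surviving)

print("Failures in first 5 periods:", early_failures)
print("Failures in last 5 periods: ", late_failures)
print("Mean capability, all parts: ", round(mean_all, 1))
print("Mean capability, survivors: ", round(mean_surviving, 1))
```

Under these assumptions, most failures occur in the first few periods, and the parts still running at the end have a higher average capability than the original population. This is the early-life failure pattern that the “Design Error” theory describes.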

More recently, James Reason(iv) has compiled a table summarising the results of three surveys – two performed by the Institute of Nuclear Power Operations (INPO) in the USA, and one by the Central Research Institute for the Electrical Power Industry (CRIEPI) in Japan.  In all three of these studies, more than half of all identified performance problems were associated with maintenance, calibration and testing activities.  In comparison, on average only 16% of problems occurred while these power stations were operating under normal conditions.

Reason also quoted the results of a Boeing study(v) which indicated that the top seven causes of in-flight engine shutdowns (IFSDs) in Boeing aircraft were as follows:

  • Incomplete installation (33%)
  • Damaged on installation (14.5%)
  • Improper installation (11%)
  • Equipment not installed or missing (11%)
  • Foreign Object Damage (6.5%)
  • Improper fault isolation, inspection, test (6%)
  • Equipment not activated or deactivated (4%)

We can see from this that only one of these causes was unrelated to maintenance activities, and that maintenance activities contributed to roughly 80% of all IFSDs.
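The arithmetic behind this figure can be checked with a short sketch. It assumes that Foreign Object Damage is the one cause counted as unrelated to maintenance, which the text does not state explicitly; the variable and function names are illustrative only.

```python
# Percentages as quoted in the list of the top seven causes of IFSDs.
ifsd_causes = {
    "Incomplete installation": 33.0,
    "Damaged on installation": 14.5,
    "Improper installation": 11.0,
    "Equipment not installed or missing": 11.0,
    "Foreign Object Damage": 6.5,
    "Improper fault isolation, inspection, test": 6.0,
    "Equipment not activated or deactivated": 4.0,
}

# Assumption: this is the single cause treated as unrelated to maintenance.
NOT_MAINTENANCE = {"Foreign Object Damage"}


def maintenance_share(causes, excluded):
    """Sum the percentages of every cause not in the excluded set."""
    return sum(pct for name, pct in causes.items() if name not in excluded)


total_top_seven = sum(ifsd_causes.values())
maintenance_related = maintenance_share(ifsd_causes, NOT_MAINTENANCE)

print(f"Top seven causes account for {total_top_seven}% of IFSDs")
print(f"Maintenance-related causes account for {maintenance_related}% of IFSDs")
```

On that assumption, the top seven causes account for 86% of all IFSDs, and the six maintenance-related causes sum to 79.5%, or about 80%.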

If poor quality maintenance causes so many incidents in highly regulated and hazardous industries such as Nuclear Power Generation and Civil Aviation, what proportion of failures might maintenance be causing within your organisation?

What are the outcomes of maintenance-induced failures? Clearly, depending on the industry in which you operate, there are potentially significant safety and environmental risks. There is a long list of catastrophic failures in which the inadequate performance of a maintenance task played a significant role. Some of these include:

  • Flixborough
  • Three Mile Island
  • Piper Alpha
  • American Airlines Flight 191
  • Bhopal
  • Japan Airlines Flight 123
  • Clapham Junction
  • Etc. etc.

But besides the obvious safety risks, perhaps the bigger consequences are economic.  General Electric has estimated that each in-flight engine shutdown costs airlines in the region of US$500,000.  What could maintenance-induced failures be costing your organisation?

Clearly, we need to do something to reduce the number of equipment failures that are being caused, not prevented, by maintenance.  This paper suggests that the most appropriate approach is:

  • Admit that human error is inevitable (even in Maintenance!) and design our systems and processes around this inevitability
  • Use appropriate tools to ensure that we are not over-maintaining plant and equipment (and thereby increasing the risk that this work is performed incorrectly), and
  • Work to improve the quality with which maintenance activities are performed – including error-proofing where possible.
