August 13, 2020, 12:08 pm

Managing Human Error in Maintenance

Human Error is Inevitable

When engineers consider the traditional approach to dealing with maintenance error, most think along two lines – discipline, counsel or train the individual(s) involved, and/or write a new procedure or work instruction to make sure that it doesn't happen again. Unfortunately, research and experience from behavioural psychologists indicate that neither of these approaches is likely to succeed in eliminating maintenance error.

Work by Reason and Hobbs(vi) explains why maintenance activities can be particularly error-provoking. In particular, it argues that trying to change the human condition is futile, and that a more effective way of managing maintenance error is to treat errors as a normal, expected and foreseeable aspect of maintenance work, and therefore to manage them by changing the conditions under which that work is carried out.

Reason and Hobbs identified a number of physiological and psychological factors which contribute to the inevitability of human error.  These include:

  • Differences between the capabilities of our long-term memory and our conscious workspace.  In particular, what we call “attention” is closely linked with the activities of the conscious workspace, and the conscious workspace has extremely limited capabilities including:
    • Attention is an extremely limited commodity – if it is drawn to one thing, then it is, by necessity, withdrawn from other competing concerns
    • These capacity limits give attention its selective properties – we can only attend to a very small proportion of the total available sensory data we receive
    • Unrelated matters can capture attention – such as preoccupation with other sensory or emotional demands
    • Attentional focus (concentration) is hard to maintain for any more than a few seconds
    • The ability to concentrate depends strongly on the intrinsic interest of the current object of attention
    • The more skilled or habitual our actions, the less attention they demand
    • Correct performance requires the right balance of attention, neither too much nor too little.
  • The Vigilance Decrement – inspectors become more likely to miss obvious faults the longer they have been performing the inspection.  This is particularly the case when “hits” are few and far between.
  • The impact of fatigue – this could be due to:
    • Time of day effects – our daily rhythms ensure that we are more likely to commit errors in the small hours of the morning
    • Stresses – physical, social, drugs, pace of work, personal factors
  • The level of arousal – too much or too little arousal impairs work performance
  • Biases in thinking and decision making.  There is no such thing as “common sense”.  In particular we are subject to:
    • Confirmation Bias – where we seek information that confirms our initial (and often incorrect) diagnosis of a problem
    • Emotional Decision Making – if a situation keeps frustrating us, then we tend to move into “aggressive” mode, but this often clouds our better judgement

As a result of these contributing factors, the types of errors that occur most often in Maintenance include:

  • Recognition failures – these include:
    • Misidentification of objects, signals and messages, and
    • Non-detection of problem states
  • Memory failures – these include:
    • Input failure – insufficient attention is paid to the to-be-remembered item.  This in turn can include:
      • Losing our place in a series of actions
      • The “time-gap” experience
    • Storage failure – remembered material decays or suffers interference.  Most common in maintenance is the problem of forgetting the intention to do something
    • Output failure – things we know cannot be recalled at the required time – the “what’s his name?” experience
    • Omissions following interruptions – we rejoin a sequence of actions having omitted certain required steps
    • Premature exits – we terminate a job before all the actions are complete
  • Skill-based slips.  Generally associated with “automatic” routines, these can include:
    • Branching errors – such as intending to drive to the golf course on a weekend, but missing the turnoff, and continuing on towards the office as you would every other day of the week
    • Overshoot errors – intending to stop at the shops on the way home, but forgetting and continuing home without stopping
  • Rule-based Mistakes.  Most maintenance work is highly proceduralised, and consists of many “rules”. These can be formally written, or exist only in people’s heads.  Typical rule-based errors include:
    • Misapplying a good rule – using a rule in a situation where it is not appropriate
    • Applying a bad rule – the rule may get the job done in certain situations, but can have unwanted consequences.  This is most common when people pick up others’ “bad habits”.
  • Knowledge-based errors.  These generally arise when someone is performing an unusual task for the first time.  They are not necessarily committed by inexperienced personnel.
  • Violations – deliberate acts which violate procedures.  These can be:
    • Routine violations – committed in order to avoid unnecessary effort, get the job done quickly, to demonstrate skill, or avoid what is seen as an unnecessarily laborious procedure
    • Thrill-seeking violations – often committed in order to avoid boredom, or win peer praise
    • Situational violations – those committed because it is not possible to get the job done if procedures are strictly adhered to.

Think of your own situation – have you never committed an error?  For most of us, the consequences of our past errors have been relatively minor – but that is largely due to luck, and to the situation that we were in at the time.  The traditional approach to dealing with human error – counselling and/or writing a procedure – cannot possibly deal effectively with all of the types of errors listed above.  We need a more holistic approach to managing maintenance error and assuring Maintenance Quality.

Avoid Unnecessary “Preventive” Maintenance

Given the statistics mentioned earlier from Nowlan and Heap’s work, and others, it is clear that over-maintaining equipment is not only a waste of time and money, but also increases the risk of safety and environmental incidents, and can cause expensive and unnecessary failures.

Techniques based on the application of Reliability Centred Maintenance principles are an extremely effective way of weeding out this unnecessary maintenance, and streamlining and optimising equipment PM programs.

Our analysis of PM programs in place at our clients has indicated that in almost all organisations a huge amount of unnecessary routine maintenance is being performed.  In some situations, fewer than 10% of the existing PM tasks were optimal, and it is not unusual for us to find that as much as half of the routine maintenance activities were, at best, a complete waste of time. In many cases, some of these “preventive” maintenance activities were potentially causing equipment failures, rather than preventing them – particularly where they involved intrusive, fixed interval inspections and overhauls.  At one major offshore oil and gas platform in Western Australia, a comprehensive review of the Preventive Maintenance program led to a 25% reduction in the amount of routine PM being performed.  It also led to a 25% reduction in the amount of Corrective maintenance being performed.  Clearly, in this case, a fair proportion of the PM that had previously been performed was actually causing, rather than preventing, failures.
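
To make the arithmetic of the offshore platform example concrete, the short Python sketch below applies a 25% reduction to both routine PM and Corrective maintenance workloads. The baseline hours are purely hypothetical (the article does not give them), and the function name `apply_reduction` is our own invention for illustration. It shows only that when both categories fall by the same fraction, the total maintenance workload falls by that fraction too, whatever the original mix of PM and Corrective work.

```python
def apply_reduction(baseline_hours: float, reduction_fraction: float) -> float:
    """Return the hours remaining after cutting baseline_hours by reduction_fraction."""
    if not 0.0 <= reduction_fraction <= 1.0:
        raise ValueError("reduction_fraction must be between 0 and 1")
    return baseline_hours * (1.0 - reduction_fraction)


# ASSUMPTION: hypothetical annual baseline hours, not taken from the article.
baseline_pm_hours = 4000.0   # routine Preventive Maintenance
baseline_cm_hours = 6000.0   # Corrective maintenance

# The 25% figure is the one quoted for the offshore platform review.
reduction = 0.25

pm_after = apply_reduction(baseline_pm_hours, reduction)
cm_after = apply_reduction(baseline_cm_hours, reduction)

total_before = baseline_pm_hours + baseline_cm_hours
total_after = pm_after + cm_after
total_saving_fraction = (total_before - total_after) / total_before

print(f"PM hours:         {baseline_pm_hours:.0f} -> {pm_after:.0f}")
print(f"Corrective hours: {baseline_cm_hours:.0f} -> {cm_after:.0f}")
print(f"Total saving:     {total_saving_fraction:.0%}")
```

With different hypothetical baselines the hours change, but the overall saving stays at 25% as long as both categories are reduced by the same fraction.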

The starting point in eliminating unnecessary routine maintenance lies in ensuring that the need for all these routine maintenance tasks is defensibly justified.  This is the objective of Assetivity’s Rapid Equipment Strategy Development process. This process is based on RCM principles and has ten steps as outlined below.

  1. Determine Scope of Analysis
  2. Verify Equipment Capability
  3. Identify Failure Modes
  4. Analyse Failure Modes, Effects and Consequences
  5. Select Recommended Maintenance Tasks
  6. Identify Additional Improvement Tasks
  7. Consolidate Schedules and Integrate with Operational Strategies
  8. Gain Approval and Implement Recommended Actions
  9. Track Success
  10. Beyond RCM and PMO

A detailed description of this process is beyond the scope of this paper. We would strongly suggest, however, that a critical review of your PM program, if you have not already carried one out, is an essential first step in managing the impact of human error in maintenance.
