In this article, we will explore four key ways in which Maintainers and Maintenance Management can impact on equipment reliability, and offer some practical suggestions for how they can assure and improve reliability through better maintenance. This is the fourth article in a series of eight on Reliability Improvement.

1. Ensure that the right preventive maintenance program is in place.

The first key area where Maintenance can impact on equipment reliability is by having an appropriate preventive and predictive maintenance program in place.

To be effective in preventing unexpected failures, the preventive maintenance program needs to, from a technical perspective, successfully deal with the underlying causes of failure (which, for the purposes of this article, we will call failure modes). However, it also needs to successfully deal with the consequences of failure. These consequences could be safety, environmental or economic (e.g. lost production or additional failure costs) in nature.

Reliability Centered Maintenance (RCM) and its closely related cousin, PM Optimisation, both address the technical and the consequence-related aspects of specific failure modes, and apply structured decision-making logic to determine a sound preventive and predictive maintenance program. However, there is also a wide range of other approaches that could be taken, some more robust (and more time-consuming) than others. We discuss some of these approaches in our article “Alternative approaches for developing and optimising Preventive Maintenance”.

It is also worth noting that RCM considers only one failure mode at a time (apart from when considering “hidden” failures). This is one of its limitations, but also one of the features that makes it comparatively simple to apply once you understand the concepts. The general principle is that there should, ideally, be a single preventive task per failure mode and, if that task is a condition-based inspection, that a single inspection technique should be used at a frequency determined by the PF Interval. However, with the ever-increasing pervasiveness of real-time asset health monitoring via a range of on-line sensors, and the increasing capabilities of sophisticated big-data mining and predictive analytics technologies, these principles will not always be appropriate. We discuss this further in our related articles “The PF Interval – Is it Relevant in the world of Big Data?” and “Big Data, Predictive Analytics and Maintenance”.
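As a rough illustration of the conventional logic, the sketch below derives an inspection interval from an estimated PF Interval. The function name, the example figures and the default of fitting two inspections into the PF Interval are assumptions made for illustration; they are not taken from any particular RCM standard.

```python
def inspection_interval(pf_interval_days: float, inspections_within_pf: int = 2) -> float:
    """Return an inspection interval that fits the requested number of
    inspections inside the PF Interval (the time between a potential failure
    becoming detectable and the functional failure occurring)."""
    if pf_interval_days <= 0:
        raise ValueError("PF Interval must be positive")
    if inspections_within_pf < 1:
        raise ValueError("At least one inspection is needed within the PF Interval")
    return pf_interval_days / inspections_within_pf


# Hypothetical example: degradation becomes detectable about 60 days before failure.
print(inspection_interval(60))     # two inspections within the PF Interval
print(inspection_interval(60, 3))  # three inspections within the PF Interval
```

A fixed calculation of this kind assumes a single technique and a stable PF Interval, which is exactly the assumption that continuous on-line monitoring calls into question.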

If you are not careful, inappropriate preventive maintenance can actually reduce equipment reliability rather than improve it. Nowlan and Heap’s seminal research in the aviation industry identified that equipment can fail in accordance with one of six failure patterns, as illustrated below.

[Figure: the six failure patterns]

Consider those failures that comply with Failure Pattern F in the above diagram. Performing routine preventive maintenance such as a scheduled overhaul or replacement of those items would actually reintroduce early life failures into what was previously a stable system, and would, therefore, reduce the reliability of the system. Nowlan and Heap’s research indicated that, at the time of their research in the aviation industry in the 1960s, 68% of failure modes displayed this failure pattern. While the exact proportion of failure modes in your industry may be different to this, the general rule still applies: be careful when considering performing any routine “preventive” maintenance if the act of performing this maintenance could potentially introduce new causes of failure. This is particularly the case when the preventive maintenance task is intrusive, or complex, or spans multiple shifts.
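One way to see why this happens is to model such a failure mode with a Weibull distribution whose shape parameter is below 1, which gives a failure rate that falls with age (the early-life portion of Pattern F). The sketch below compares the chance of failing in the next 100 hours for a newly replaced item with that of an item that has already survived 5,000 hours. The shape and scale values are illustrative assumptions, not figures from Nowlan and Heap.

```python
import math


def weibull_survival(t: float, shape: float, scale: float) -> float:
    """Probability that an item survives beyond age t."""
    return math.exp(-((t / scale) ** shape))


def conditional_failure_probability(age: float, window: float,
                                    shape: float, scale: float) -> float:
    """Probability of failing within `window` hours, given survival to `age`."""
    survive_to_age = weibull_survival(age, shape, scale)
    survive_past_window = weibull_survival(age + window, shape, scale)
    return 1.0 - survive_past_window / survive_to_age


SHAPE = 0.5     # assumed: a shape below 1 means the failure rate decreases with age
SCALE = 10_000  # assumed characteristic life, in hours

new_item = conditional_failure_probability(0, 100, SHAPE, SCALE)
aged_item = conditional_failure_probability(5_000, 100, SHAPE, SCALE)

print(f"Newly replaced item: {new_item:.3f}")
print(f"Item left in service: {aged_item:.3f}")
```

With these assumed parameters the newly replaced item is considerably more likely to fail in its next 100 hours than the item left in service, so a scheduled replacement makes matters worse.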

We will discuss how to deal with maintenance-induced failures such as these later in this article.

2. Make sure that the preventive maintenance program is delivered on time and in full.

Of course, the best preventive maintenance program in the world is of little use if it is not actually delivered by the Maintenance execution team.

To avoid this happening, preventive maintenance tasks must be the number one priority for the maintenance execution team, and should, ideally, take precedence over all but the most important breakdown tasks. This is easy to say, but harder to do when the plant is unreliable and operations personnel are screaming in your ear to get the plant back up and running again. As the saying goes, “when you are up to your backside in alligators, it is hard to remember that you are there to drain the swamp”.

To avoid continually deferring (or even cancelling) routine Preventive Maintenance tasks, here are a few tips:

  • Make sure that you have a robust Maintenance Planning and Scheduling process in place; one which makes sure that all routine PMs are scheduled for completion at the right time, which ensures that any parts, tools, equipment or labour required in order to perform the PM are available at the scheduled time, and which treats PMs as the highest priority work.
  • Make sure that the Operations side of your business understands and appreciates the value of getting PM tasks done on time – or alternatively, fully understands the risks of not getting it done on time.
  • Establish a formal process for reviewing and approving all “break-in” work (work which was not originally scheduled to be done in the current scheduled week, but which is requested to be done in the current week) prior to it being performed. Break-in work should be subjected to due scrutiny, screening and priority setting (by both Maintenance and Operations management) in order to ensure that only truly urgent work is performed as “break-in” work. It goes without saying that work orders should be raised for all break-in work in order to ensure that jobs cannot bypass this process.
  • Monitor and report overdue PMs (and the amount of time that they are overdue) as a Key Performance Indicator. The reasons for PMs being overdue should be investigated, and appropriate action taken to minimise the chances of recurrence.
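The overdue PM indicator in the last tip can be sketched as a simple calculation over open work orders. The record layout, field names and sample data below are hypothetical; in practice the figures would come from your own maintenance management system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class PMWorkOrder:
    order_id: str
    due_date: date
    completed_date: Optional[date] = None  # None means the PM is still open


def overdue_pm_report(orders: List[PMWorkOrder], as_of: date) -> Dict[str, object]:
    """Summarise open PMs whose due date has passed as of the report date."""
    days_overdue = {
        order.order_id: (as_of - order.due_date).days
        for order in orders
        if order.completed_date is None and order.due_date < as_of
    }
    return {
        "overdue_count": len(days_overdue),
        "max_days_overdue": max(days_overdue.values(), default=0),
        "days_overdue_by_order": days_overdue,
    }


# Hypothetical data for illustration only.
orders = [
    PMWorkOrder("PM-001", date(2024, 3, 1), completed_date=date(2024, 3, 1)),
    PMWorkOrder("PM-002", date(2024, 3, 4)),
    PMWorkOrder("PM-003", date(2024, 3, 10)),
    PMWorkOrder("PM-004", date(2024, 3, 20)),
]
report = overdue_pm_report(orders, as_of=date(2024, 3, 15))
print(report)
```

Reporting the number of days overdue for each order, rather than only the count, makes it easier to see which PMs have been deferred repeatedly and warrant investigation.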

3. Assure maintenance quality.

A maintenance activity, whether preventive or corrective, is only as good as the quality with which the task is performed. The quality of maintenance work can be impacted by a number of factors, including:

  • The skill of the person performing the work
  • The quality of the direction/instructions given to the person performing the work
  • The availability of the correct tools, materials and other resources required to perform the work
  • Work arrangements and their impact on human factors, such as fatigue
  • The level of attention paid to the task by the person performing it

It should be noted that even the most highly skilled people can make errors on occasion. There are numerous physiological and psychological characteristics that contribute to human error, and these are shared by all people, not just an error-prone few.

We discussed Maintenance Error in more detail in our article “Managing Human Error in Maintenance”, but it is worth recapping some of the key elements here.

Maintenance error is more prevalent than many of us would like to believe – even in those industries where attention to detail in performing maintenance tasks is highly critical, and where the consequences of Maintenance error can be potentially fatal. For example, a Boeing study indicated that 80% of in-service engine shutdowns on their civil aviation aircraft were related to some form of maintenance error.

The types of actions that organisations can take to minimise the chances of maintenance error were detailed in Reason and Hobbs’ excellent book “Managing Maintenance Error” under the headings of:

Person measures

  • Provide training in error-provoking factors.
  • Implement measures to reduce the number of deliberate violations.
  • Encourage mental rehearsal of tasks before they are performed.
  • Control distractions.
  • Avoid place-losing errors through such techniques as inserting place-markers at appropriate points in the procedure.

Team measures

  • Provide teamwork training to ensure sound communications between individuals and between teams.

Workplace and task measures

  • Ensure that personnel only perform tasks when they are properly trained, skilled and qualified.
  • Proactively manage fatigue.
  • Assign tasks appropriately to manage the risks associated with both infrequently performed and very frequently performed tasks – these tend to be those at greatest risk of human error.
  • Ensure that equipment, and tasks, are properly designed for maintainability.
  • Enforce good housekeeping standards.
  • Ensure Spare Parts and Tools are managed well.
  • Write, and use, effective Maintenance Work Instructions.

Organisational measures

  • Put in place effective processes for analysing, and learning from, past failures.
  • Encourage a “Reporting Culture” where all failures, no matter how seemingly insignificant, are reported.
  • Put in place proactive processes for assessing the risk (likelihood and consequence) of future maintenance errors.

Ultimately, even putting both proactive and reactive measures in place will not guarantee the absence of human error, but together, these strengthen the organisation’s intrinsic resistance to human error.
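The proactive risk assessment listed under organisational measures is often carried out as a simple likelihood and consequence ranking. A minimal sketch follows, assuming 1 to 5 scales for both and a multiplicative score. The scales, the scoring rule and the task names are illustrative assumptions and are not taken from Reason and Hobbs.

```python
def risk_score(likelihood: int, consequence: int) -> int:
    """Score a potential maintenance error on assumed 1 to 5 scales."""
    for name, value in (("likelihood", likelihood), ("consequence", consequence)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5")
    return likelihood * consequence


# Hypothetical tasks: (likelihood of an error, consequence of that error)
tasks = {
    "Refit pump coupling guard": (2, 2),
    "Reassemble gearbox after overhaul": (3, 5),
    "Replace filter element": (4, 1),
}

# Rank tasks so that the highest-risk ones are reviewed first.
ranked = sorted(tasks, key=lambda task: risk_score(*tasks[task]), reverse=True)
for task in ranked:
    print(task, risk_score(*tasks[task]))
```

Tasks at the top of the ranking are the natural candidates for additional controls such as independent inspection or more detailed work instructions.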

4. Encourage maintenance precision.

In the previous article in this series, “Reliability in Operations”, we outlined the Ledet/DuPont model for operating excellence, as shown below.

[Figure: Ledet/DuPont model for operating excellence (performance measures)]

In that article, we outlined some of the things that operators can do in the Proactive domain to improve precision and, as a result, improve equipment reliability. There are equally many things that the maintenance function can do in this area. However, fundamental to achieving this is the requirement for organisations to formally state, and document, their quality standards when it comes to precision maintenance. Without this, maintainers will apply their own personal standards which may, or may not, be adequate to ensure a high level of quality and precision in the performance of maintenance activities.

Some of the ways in which maintenance can ensure a high level of precision in the performance of maintenance activities, and therefore improve equipment reliability, include:

  • Specifying acceptable tolerances (and the methods for achieving them) for alignment of rotating equipment.
  • Specifying acceptable tolerances (and the methods for achieving them) for balancing rotating equipment.
  • Ensuring that baseline vibration analysis is undertaken after any work on rotating equipment to check that work has been performed with acceptable precision.
  • Specifying cleanliness standards for all lubricants and the methods for ensuring that these cleanliness standards are achieved.
  • Specifying and enforcing lubrication practices that minimise the chances of lubricant contamination.
  • Specifying the torque required when fitting or tightening critical fasteners, and the tools and methods to be used when fitting these fasteners.
  • Specifying and enforcing the methods for ensuring accurate fits and tolerances when equipment is running at normal operating temperatures.

There is strong evidence of a close correlation between the adoption of precision maintenance standards and practices and lower maintenance costs, particularly for rotating equipment. By extension, it is logical to expect an associated improvement in equipment reliability.


If you would like to receive early notification of publication of future articles, sign up for our newsletter at the top left of this page now. In the meantime, if you would like assistance in establishing effective reliability in operations within your organisation, please contact me; I would be delighted to try to assist you.
