Tuesday, November 21, 2023

Warning: major reliability roadblock ahead


I want to warn you of a major issue that may plague your reliability improvement effort. It has to do with doing potentially the right thing for the wrong reason. That sounds harmless, but it leads to cherry-picking some fundamental elements while avoiding other critical pieces. It also leads to premature celebration and false confidence.

Let me give you a few examples. If your maintenance supervisor or operations leader only embraces reliability topics like craft skills training, spare parts stocking, storeroom improvement, and the hiring of maintenance technicians, then they may be doing so to improve their reactive maintenance strategy. Why is that bad? It prevents the transition to proactive maintenance and reliability, where the cost of maintenance can decrease, reliability can increase, and safety can excel, all of which drive higher profitability.

Let’s look at these three potentially good things a bit closer. Why does this individual support craft skills training? Because they want quicker troubleshooting when the line fails. Why do they support more focus on the storeroom? Because they want parts at the window faster when reacting to a downtime event. Why do they support having more maintenance people? Because they want one standing by every piece of equipment to address any upset condition immediately. As a bonus, you may note they often ask questions like “how long to get it running?” and not “what can we do to prevent this from happening again?”

All of these are reactive, expensive, and regressive. You can never dig your way out of the hole. It’s a death spiral that leads to high production cost and general uncompetitiveness. It may even lead to the loss of your job.

In these locations you may also see that certain proactive drivers, like populating the work order in the maintenance management system, doing true root cause investigations, really planning repairs, or completing a thorough failure-mechanism-based preventive maintenance strategy, are not valued because they do not improve the reactive maintenance process. It becomes a world focused on Mean Time To Repair (MTTR) and not Mean Time Between Failures (MTBF). It’s a real dumpster fire.
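To make the MTTR versus MTBF point concrete, here is a minimal Python sketch. It assumes the common definitions (MTTR as repair hours per failure, MTBF as operating hours per failure). The `Failure` record, the 720-hour month, and both sets of numbers are made up for illustration and do not come from any real plant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Failure:
    """One unplanned downtime event (hypothetical record layout)."""
    hours_down: float  # hours from the failure until the line ran again


def total_downtime(failures):
    """Sum of downtime hours across all failure events."""
    return sum(f.hours_down for f in failures)


def mttr(failures):
    """Mean Time To Repair: repair hours divided by number of failures."""
    if not failures:
        raise ValueError("MTTR is undefined with zero failures")
    return total_downtime(failures) / len(failures)


def mtbf(period_hours, failures):
    """Mean Time Between Failures: operating hours divided by number of failures."""
    if not failures:
        raise ValueError("MTBF is undefined with zero failures")
    return (period_hours - total_downtime(failures)) / len(failures)


PERIOD_HOURS = 720.0  # assumed: one month of scheduled run time

# Made-up data: a reactive site that repairs fast but fails often,
# and a proactive site that repairs slower but rarely fails.
reactive = [Failure(2.0)] * 12
proactive = [Failure(4.0)] * 2

for name, events in (("reactive", reactive), ("proactive", proactive)):
    print(name, mttr(events), mtbf(PERIOD_HOURS, events), total_downtime(events))
```

With these invented numbers the reactive site has the better MTTR (2 hours versus 4) yet loses three times the production time (24 hours versus 8), and its MTBF is 58 hours against 356. Chasing MTTR alone would rank the wrong site first.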

How do you fix it? You have to change the lenses that they use to look at the world. It is not easy, but you have to show them the proactive lens and not the reactive one. Show them a proactive universe where real job plans reduce infant mortality, reduce required maintenance skill levels, and speed up repairs, in turn reducing required labor and equipment downtime. Through this proactive lens they should see a place where consistently executed, failure-mechanism-based preventive maintenance limits unplanned breakdowns and emergency repairs, reducing production risk. This is a true cultural shift and a huge step in the right direction for your reliability improvement initiative.

You have to paint the picture and remind them to take off the reactive glasses and put on the proactive ones as they make decisions about how they address plant operations. You will need to do it many times until they can keep those proactive glasses on their face as their new perspective. You might even want to get real proactive and send them this blog.  Call me if you need me and good luck! 

Thursday, September 21, 2023

Confusion Around Leading and Lagging Metrics

 


The confusion with leading and lagging metrics stems from the fact that we cannot truly make a list of metrics that fall into each category. Some people think you can, but it depends on your perspective. What is a leading Key Performance Indicator (KPI) to one level of the organization can be a lagging KPI to another. That's right, many metrics or KPIs are actually both leading and lagging. In the operations and maintenance world, preventive maintenance (PM) completion, which measures the percent of scheduled PM tasks done when scheduled, is a good example of this confusing element. If you are an operations leader, you might call it a leading indicator of equipment reliability. In other words, if I am doing my PMs on time, then I should expect reliable equipment. But if I am a Maintenance Supervisor, then it is a lagging metric or KPI that shows that we did the work that was scheduled.
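For anyone who wants to see the PM completion arithmetic spelled out, here is a small Python sketch. The function name and the weekly counts are hypothetical; the calculation is just the percent of scheduled PM tasks done when scheduled, as described above.

```python
def pm_completion(scheduled, completed_on_time):
    """Percent of scheduled PM tasks that were done when scheduled."""
    if scheduled <= 0:
        raise ValueError("no PM tasks were scheduled in this period")
    if not 0 <= completed_on_time <= scheduled:
        raise ValueError("completed_on_time must be between 0 and scheduled")
    return 100.0 * completed_on_time / scheduled


# Made-up week: 40 PM tasks scheduled, 34 completed on schedule.
week = pm_completion(scheduled=40, completed_on_time=34)
print(f"PM completion: {week:.1f}%")  # prints "PM completion: 85.0%"
```

The same 85% is read two ways. The Maintenance Supervisor sees a result of last week's scheduled work, while the operations leader sees an early signal about future equipment reliability.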

As you are selecting metrics, another way you can look at them is by using "Behavior and Result." Does this metric drive a behavior, or does it measure a result? My opinion is that if you want to drive behavioral change, you want to measure things that drive that specific change. You may find that you measure them often, maybe even daily or weekly, whereas you measure result-type metrics monthly or less often to make sure you are getting what you want from the process.

What are your thoughts?

Friday, August 25, 2023

Some of the confusion around FMEAs (Failure Modes and Effects Analysis)

Do you struggle with FMEAs or FMECAs? Here are some thoughts that might help add clarity. If you disagree, please reach out and let's discuss and learn together.

Function: In the context of FMEA, a function is the intended operation or performance of an item or system, typically defined by its performance standards or specifications. Essentially, it's the reason the item exists or what it is designed to do.

 


Example: In the case of a car brake system, one of its functions could be "to decelerate the vehicle to a stop when the brake pedal is applied".

 

Functional Failure: A functional failure is the inability of a system or component to perform its intended function. It is the state where the function of the system or product is lost or diminished.

 

Example: Following the above example, a functional failure of the car brake system could be "the vehicle does not decelerate to a stop when the brake pedal is applied".

 

Failure Mode: A failure mode is the specific way in which an asset or a system fails or ceases to perform its intended function. It's the end result or manifestation of a failure.

 

Example: A failure mode of the car brake system could be "brake fluid leakage".

Failure Mechanism: Failure mechanism is the "how" or the process that leads to the failure mode. It's the physical, chemical, or other processes that result in failure.

 

Example: A failure mechanism for the car brake system, leading to the failure mode of "brake fluid leakage", could be "corrosion of the brake lines", which over time weakens the lines and allows brake fluid to escape.
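One way to keep these four terms straight is to write them down as a single record. The sketch below is only an illustration in Python. The `FmeaLine` class and its field names are my own invention, not a standard FMEA worksheet layout, and the values are the car brake examples from above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FmeaLine:
    """One line of an FMEA worksheet (hypothetical, simplified layout)."""
    function: str            # what the item is designed to do
    functional_failure: str  # the function lost or diminished
    failure_mode: str        # the specific way the item fails
    failure_mechanism: str   # the physical or chemical "how" behind the mode


brake_line = FmeaLine(
    function="Decelerate the vehicle to a stop when the brake pedal is applied",
    functional_failure="Vehicle does not decelerate to a stop when the brake pedal is applied",
    failure_mode="Brake fluid leakage",
    failure_mechanism="Corrosion of the brake lines",
)


def causal_chain(line):
    """Read the line from cause to consequence: mechanism, mode, functional failure."""
    return " -> ".join(
        [line.failure_mechanism, line.failure_mode, line.functional_failure]
    )


print(causal_chain(brake_line))
```

Reading the record from mechanism to functional failure gives the story of the failure, and reading it the other way gives the order in which the analysis is usually filled in.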

 
Functions are applied at the asset level.

Supporting or secondary functions can be at the asset level but are typically at the subsystem level.

Failure modes, effects, and the Risk Priority Number (RPN) are assigned at the component level.

The RPN would then be driven as follows: the functional failure would drive the majority of the Severity and Occurrence scores, and the failure mechanism would drive the Detection score.
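As a quick illustration, the RPN is conventionally the product of the Severity, Occurrence, and Detection scores. The Python sketch below assumes the common 1 to 10 scale for each score, which your site standard may define differently, and the scores for the brake example are made up.

```python
def rpn(severity, occurrence, detection):
    """Risk Priority Number: Severity x Occurrence x Detection.

    Assumes each score is an integer from 1 to 10 (a common convention,
    not a universal rule).
    """
    scores = {"severity": severity, "occurrence": occurrence, "detection": detection}
    for name, score in scores.items():
        if not 1 <= score <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {score}")
    return severity * occurrence * detection


# Made-up scores for "brake fluid leakage" caused by corroded brake lines.
# Severity and Occurrence come from the functional failure (loss of braking),
# Detection from how hard the corrosion mechanism is to find before failure.
brake_rpn = rpn(severity=9, occurrence=3, detection=6)
print(brake_rpn)  # prints 162
```

A lower Detection score from a better inspection task for the corrosion mechanism would lower the RPN without changing how severe a loss of braking is.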

Failure mechanisms would represent the root causes at a physical or human level and thus the corrective action or inspection task should be driven by the failure mechanism.

 

Failure Effects are how the failure may make itself known to a human.

 

In the case that a failure mode does not drive a failure effect, then there likely is not an associated functional failure.