One example that I continue to see causing more problems than it solves is the use of the KPI known as Mean Time To Repair (MTTR). When this metric is used alone, in immature organizations, or without an understanding of its unintended consequences, it can drive your organization in the opposite direction of world-class performance. If your organization is immature from a reliability culture standpoint and you choose MTTR as your focus, then you set yourself up to become a very reactive organization that is simply very quick to respond to failures. The facts are:
- Reactive response is at least 5 times more expensive than planned and scheduled work.
- Operations will beat on you to get faster and faster at responding so that you lower MTTR.
- Rushed repairs are less reliable.
- Reactive response requires more expensive spare parts stock.
- Repeated failures and repairs increase the chance of introducing infant mortality failures.
- You will find yourself with highly skilled maintenance technicians just standing on the manufacturing floor doing nothing while waiting for a failure to occur.
- Pressure to make the repair as quickly as possible can lead people to take elevated safety risks, whether intentionally or unintentionally.
Picture it this way: if MTTR is all you have, then your organization will create tools like crash carts and quick response teams instead of using tools like Root Cause Analysis (RCA) to understand and eliminate the recurring problem. From the real world, I have seen bearing quick-change carts developed to speed up repairs of recurring failures where, if they had just tensioned the belts properly, the failures would have been eliminated. Five really fast 2-hour bearing replacements are still much worse than bearings that don't need repairs at all. This site needed to understand better, not respond quicker.
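To put rough numbers on the bearing example, here is a minimal sketch comparing MTTR with total downtime. The five 2-hour replacements come from the story above; the single slower 4-hour repair in the second scenario is a hypothetical figure chosen only for illustration, and `summarize` is an illustrative helper name.

```python
def summarize(repair_hours):
    """Return (MTTR, total downtime) in hours for a list of repair durations."""
    total = sum(repair_hours)
    # MTTR is the average repair time; define it as 0 when nothing failed.
    mttr = total / len(repair_hours) if repair_hours else 0.0
    return mttr, total


# Scenario A: five quick 2-hour bearing replacements (from the example above).
fast_response = [2.0] * 5
# Scenario B (hypothetical): one slower 4-hour repair that also fixes the
# belt tension, so the failure does not come back.
root_cause_fixed = [4.0]

for label, repairs in [("fast response", fast_response),
                       ("root cause fixed", root_cause_fixed)]:
    mttr, downtime = summarize(repairs)
    print(f"{label}: MTTR = {mttr:.1f} h, total downtime = {downtime:.1f} h")
```

Judged on MTTR alone, the fast-response scenario looks better even though it carries more total downtime, which is exactly the behavior the metric rewards when it is used by itself.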
Are your metrics driving the wrong behaviors? Are you using them at the right time?
Tell us what metrics have not worked for you and why in the comments below.
I would add that MTTR doesn't really address the variability of even a normal job. Variability is part of the normal process; sometimes one bolt will simply take longer to remove. Ignoring that variability by reducing it to a single average will lead to the unintended consequences you have listed. Instead, use the lognormal distribution to describe the distribution of repair times; it is much more informative.
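As a rough sketch of that idea, the snippet below fits a lognormal distribution to a set of repair times by taking the mean and standard deviation of their logarithms, then reports the median and 90th percentile alongside the plain average. The repair times are made-up values for illustration, and the helper names `fit_lognormal` and `lognormal_percentile` are assumptions, not part of any standard tool.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal(repair_hours):
    """Estimate lognormal parameters (mu, sigma) from positive repair times."""
    logs = [math.log(h) for h in repair_hours]
    return mean(logs), stdev(logs)


def lognormal_percentile(mu, sigma, p):
    """Repair time below which a fraction p of jobs is expected to fall."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


# Hypothetical repair durations in hours: mostly routine, one stubborn job.
repair_hours = [1.5, 1.8, 2.0, 2.1, 2.4, 2.6, 3.0, 3.5, 4.8, 9.0]

mu, sigma = fit_lognormal(repair_hours)
median = lognormal_percentile(mu, sigma, 0.5)
p90 = lognormal_percentile(mu, sigma, 0.9)

print(f"MTTR (plain average): {mean(repair_hours):.2f} h")
print(f"Lognormal median:     {median:.2f} h")
print(f"Lognormal 90th pct:   {p90:.2f} h")
```

Reporting a median and an upper percentile shows both the typical job and how long the occasional difficult one can run, which a single average hides.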