Tuesday, February 28, 2012

Maintenance and Reliability Engineers: How Are They Different, and How Many Should I Have?

Another common question that manufacturing maintenance folks ask when designing a new reliability organization is how the Maintenance Engineer (ME) differs from the Reliability Engineer (RE), and how many of each they should have. We could dive into the details of the accepted responsibilities of each, but I have a nine-page job role comparison covering planners, REs, and MEs that provides that detail. If you would like a copy, send me an email.
Let's look at it at a much higher level. One simplified way to think about it is that an RE is focused on mean time between failures (MTBF) and an ME is focused on mean time to repair (MTTR). To say it differently, the RE is focused on preventing failures from happening through statistical analysis of our history, and the ME is focused on ensuring that if a repair is required it is as efficient and effective as possible, so as to prevent a recurring failure or an excessively long repair. Now I know we could argue the details here, but what I want is an image that we can then use to think about the number your site needs.
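To make the two measures concrete, here is a minimal sketch of how each could be calculated from a work order history. The failure log and its numbers are invented for illustration, and the function names are my own, not part of any particular CMMS.

```python
# Hypothetical failure log for one asset:
# (hours run before the failure, hours spent on the repair).
# The values are invented for illustration only.
failure_log = [
    (400.0, 4.0),
    (350.0, 6.0),
    (450.0, 2.0),
]


def mtbf(log):
    """Mean time between failures: total operating hours / number of failures."""
    return sum(run_hours for run_hours, _ in log) / len(log)


def mttr(log):
    """Mean time to repair: total repair hours / number of repairs."""
    return sum(repair_hours for _, repair_hours in log) / len(log)


print(f"MTBF: {mtbf(failure_log):.1f} h")  # the RE works to push this up
print(f"MTTR: {mttr(failure_log):.1f} h")  # the ME works to push this down
```

With this sample log the asset averages 400 hours between failures and 4 hours per repair.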
Our research says that you need one maintenance or reliability engineer for every thirty-five craftspeople. The question now becomes: how mature is your site? If you have great planners, a wonderfully refined job plan library, and an ingrained root cause culture, you may find yourself in a world where you need more REs to complete advanced analysis of your existing data in order to further reduce your maintenance cost without decreasing reliability. On the other hand, if you are like many of my friends in the industry, you may be a bit reactive and have only recently added a planning department, have no work order history, and don’t use the principles of precision maintenance. If this is the case for you, then you may need more ME support to get the foundation in place before you are ready to add more REs.
This actually works quite well because many find that it is a good promotional transition to go from ME to RE to leadership roles if the engineer would like to have that promotional path. Not all engineers will want to do this but the ones that do will have that path available.
So to provide an example, if you have 100 maintenance crafts:
  • Reactive maturity staffing could be 2 MEs and 1 RE.
  • As you mature, the staffing might transition to 1 ME and 2 REs.
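The arithmetic behind that example can be sketched as follows. The one-per-thirty-five ratio comes from the text above. Rounding to the nearest whole engineer and giving roughly two thirds of the headcount to the favored role are my assumptions, chosen only because they reproduce the 100-craft example, and the function names are hypothetical.

```python
def engineers_needed(crafts, crafts_per_engineer=35):
    """Total MEs plus REs, rounded to the nearest whole person (assumed rounding)."""
    return max(1, round(crafts / crafts_per_engineer))


def staffing(crafts, mature):
    """Split the engineers between ME and RE roles.

    Assumption: about two thirds of the engineers go to the favored role,
    REs at a mature site and MEs at a reactive one.
    """
    total = engineers_needed(crafts)
    larger = -(-(2 * total) // 3)  # ceiling of two thirds of the total
    smaller = total - larger
    if mature:
        return {"ME": smaller, "RE": larger}
    return {"ME": larger, "RE": smaller}


print(staffing(100, mature=False))  # {'ME': 2, 'RE': 1}
print(staffing(100, mature=True))   # {'ME': 1, 'RE': 2}
```

Treat the split as a starting point for discussion rather than a rule; your own maturity assessment should drive the final mix.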
In the end, the Maintenance Engineer will help you get the basics in place. They help populate the CMMS and develop precision job plans working with the planner. The Reliability Engineer focuses on the future, which is important to really making the change to excellence in reliability.
Those are my thoughts. What are yours?

Wednesday, February 22, 2012

So is it a Planner, a Scheduler, or is it a Planner/Scheduler?

Recently, I had an interesting conversation about staffing the maintenance planner and scheduler roles within a facility. The question was centered on whether a site should have the two disciplines split or combined. The answer in my mind is… it depends.
Here are my thoughts on criteria that affect the planner/scheduler organizational structure:
Size of the maintenance workforce:
If you are in a small facility, it becomes very hard to support standalone schedulers. For that matter, it is hard to get one person dedicated solely to planning and scheduling. To put some numbers to it, I would suggest that in a facility of average maturity you will need one planner for every ten to fifteen crafts and one scheduler for every fifty crafts. The number of planners you need may drop as you mature and the job plan library is populated and refined, but the scheduler ratio will stay at a fixed level.
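Here is a rough sketch of those ratios applied to a given workforce. The ratios are the ones suggested above. Rounding planners up to whole people and treating one full scheduler workload as the threshold for a standalone scheduler are my assumptions, and the function name is hypothetical.

```python
import math


def planning_staff(crafts):
    """Estimate planner and scheduler needs for an average maturity site."""
    # One planner for every ten to fifteen crafts, rounded up (assumed rounding).
    min_planners = math.ceil(crafts / 15)
    max_planners = math.ceil(crafts / 10)
    # One scheduler for every fifty crafts, kept fractional to show the workload.
    scheduler_load = crafts / 50
    return {
        "planners": (min_planners, max_planners),
        "scheduler_load": scheduler_load,
        # Assumption: less than one full workload suggests a combined role.
        "standalone_scheduler": scheduler_load >= 1,
    }


print(planning_staff(20))
print(planning_staff(100))
```

For 20 crafts this gives two planners and 0.4 of a scheduler workload, which points toward a combined planner/scheduler. For 100 crafts it gives seven to ten planners and two schedulers.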
Reliability maturity of the facility:
More mature organizations can pool more of the crafts and share across multiple areas for maximum resource efficiency and utilization. If you share resources across multiple areas, then that can be a good reason to have a standalone scheduler who devotes his time to working with all of the affected parties and creating a schedule that they can all support.
So if you consider both of these factors, then it becomes a bit easier to design your organizational structure, but you must keep in mind that as size and maturity change your structure may have to change as well.
What are your thoughts?

Tuesday, February 14, 2012

Five Why "Nots": 5 Reasons Why Reliability Engineers Should Use More Than 5 Whys for Root Cause Analysis

First of all let's talk about what Five Whys is before we mention what it is not. It is a problem solving tool used in many facilities and is commonly associated with Lean, Six Sigma and Kaizen implementations. The technique was originally developed by Sakichi Toyoda and was used within the Toyota Motor Corporation during the evolution of its manufacturing methodologies. The method is quite simple really and involves asking "why" multiple times until the individual believes that they have reached the process cause of the problem. This sometimes (but not always) means you will stop at the fifth "why" hence the name.
While I like five whys as an "on the floor" problem solving tool I cringe when people call it root cause for five reasons:
1. There really is no such thing as "root cause." A more correct phrase would be "root causes," because there are almost always conditions and actions that come together to manifest the failure. 5 Whys as it is most often used only addresses one branch of the causal chain, either the condition or the action.
2. By only following one causal chain, you do not get the opportunity to analyze all the contributing causes and look for the lowest cost solution that eliminates or mitigates the risk to an acceptable level.
3. The results from 5 Whys are not repeatable. Different people using 5 Whys come up with different causes for the same problem. It is all based on their existing knowledge and experience.
4. Five Whys is often used just to prove what the practitioner already thought instead of looking at other possibilities. 5 Whys investigators are plagued by an inability to go beyond their current knowledge, which leads to them not identifying causes that they do not already know.
5. My experience tells me that engineers who use only Five Whys do more investigations due to problem recurrence. They are sometimes celebrated for their ability to do 27 RCA investigations per month, but when you look at the list, 18 are problems that would not have come back if they had taken the time to do a true analysis the first time. Using the 5 Whys method causes a tendency for investigators to stop at symptoms rather than going on to lower-level root causes. This leads to recurrence of the problem.
So as I mentioned earlier, Five Whys is a great tool for shop floor troubleshooting, but if you are going to focus on root causes then you may want to consider the more advanced root cause analysis tools. I would suggest that root cause tools like logic tree or fault tree will identify more of the contributing factors and decrease the chances of failure recurrence. To see how to make a simple transition from 5 Whys to logic tree check out this single point lesson here.
Good luck solving your problems and I hope you have fun doing so!

Tuesday, February 7, 2012

Just Because You Can Does Not Mean You Should: Six Ways to Evaluate Change

Six ways to evaluate a change in advance and increase the potential for success.
These six questions came out of the solution implementation phase of my Root Cause Analysis (RCA) process. However, they would apply nicely to just about any change you are trying to make. The point here is to think through the key success factors, ensure that this change is not just for change’s sake, and ensure the change provides the returns we expect over time. The questions are broken into three basic areas: solution evaluation, risk of the change, and the communication process for the change. If we hit these three areas we can eliminate ninety percent of the issues that cripple most change efforts.
1.    What problem does the change solve or prevent? In other words, what is the true goal of the change? This is what we will test for with our metrics and KPIs, so try to keep it measurable.
2.    What is the change worth? What will this provide as a return for our efforts? This becomes the financial driver for the project and should meet your internal standards for Return On Investment (ROI). The important point here is to evaluate the total cost of the change or life cycle cost as we call it and the total savings over that life cycle. Many companies are very good at capturing the savings but not the life cycle cost. This leads to poorly chosen solutions.
3.    Can the problem reoccur? This question helps us understand how effective our chosen solution is versus other options. We are identifying the residual risk and evaluating the options for change.
4.    What problems could it create? This question covers the unintended results of the change. It is a continuation of our risk mitigation planning step where we are making sure we have identified as many possible new failure modes as possible and evaluated their impact.
5.    Who is affected? We know that people are always involved in failures at some level, so we must identify who needs to understand the changes to prevent these failures.
6.    What should they know? What training, communication, coaching, or documentation do they need? This combined with the previous step becomes the communication strategy for the change.
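The life cycle comparison in question 2 can be sketched with some simple arithmetic. The dollar figures below are invented, the function name is hypothetical, and the calculation ignores discounting and the time value of money, so treat it as an illustration rather than your company's ROI standard.

```python
def life_cycle_roi(upfront_cost, annual_cost, annual_savings, years):
    """Return ROI over the life cycle: (total savings - total cost) / total cost."""
    total_cost = upfront_cost + annual_cost * years  # life cycle cost
    total_savings = annual_savings * years
    return (total_savings - total_cost) / total_cost


# Invented example: a $50,000 change that costs $5,000 a year to sustain
# and saves $30,000 a year over a five year life.
full_view = life_cycle_roi(50_000, 5_000, 30_000, 5)

# The same change when only the upfront cost is captured.
savings_only_view = life_cycle_roi(50_000, 0, 30_000, 5)

print(f"ROI with life cycle cost:  {full_view:.0%}")          # 100%
print(f"ROI with upfront cost only: {savings_only_view:.0%}")  # 200%
```

In this example, leaving out the ongoing cost doubles the apparent return, which is how a poorly chosen solution can end up looking like the best option.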
If you evaluate each of the changes you plan to make with these criteria, you are left with a simple goal statement with metrics for improvement, a risk analysis, and a communication plan that will help you proactively mitigate issues with your change and truly evaluate the change before the expense of implementation has been incurred. I hope these six make your life easier and your changes continue to be a success.

Wednesday, February 1, 2012

What I Learned from a Pharmaceutical Facility in Pennsylvania

This is a continuation of the "What I learned" series. 

Pharmaceutical manufacturing is a very interesting business. For many years, excess maintenance cost and maintenance downtime have not been areas of prominent concern for many of these manufacturers. But with the need for additional capacity, the expiration of various blockbuster patents, and the rise of contract manufacturing, their world is changing.
This facility was looking to be ahead of the curve and ensure they were ready for the changing environment. They wanted to be in charge of their destiny and not have it dictated to them.
The site showed me that there are many ways to succeed with reliability implementation and it can be called many different things.
This site used the energy behind the implementation of a new Enterprise Resource Planning (ERP) system, and more specifically the Enterprise Asset Management (EAM) module, as the framework for their reliability improvement efforts. As they implemented modules and sections, they improved the processes and data that were required to reach a new level of maintenance efficiency. This sounds like a logical choice, but more often than not sites get overwhelmed by this level of change all at once. Then the EAM implementation becomes simply a reimplementation of old practices, bad data, and ugly procedures in a shiny new software tool. It is a bit like adding a jet engine to a 1970s AMC Pacer. It sounds cool but it just does not end that well.
This site challenged the past standards and decided the old way was not good enough. They built a clear plan with managed subprojects to prevent overloading of the organization.
They realized that the regulatory and product safety administrations were not there to put you out of business. Instead, they are there to ensure you do what you said you were going to do. This site removed non-value-added Preventive Maintenance (PM) tasks and refined procedures that once drove unreliability. They did all of this without increasing risk for the company or the customer. In fact, as they improved reliability with tools like Reliability Centered Maintenance (RCM) and Root Cause Analysis (RCA) in the latter part of their ERP implementation, they were able to provide more stable equipment and processes, which leads to a higher-quality product.
Thanks to my time with this organization, I know that when building a strategy for maintenance improvement it is very important to take a long hard look at the current state of the site, the culture, and the history of major initiatives, and then prescribe an improvement plan that is specific to that site and its culture. As a consultant or a leader, this means you cannot always do it the same way you did your last project. You have to be constantly pushing the boundaries, matching the needs, and using the culture to help them reach the goals they have set. This may change the title of your maintenance improvement initiative to a Lean or Six Sigma project, or it may be a part of an EAM or ERP implementation, but in the end you use the site’s successful tools and momentum to drive the results that you need.