Thursday, February 6, 2025

Why Proactive Maintenance Beats Reactive Every Time

Failing to plan is planning to fail

What is the difference between proactive and reactive maintenance, and why does it matter? Unexpected downtime is the great killer. It prevents you from meeting your goals. It prevents your factory from producing your products and cuts into profit. A cooling tower or chiller breakdown means your hotel, conference center, or arena will likely have to compensate your guests for the poor experience of being trapped in a hot box.

Proactive Maintenance is Preventative Maintenance

So what exactly is proactive? Quite simply, it's any action you take to actively try to prevent downtime or inefficiencies. Regularly oiling chains, greasing bearings, changing filters, boiler blowdowns, cleaning evaporators and condensers, jetting drains, and changing belts are examples we should all be familiar with. You physically go out and either change something before it breaks or bring it back to optimal condition by performing some kind of cleaning. Doing these tasks alone is a great start, but you should be doing more.

Then there are the often overlooked and forgotten readings. Every facility will have equipment that gives a quantitative reading. Pressure on boilers and pumps, temperature and pressure of air on your compressors, amp draw of motors, voltage of batteries on emergency response systems, and levels of chemicals in automatic feeders are all easily observable and recordable. These readings, when tracked, can give you insight into the life and health of the systems associated with them. Patterns will inevitably arise, and action can be planned in advance when you notice readings straying from the norm.
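To make "straying from the norm" a little more concrete, here is a minimal Python sketch of one way to flag a logged reading that drifts away from its usual range. The function name `flag_drift`, the 24-reading baseline, the three-standard-deviation threshold, and the sample numbers are all assumptions made up for illustration. Your own equipment and history should set the real limits.

```python
from statistics import mean, stdev

def flag_drift(readings, baseline_count=24, threshold=3.0):
    """Return (index, value) pairs that stray from the baseline norm.

    The first `baseline_count` readings define what "normal" looks like.
    Later readings are flagged when they sit more than `threshold`
    standard deviations away from the baseline average.
    Both defaults are illustrative assumptions, not recommended limits.
    """
    baseline = readings[:baseline_count]
    avg = mean(baseline)
    spread = stdev(baseline)
    flagged = []
    for i, value in enumerate(readings[baseline_count:], start=baseline_count):
        if spread > 0 and abs(value - avg) > threshold * spread:
            flagged.append((i, value))
    return flagged

# Made-up hourly pressure readings: 24 normal ones, then three new ones.
history = [100, 101, 99, 100, 102, 98] * 4 + [100, 101, 115]
print(flag_drift(history))  # -> [(26, 115)]
```

A spreadsheet with a running average does the same job. The point is only that the readings have to be written down somewhere before any pattern can show up.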

Some newer or updated facilities will have sensors on some or most of this equipment that will alert you to a problem area. More advanced software will even bring up a map and show the exact location of an issue. This is a great start but often not enough. Regular physical checks will always show you more. A sensor is only going to show what it's programmed to show. An algorithm to examine the data is only as good as what it's programmed to do. It can't tell you that your compressor temperature is running higher than normal because some birds decided to build a nest inside the unit. You won't find that until your yearly coil cleaning if you're not doing regular physical checks.

During certain events like inclement weather, certain sensors may go off. You go investigate and silence the alarm. Over time you stop physically checking the equipment during every storm and just silence it without a second thought. Then one day there is an actual problem. You've trained yourself to just silence the alarm, so you assume that it's the normal error that appears during bad weather, even though it's a beautiful day. The issue goes on throughout the day. The next shift comes in, the alarm pops back up, and they silence it again. This goes on and on for days or weeks. Everyone has trained themselves to believe that this particular sensor has always been an issue and always will be, so just ignore it. The part it was monitoring eventually breaks, causing major downtime. Fixing the known problem sensor when you knew it was a problem could have helped prevent this, but so could physically going and checking to make sure it wasn't another false alarm.

While that was a worst-case hypothetical scenario, I have personal experience of both good and bad work practices when it comes to taking readings, overreliance on software, and ignoring warning signs that have been staring you in the face.

Real-World Examples: The Good, The Bad, and The Ugly


The Good

When I was first learning how to operate a plant, I was lucky enough to be taught by a well-seasoned Navy plant veteran. Every hour I was walking around the plant doing readings and recording all the various data. It wasn't long before I saw patterns of what was normal.

A few months later, I noticed our daily makeup water in the chilled water system was outside the normal range. Sometimes this is expected. If the guys in the HVAC shop had done some major work, there was a good chance they had to purge a ton of water. I asked around and got nothing from anyone. For the next several days, the increased water consumption continued.

The plant manager asked me to go around the property and check all the refrigeration racks and air handlers. There was a leak somewhere. Big enough for us to notice, but it had to be in a place with drains so it didn't cause damage. Someone would have noticed hundreds of gallons of water pouring out somewhere if that had not been the case.

It took a few hours, but I finally found that a large air handler that normally cools one of the convention spaces on the property had sprung a leak. It was inside the unit itself though. This wasn't going to be the easiest of repairs, but we caught it before it caused more damage. Luckily we were able to find a problem in an area that is only checked once a month. The problem also could have been much worse. If we hadn't been taking our readings and understanding the data, we may have missed that repair. If that leak had continued to grow, our makeup water might not have been enough to keep the system topped off. That would have been a worst-case scenario where we lose not only cooling but also all the fridges and freezers in the building that rely on the chilled water for their condensers.

This reactive repair was handled quickly since we were proactive in always assessing the building's mechanical health. We could have been more proactive as a team and been checking those units more often. Unfortunately, we were severely understaffed and spent most of the time like chickens with our heads cut off doing emergency work.

The Bad

I had just moved to a new state and had a job lined up at a company that owned a chain of apartments and hotels right across from the beach. During my first week, the maintenance manager wanted to test my skills and also try out new software he had purchased for a newly acquired high rise. This was plant automation software that would allow the boilers and chillers to be controlled remotely. 

Before I arrived he had shut down the equipment and switched it from hand mode to auto on the equipment itself. The manager took me and another employee to the room and opened his laptop. The software displayed the equipment as off and ready for operation. He clicks the button, the screen's graphics change, and everything displays as running. He proudly says, "See that, now the chiller and pumps are all running. We can now control all this from the office." The problem, as I attempted to point out, was that nothing had actually turned on. Chillers are loud. This mechanical room was quiet.

"What do you mean the chillers aren't running? It shows it right here on the computer. I've been installing these things when I worked for the power company since before you were born!" Yup, first week, pissed off my new boss. I point to the displays on the chillers themselves to show they are in standby mode. The gauges show the temperatures matching on the incoming and outgoing lines. The electrical meters on everything show zero amps. I take an IR thermometer to further show the lack of temperature differential on the lines.

He wouldn't have it! "You're too new to teach me anything, the computer says everything is up and running. I don't care what you think you know." I left it at that and simply said, "Alright, you're the boss, sorry." While he may have been installing them, he never had any experience running the actual equipment and had an overreliance on software. I can only assume that it came from the power company having properly installed and operated software.

I didn't find out about the ramifications of his action, or inaction in this case, until the next day. Even though it was fall, it was still a subtropical climate. Throughout the day, as people started returning to their apartments, the building temperature started to rise. With no cooling in the building, most residents called to complain to the building manager. He had to contact the CEO since it was such a large problem. My manager got chewed out and had to come in at 2am to switch everything into hand mode since the software seemingly wasn't working, as I had pointed out the day before.

Needless to say, this could have easily been prevented by doing a physical check. It seems ridiculous to me looking back, as we were physically in the plant when I pointed out that the software wasn't working. Use that thing between your ears. Never be complacent. Things happen and unexpected situations will always arise. Use all the tools and knowledge at your disposal.
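If your plant software does let you pull field values, the same physical check can be written down as a rule: don't trust the "running" flag unless the amps and the temperature split agree with it. Below is a minimal Python sketch of that idea. The names `ChillerCheck` and `verify_running`, and the 1 amp and 2 degree limits, are hypothetical values picked for illustration, not settings from any real control system.

```python
from dataclasses import dataclass

@dataclass
class ChillerCheck:
    software_says_running: bool  # what the automation screen displays
    motor_amps: float            # reading from the electrical meter
    supply_temp_f: float         # outgoing line temperature
    return_temp_f: float         # incoming line temperature

def verify_running(check, min_amps=1.0, min_delta_f=2.0):
    """Compare the software status against field measurements.

    A chiller is treated as physically running only if it draws current
    AND shows a temperature difference across its lines. The default
    limits are illustrative assumptions.
    """
    delta = abs(check.return_temp_f - check.supply_temp_f)
    physically_running = check.motor_amps >= min_amps and delta >= min_delta_f
    if check.software_says_running and not physically_running:
        return "MISMATCH: software shows running, field readings show off"
    if not check.software_says_running and physically_running:
        return "MISMATCH: software shows off, field readings show running"
    return "OK"

# The situation from the story: screen says running, zero amps, no split.
print(verify_running(ChillerCheck(True, 0.0, 55.0, 55.0)))
```

None of this replaces standing in the mechanical room and listening. It only captures the same cross-check in a form that software can repeat every hour.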

The Ugly


A different factory I was working at didn't really have a preventative maintenance (PM) program. At least not a real one. Most PMs consisted of cleaning one or two machines. The rest of the day was spent fixing things that either broke or needed to be adjusted. Most repairs were simple. At least one sensor, reflector, or piece of mounting hardware had to be replaced daily. Rollers and belts wore out constantly and also had to be replaced.

There was no monitoring of the air compressors, refrigeration systems, and boilers that kept the building running. This was mind-boggling to me since this was a culinary packing plant. Everything needed to be kept cold and all the lines required air pressure to operate. Me being me, I decided to just check these things on my own.

I found that one of the two air compressors was running very hot, close to its cut-off temperature. I let the boss know and told him I could diagnose and resolve the problem if allowed. He says not to worry, it's contracted out. A few months later that compressor is tripping out every day. It will only run for a few minutes and then needs to cool off for twice as long. The second one turns on, so no big deal.

Eventually, a contractor comes in. The unit that is working fine apparently had an issue that no one was aware of. They lock it out and disconnect it from the system. They also say nothing is wrong with the one that keeps tripping. This is why I don't like contracting out work.

This happened a few hours before my shift, while the plant was shut down. Not even 10 minutes into my shift, the compressor they said was fine overheats and shuts down. A half-hour later it's back up, just to overheat and shut down again. We cannot operate. The plant manager is pissed. After I explain to him why we can't run and show him the contractor notes, he tries calling my boss, hoping he will get the contractor in. The boss won't answer.

Luckily this plant manager is level-headed. He asks me if I can fix it. I knew the issue was simple since I had been observing the unit for so long on my own. It only needed oil. It took just 30 minutes to drain the old oil and top it up with new. Luckily we had some lying around. I would have liked to change the filter, but we didn't have any, and every hour we were down cost nearly 100k.

I wish I could say we never went with that contractor again, but they still got paid to fix something that wasn't broken and left us with a broken machine during production time. This was so ugly. No other mechanics or managers noticed the issue, no one questioned the contractor, and no one else knew how to work on air compressors.

Final Thoughts


Hopefully you now know a bit more about how things can go right when you are proactive and wrong when you are not. No amount of proactive measures will prevent every problem. But they will help you stay ahead of them. Having a well-skilled team who cares also wouldn't hurt.
