Dealing with breakdowns

You’ve been called at 2 AM. The plant is at a standstill and the seconds are ticking away in ten-dollar increments. You drive to work, approach the line and take a deep breath. It’s time to prove your worth. Performing under pressure is difficult, but I’ve found that routine and documentation make it much easier. Because all machines are different, it’s hard to write a universal problem-solving checklist, but in this article I’ve distilled the overall themes that can help with any breakdown. So here we go.

Gather information

Get your notes and tools together

It’s better to get all your gear first than to keep running back and forth for bits and pieces. I favour a laptop or tablet over paper because it makes this easier. Get the machine manual, wiring diagram and your notes all together, plus hotline numbers and machine serial details if you have online support. If you think you may need them, grab the other bits and pieces: strobe, FLIR camera, tachometer, etc.

Record pertinent details as they are

If there are any errors on the display, either take photos or note them down now. One of the first things you’re likely to try is turning it off and on again, at which point it may just start working. That’s great, but now you don’t know what happened or if it will happen again.

Look up the error codes in your manuals if they aren’t self-explanatory and cross-reference them to the electrical diagrams if possible. Knowing it’s error code 345 is not meaningful; knowing it’s a jam sensor is somewhat better; knowing it’s that second conveyor jam sensor over there almost has the problem solved.
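If your notes live on a laptop, a small lookup table is one way to keep that cross-reference to hand. This is only a sketch: the code numbers, descriptions, device tags, locations and drawing sheets below are invented for illustration and would have to come from your own manual and wiring diagram.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorInfo:
    description: str    # what the manual says the code means
    device_tag: str     # tag of the device on the wiring diagram
    location: str       # where to find it on the line
    drawing_sheet: str  # wiring diagram sheet to open


# Hypothetical entries: replace with details from your own machine.
ERROR_CODES = {
    345: ErrorInfo("Jam detected", "B12", "second conveyor, operator side", "sheet 14"),
    346: ErrorInfo("Jam detected", "B13", "third conveyor, drive side", "sheet 14"),
}


def look_up(code: int) -> Optional[ErrorInfo]:
    """Return the notes for an error code, or None if it is not recorded yet."""
    return ERROR_CODES.get(code)


def describe(code: int) -> str:
    """Turn a bare code into something you can walk up to and point at."""
    info = look_up(code)
    if info is None:
        return f"Error {code}: not in notes yet, check the manual and add it"
    return (f"Error {code}: {info.description} at {info.location} "
            f"(device {info.device_tag}, {info.drawing_sheet})")


print(describe(345))
print(describe(999))
```

The unknown-code branch matters as much as the table: every code you have to look up the hard way is one to add before the next call.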

If there’s any hint that there could be a material issue, note down the batch number or other such details. If you have a quality problem, have a look at the previous quality records if they are there.

Ask the operators targeted questions.

I often don’t get anything better than “it stopped working” when I first get there. I need to ask things like “what were you doing when it stopped,” “was this a mid-run issue or did this issue happen after a new setup,” “what was the previous setup you ran,” etc. “Has this happened before and what was done last time” can be a good one. These questions can take you from “machine no worky” to “it was fine but then we changed to a thicker material and immediately it started jamming here. Last time this happened, we adjusted that roller.”

See for yourself

Get the machine started and replicate the problem so you can see it for yourself. Don’t go too far down the path of figuring it out yet. Just see it happen.

The calm before the storm

At this point, you should have some information to chew on and a group of people standing by, waiting for something to do. Busy them up.

Call for help

If you have a support line – many modern machines do – call that now because they usually don’t call back for an hour or so. Call an electrician if you aren’t one and you have the slightest hint that this may be an electrical issue.

Quick cleanup

Get the operators to clean the critical parts of the machine that always give problems when they are dirty. Things like optical sensors and reflectors, web rollers and so on.

Start hunting

Machines vary a lot, so it’s hard to give advice that’s useful to anyone whose equipment differs from what I’m experienced with, but some things are universal. It’s going to get mucky and nonspecific at this point, but here are some tips.

Start with the simplest thing first

That doesn’t necessarily mean the politest thing. I once had an intermittent issue on a flexographic printer. During planned maintenance, I opened a door to check some things, closed it and left to do something else. I passed the machine again a while later and thought it was odd that it had not been started up again, so I went to check and found my electrician removing the safety interlock to swap it for a spare. He said the machine had stopped working when I opened the door. I then went and prodded the operators to reset the safety circuits (“no, not just that one, all of them”) and of course we were up and running.

The problem with simple things is that they are so simple nobody could possibly overlook them. You think that, they think that, everyone thinks that, and that’s how they slip under the radar. Sometimes you have to be the guy who appears either a bit silly or a bit condescending for checking something so obvious. This anecdote overlaps with another point: verify. Having a good relationship with the machine operators and asking questions is super important, but verify everything that you can.

Slow it down

If you’re troubleshooting something that’s moving, slow that movement right down so you don’t miss anything. Most mobile phones these days have a slow motion video feature which is excellent for observing things too fast for the naked eye but not quite fast enough to use a strobe.

There are also occasions where a slo-mo video will show something that a strobe won’t. I once had an issue on a high-speed conveyor where paper bags were thrown onto it by a previous process and were supposed to overlap like roof shingles. They were jamming occasionally. The bags were sliding on the receiving conveyor, so every so often two of them collided and jammed. A strobe light didn’t show this but the high-speed camera did.
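A rough way to judge whether a given frame rate is fast enough is to work out how far the product travels between frames. The sketch below does that arithmetic; the 3 m/s conveyor speed and the two frame rates are assumed figures for illustration, not measurements from the machine in the story.

```python
def travel_per_frame_mm(speed_m_per_s: float, frames_per_s: float) -> float:
    """Distance an object moves between consecutive video frames, in millimetres."""
    if frames_per_s <= 0:
        raise ValueError("frames_per_s must be positive")
    return speed_m_per_s / frames_per_s * 1000.0


# Assumed figures: a 3 m/s conveyor filmed at normal and slow-motion frame rates.
for fps in (30, 240):
    print(f"{fps} fps: {travel_per_frame_mm(3.0, fps):.1f} mm per frame")
```

With these assumed numbers the product moves 100 mm between frames at 30 fps but only 12.5 mm at 240 fps, which is the difference between missing a collision and watching it happen.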

Start from the start

If you have an issue with a multi-step process, confirm each step is working correctly from the beginning and work downstream. I used to see this quite often on a machine that was failing to break a paper web at the perforated cuts without tearing. Much adjustment had been done to the breaker, but the perforations had not been properly cut: the problem was with the perforator about 5 metres upstream of the breaker. Another one that took a lot of time to sort out was bags being made with the bottom folded closed crooked. The problem was not the closing part of the machine but the opener, about 10 metres upstream, which was occasionally kicking them crooked because it was running slightly out of phase. Which overlaps with the next point.

Reset

Some machines, particularly the old ones, are tricky to set up. And often you are using a machine in a way that covers about 20% of its capability, which means that you and the operators probably only have a good understanding of 20% of the machine setup. That’s fine until a particularly bad jam or something similar knocks the setup out of whack somewhere outside that 20%.

At this point, if you’re having trouble and there’s no smoking gun, it may be time to get out the manual and do a full setup.

Overloads

Motor overloads are a very common fault, and it’s typically not the motor itself that’s the problem. It’s almost definitely not that the overload protector current needs to be turned up a bit, either. Monitor the motor current during normal operation. Consistently high current could imply a failing bearing, a mistracking belt or some sort of mechanical interference. I’ve seen some less common faults cause intermittent high current, such as a faulty contactor controlling the motor brake and fatigued cables in energy chains that short at a particular point in travel.
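If you can log the current rather than just glance at a clamp meter, telling consistently high current from intermittent spikes gets easier. The sketch below is one simple way to label a logged trace. The sample readings, the rated current and the 90% threshold are all assumptions chosen for illustration, not values from any particular motor or standard.

```python
from statistics import median


def classify_current(samples_amps, rated_amps, high_fraction=0.9):
    """Label a logged current trace.

    Returns 'normal', 'consistently high' or 'intermittent spikes'.
    high_fraction is an assumed threshold: readings above
    rated_amps * high_fraction count as high.
    """
    if not samples_amps:
        raise ValueError("no samples to classify")
    limit = rated_amps * high_fraction
    high_count = sum(1 for amps in samples_amps if amps > limit)
    if high_count == 0:
        return "normal"
    # If the median reading is above the limit, the motor spends
    # most of its time running high rather than spiking now and then.
    if median(samples_amps) > limit:
        return "consistently high"
    return "intermittent spikes"


# Made-up readings for a motor assumed to be rated at 5.0 A.
print(classify_current([4.0, 4.1, 3.9, 4.0], rated_amps=5.0))
print(classify_current([4.8, 4.9, 4.7, 4.8], rated_amps=5.0))
print(classify_current([4.0, 4.1, 6.2, 4.0, 3.9], rated_amps=5.0))
```

A "consistently high" result points toward the bearing, belt and interference checks; "intermittent spikes" points toward things like the brake contactor or a cable that shorts at one point in travel.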

Safety circuits

I’ll have to do a full article on this later, but a safety circuit not resetting is not something that should have you guessing. Safety controllers typically have indicator lights on the front that show whether the problem is safety input #1, safety input #2 or the feedback/reset circuit. Once you know which one, you can start checking continuity from one end of the circuit to the other (if your wiring diagram exists and is accurate). Your electrician should know this, but panic and confusion can result in the more methodical approach going out the window.
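The end-to-end continuity check is just a walk along the circuit in wiring order until a segment fails. Here is a minimal sketch of that idea, useful mostly as a way to record readings in order. The terminal names and the readings are hypothetical; the real sequence comes from your wiring diagram and your electrician’s meter.

```python
from typing import Callable, Optional, Sequence, Tuple


def first_open_segment(
    terminals: Sequence[str],
    has_continuity: Callable[[str, str], bool],
) -> Optional[Tuple[str, str]]:
    """Walk the circuit in wiring order and return the first pair of
    adjacent terminals with no continuity, or None if every segment passes."""
    for start, end in zip(terminals, terminals[1:]):
        if not has_continuity(start, end):
            return (start, end)
    return None


# Hypothetical terminals for one safety input channel, in wiring order.
circuit = ["X1:1", "S1:11", "S1:12", "S2:11", "S2:12", "K1:S11"]

# Hypothetical meter readings recorded segment by segment.
readings = {
    ("X1:1", "S1:11"): True,
    ("S1:11", "S1:12"): True,
    ("S1:12", "S2:11"): True,
    ("S2:11", "S2:12"): False,  # contact on S2 not closing
    ("S2:12", "K1:S11"): True,
}

fault = first_open_segment(circuit, lambda a, b: readings[(a, b)])
print(fault)
```

Writing the readings down in circuit order, rather than probing wherever looks suspicious, is what keeps the search methodical when everyone is watching.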

After the breakdown

Write it all down.

If you don’t, you’re troubleshooting the problem for the first time, every time. After you have a problem, think about what you checked, what order you checked it in, what order you should have checked it in and what checks can be done simultaneously. Write it all down. This is now your troubleshooting procedure which you can share with your team so you don’t have to micromanage the next event.

I once had an issue with a polyethylene web drive motor intermittently overloading. We had checked the bearings, gears, belts, etc. in the drive system multiple times, replaced the motor with a spare (which also overloaded) and otherwise inspected the mechanical system up, down and sideways. There was nothing at all wrong with it. There had been no process changes and the machine was cleaned as it always had been. We tried different batches of product to no avail and instructed operations to keep the drive rollers clean. I cleaned the roller myself to make sure. Nothing worked, but after limping along at slow speed for a while the problem would seem to go away, only to return later.

At this point, there were a lot of eyes on the problem including some from a few levels up on the chain of command. I was sure at this point that there was some kind of process issue and that the mechanical system was solid. I’d written a troubleshooting procedure with a list of checks and cleaning operations for each of the three machine operators to do as well as two maintenance staff. On the first execution of the checklist, the problem was stopped in its tracks.

What was happening was that our polyethylene material was occasionally coming to us with more wax than usual. The web drive downstream of the one we were having problems with was building up wax and then failing to drive the web, which caused the upstream drive to overload. The machine was being cleaned according to our usual procedures and so asking if it was clean resulted in a defensive “yes.” But who gets defensive toward a checklist? It says to clean the second drive so I’ll just clean it and tick “done.”
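A checklist like that does not need to be fancy. The sketch below shows one possible shape for it: each check has a task and an owner, so work can be handed out in parallel and nothing gets ticked off by assumption. The task wording and owner names are made up for illustration; only the idea of cleaning the second drive comes from the story above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Check:
    task: str
    owner: str
    done: bool = False


def by_owner(checks):
    """Group tasks by owner so each person gets their own short list."""
    grouped = defaultdict(list)
    for check in checks:
        grouped[check.owner].append(check.task)
    return dict(grouped)


def outstanding(checks):
    """Tasks nobody has ticked off yet."""
    return [check.task for check in checks if not check.done]


# Hypothetical checklist entries.
checklist = [
    Check("Clean first web drive roller", "Operator 1"),
    Check("Clean second web drive roller", "Operator 2"),
    Check("Clean optical sensors and reflectors", "Operator 3"),
    Check("Log drive motor current at line speed", "Maintenance 1"),
]

checklist[0].done = True
print(by_owner(checklist))
print(outstanding(checklist))
```

Printed on paper with a tick box beside each line, it does the same job.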

Root Cause Analysis

Write down what the problem was as best as you can understand it. Writing will bring to light all your own gaps in understanding and force you to truly wrap your head around it. If it’s not clear, consider doing a Root Cause Analysis. I’ll probably write an article about RCAs later, but the method I learned from a past manager consisted of an Ishikawa diagram, a preferential vote on likely proximate causes, a 5-why to get to root causes and then some actions to prevent recurrence, all done on a whiteboard or large sheet of paper in a meeting with all stakeholders. The whole reason we do the RCA is to generate actions to prevent recurrence, so you need to be diligent about not letting them fall by the wayside.

Report up

Never waste a good report. Since you’ve written it, send it to management! They’ll appreciate being kept in the loop.

Grow

In order to grow, you have to not wilt. If the problem was a random failure of a piece of electrical equipment and you should have had a spare, get one. That’s growing. If you thought about it ahead of time but couldn’t justify keeping a spare based on the probability of this type of failure, well, Annie Duke says even the best decision doesn’t yield the best outcome every time. Don’t be too hard on yourself. Don’t wilt. Remember that the reason you got the call was because someone thought you were their best hope of getting things back on track, so hold your head up high, keep thinking and never give up.

Good luck with your next call. I hope this article helped.
