502 Bad Gateway Again

We are receiving the 502 Bad Gateway error again. It seems this is a common issue that is affecting multiple people. We usually get this error once a day, and it doesn’t last long, but it grinds all production to a halt.

@Andy Our engineers were aware of a short period (~7:50-8:05 a.m. today) where your server was struggling with a higher-than-normal load (typically the cause of 502 errors). However, the server self-corrected, and we’ve not received reports of continued or widespread issues from anyone else.

We will pass this along to our engineers so that we can continue making improvements to help prevent these sorts of issues. In the meantime, since you’ve stated that this happens daily, it would be helpful for our engineers to know about any pattern you’re able to identify in what’s happening in your Cetec environment when the error occurs. It could be that it happens around the same time every day, or that a series of actions takes place just before you see the issue each time.

Please let us know if you have any additional questions or information about it. Thanks!

It’s hard to pin down because we have around 65 users each doing something different. But I will continue to try to find patterns.

We are receiving the 502 error again this morning. It is currently down.

@Andy Our engineers saw the spike in server activity, and had to restart your server in order to resolve the issues.

They reported that in the span of around 5 minutes there were about 9000 requests made on the production order list page. Any idea what would have caused all that activity on that page?

We only have 50-60 users so even if all of them accessed that page at the same time we couldn’t possibly have made 9000 requests.
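For scale, here is a rough back-of-the-envelope calculation. It takes the figures quoted above at face value (about 9000 requests in about 5 minutes) and assumes the upper end of our user count, 60. The variable names are just for illustration.

```typescript
// Figures from this thread; both are approximate.
const requests = 9000;          // requests seen on the production order list page
const windowSeconds = 5 * 60;   // "around 5 minutes"
const users = 60;               // assumption: upper end of the 50-60 range

// Overall request rate during the spike.
const requestsPerSecond = requests / windowSeconds; // 30

// Load each user would need to generate if it were spread evenly.
const requestsPerUser = requests / users; // 150
const secondsBetweenRequestsPerUser = windowSeconds / requestsPerUser; // 2

console.log({ requestsPerSecond, requestsPerUser, secondsBetweenRequestsPerUser });
```

In other words, every one of 60 users would have to hit that page once every 2 seconds for five minutes straight, which is well beyond normal page use.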

Yesterday we had this error 3 times throughout the day, and we are getting it again right now. It lasts about 10 minutes. Any idea what we can do to permanently fix this? Do we need to be on our own server?

Thanks,

Andy

Hi @Andy,

We will check with our engineers and see if they can point to a root cause at all, and offer any suggestions on how to prevent it moving forward.

Thanks!

@Andy We’ve got some good news for you here.

First, I’ll let you know that our engineers reiterated that the root of the issue is that your users are somehow clicking submit on the Production Order List many, many times in a row, which locks all the database tables and makes the problem worse.

Now, for the good news:

  • They’ve found a couple of things that can be changed in the database that will make searching the production order list faster overall, and should provide some immediate short-term relief.
  • In the next week or so they’re going to be disabling the ability for any user to click submit there multiple times. Each click will make only a single request to the server, and further requests will be blocked until the original has finished and the page has reloaded. They think that this will solve the issue for you all in the long run.
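For anyone curious what the second change looks like in general terms, here is a minimal sketch of a "one request at a time" guard. It is an illustration of the technique only, not Cetec's actual implementation: `singleFlight`, `fakeSearch`, and `guardedSubmit` are hypothetical names, and the 50 ms delay stands in for a slow search request.

```typescript
// Hypothetical helper: wraps an async action so that calls made while a
// previous call is still pending are dropped instead of reaching the server.
function singleFlight<T>(action: () => Promise<T>): () => Promise<T | undefined> {
  let inFlight = false;
  return async () => {
    if (inFlight) {
      return undefined; // a request is already pending; ignore this click
    }
    inFlight = true;
    try {
      return await action();
    } finally {
      inFlight = false; // allow the next submit once the first has finished
    }
  };
}

// Stand-in for the real search request; counts how often the "server" is hit.
let serverHits = 0;
async function fakeSearch(): Promise<string> {
  serverHits += 1;
  await new Promise<void>((resolve) => setTimeout(resolve, 50));
  return "results";
}

// Every click on submit would call guardedSubmit() rather than fakeSearch().
const guardedSubmit = singleFlight(fakeSearch);
```

With a guard like this, clicking submit repeatedly while the first search is still running results in one request, and a new request only becomes possible after the first one completes.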

Hopefully that provides some light at the end of the tunnel here. Let us know if you have more questions about it.

We have received the 502 Bad Gateway error several times in the last few days. Is there anything we can or need to do on our end to remedy this? Thanks!

Unless your users are doing something unusual that’s causing a spike in the server activity, there’s nothing that you all can do.

Our engineers monitor server activity. Usually, by the time you’ve contacted us about it, they’re already aware and working to resolve it, and typically the downtime is negligible.