
Past Incidents

Wednesday 28th July 2021

MongoDB shared cluster: Performance issues on one node of our free MongoDB cluster

09:27 UTC - The monitoring agent on one of the nodes stopped responding, then started responding again a few seconds later. We attributed it to a transient network error.
09:39 UTC - Some users start to have issues using their free add-on.
09:40 UTC - We investigate. It turns out that the mongodb process is no longer actually listening for incoming connections.
09:42 UTC - We try to restart the service.
09:46 UTC - We end up rebooting the whole VM.
09:47 UTC - The VM is up and running; the mongodb process is cleaning itself up.
09:50 UTC - The mongodb process finishes its cleanup and starts accepting connections again.
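For illustration only, a connectivity check along these lines can tell apart a mongod process that is merely running from one that is really accepting connections. The host name, port, and timeout below are assumptions for the sketch, not the actual monitoring agent:

    from pymongo import MongoClient
    from pymongo.errors import PyMongoError

    def mongod_accepts_connections(host="mongodb-node.example.internal", port=27017, timeout_ms=2000):
        """Return True only if mongod answers a 'ping' within the timeout."""
        client = MongoClient(host, port, serverSelectionTimeoutMS=timeout_ms)
        try:
            # A 'ping' needs a listening and responsive mongod, not just a live process.
            client.admin.command("ping")
            return True
        except PyMongoError:
            # Covers both a closed port and a process that no longer serves requests.
            return False
        finally:
            client.close()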

Tuesday 27th July 2021

Infrastructure PAR: High load on some hypervisors, leading to application/add-on slowness and Monitoring/Unreachable events

From 05:54 UTC until 06:12 UTC, some hypervisors experienced a higher CPU load. This higher load may have slowed down applications and add-ons hosted on those hypervisors.

The root cause has been identified. Unfortunately, the higher load also triggered many redeployments with the Monitoring/Unreachable cause. Most of them were cancelled in time, but some went through. Some of the deployments that started did not finish correctly and ended up in a blocked state. Those deployments are currently being cancelled, and all cancellations should be done within a few minutes.

We have developed a fix that will prevent those events from happening again; it will be deployed in the coming hours.
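The fix itself is not detailed above. Purely as a sketch of the general idea of filtering out load-induced false positives, a Monitoring/Unreachable decision could require several consecutive failed probes before a redeployment is triggered; the threshold, interval, and probe callable below are assumed values, not our actual implementation:

    import time

    FAILURE_THRESHOLD = 3   # consecutive failed probes required (assumed value)
    CHECK_INTERVAL_S = 10   # seconds between probes (assumed value)

    def should_redeploy(probe) -> bool:
        """Only report an instance as unreachable after sustained probe failures."""
        failures = 0
        while failures < FAILURE_THRESHOLD:
            if probe():
                # The instance answered: likely a transient load spike, not an outage.
                return False
            failures += 1
            time.sleep(CHECK_INTERVAL_S)
        # Still unreachable after repeated checks: trigger the redeployment.
        return True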

Monday 26th July 2021

No incidents reported

Sunday 25th July 2021

Infrastructure PAR: High load on some hypervisors, leading to application/add-on slowness and Monitoring/Unreachable events

From 18:13 UTC until 18:43 UTC, some hypervisors experienced a higher CPU load. This higher load may have slowed down applications and add-ons hosted on those hypervisors.

The root cause has been identified and the issue has been fixed. Unfortunately, the higher load also triggered many redeployments with the Monitoring/Unreachable cause. Most of them were cancelled in time, but some went through. Some of the deployments that started did not finish correctly and ended up in a blocked state. Those deployments are currently being cancelled, and all cancellations should be done within a few minutes.

We will investigate this increased CPU load in more depth and see how we can better prevent it.

Saturday 24th July 2021

No incidents reported

Friday 23rd July 2021

No incidents reported

Thursday 22nd July 2021

No incidents reported