Sunday 25th July 2021

Infrastructure PAR: High load on some hypervisors, causing application and add-on slowness and Monitoring/Unreachable events

From 18:13 UTC until 18:43 UTC, some hypervisors experienced elevated CPU load. This load may have slowed down applications and add-ons hosted on those hypervisors.

The root cause has been identified and the issue has been fixed. Unfortunately, the higher load also triggered many redeployments with the Monitoring/Unreachable cause. Most of them were cancelled in time, but some went through. Some of the deployments that started did not finish correctly and ended up in a blocked state. Those deployments are currently being cancelled, and all cancellations should be complete within a few minutes.

We will investigate this increased CPU load in more depth and see how we can better prevent it.