Tuesday 27th July 2021

Infrastructure PAR: High load on some hypervisors, leading to slow applications and add-ons and to Monitoring/Unreachable events

From 05:54 UTC to 06:12 UTC, some hypervisors experienced higher than usual CPU load. This load may have slowed down the applications and add-ons hosted on those hypervisors.

The root cause has been identified. Unfortunately, the higher load also triggered a large number of redeployments with the Monitoring/Unreachable cause. Most of them were cancelled in time, but some went through. Some of the deployments that started did not finish correctly and ended up in a blocked state. Those deployments are currently being cancelled, and all cancellations should be complete within a few minutes.

We have developed a fix that will prevent these events from happening again. It will be deployed in the next few hours.