Past Incidents

Tuesday 22nd October 2024

No incidents reported

Monday 21st October 2024

Pulsar: Pulsar cluster is unavailable due to ZooKeeper instability

Following yesterday's incident (https://www.clevercloudstatus.com/incident/911), we took action to address its root cause.

When we perform certain actions, the ZooKeeper cluster becomes unstable and fails. The access logs, logs and deployment stacks are affected, as well as all services interacting with the Pulsar cluster.

16:20 UTC: The Pulsar cluster is up and running. The deployment and logs stacks are running as well. We are restarting the access logs stack.

16:30 UTC: The access logs stack is up and running.

Sunday 20th October 2024

Pulsar: Pulsar cluster is in an unhealthy state

Monitoring reports that Pulsar is in an unhealthy state. We are investigating.

16:38 UTC: There seems to be an inconsistency in the underlying BookKeeper cluster. We are looking into it.

16:40 UTC: We are now looking into the ZooKeeper service, which seems to be failing.

17:30 UTC: We have fixed the ZooKeeper issue and are beginning the recovery process for the BookKeeper cluster, followed by Pulsar.

18:10 UTC: We are progressively reopening access to the Pulsar cluster.

18:45 UTC: We have reopened access to the Pulsar cluster for half of our hypervisors.

19:15 UTC: The Pulsar cluster is running and available to everyone. We are running the platform recovery process to ensure that every application is up and running as well.

21:30 UTC: We have finished redeploying applications. We are investigating the access logs stack, which encountered offloader errors on the Pulsar side.

22:10 UTC: We have finished restarting the access logs stacks.

Saturday 19th October 2024

No incidents reported

Friday 18th October 2024

No incidents reported

Thursday 17th October 2024

No incidents reported

Wednesday 16th October 2024

No incidents reported