All systems are operational

Past Incidents

Thursday 17th June 2021

No incidents reported

Wednesday 16th June 2021

Infrastructure PAR: Network accessibility issue

We are currently experiencing a network accessibility issue on our PAR zone. We are investigating.

EDIT 12:21 UTC+2: Our network provider is looking into the issue.

EDIT 12:28 UTC+2: Deployments on other zones might not work correctly, but traffic shouldn't be impacted.

EDIT 12:30 UTC+2: Network connectivity seems to be back. We are awaiting confirmation of incident resolution from our network provider.

EDIT 12:35 UTC+2: Our network provider found the issue and fixed it. The network has been back online since 12:30 UTC+2. An investigation will be conducted to understand why the secondary link wasn't used.

EDIT 12:42 UTC+2: A postmortem will be made available once the root cause has been fully understood.

EDIT 12:50 UTC+2: The deployment queue is currently processing; queued deployments might take a few minutes to start.

EDIT 13:00 UTC+2: Logs may also be unavailable, depending on the application.

EDIT 13:20 UTC+2: The deployment queue still has a lot of items; the build cache feature is currently having trouble, which slows down deployments.

EDIT 14:33 UTC+2: The deployment queue is now shorter, but some deployments are still having issues. Logs are also partially available.

EDIT 15:30 UTC+2: The build cache feature still has issues; we are currently working on a workaround. Logs should now be back, but there is a processing delay which might affect availability in the Console / CLI. They might be a few minutes late.

EDIT 16:04 UTC+2: Some applications linked to FSBucket systems might have lost their connection to the FSBucket, increasing their I/O and possibly rebooting in a loop due to either Monitoring/Unreachable or Monitoring/Scalability events. This can cause response timeouts, especially for PHP applications.

EDIT 16:16 UTC+2: The build cache should be fixed, meaning that deployments should take less time.

EDIT 16:53 UTC+2: There are still a lot of Monitoring/Unreachable events being sent, making many applications redeploy for no good reason. We are still working on it.

EDIT 17:18 UTC+2: The issue with Monitoring/Unreachable events has been fixed. The size of the deployment queue should go down.

EDIT 18:07 UTC+2: Most issues have been cleared up. PHP applications may still be experiencing issues; we are working on it. If you are experiencing issues with non-PHP applications, please contact us.

EDIT 19:05 UTC+2: All PHP applications have been redeployed. If you are still experiencing issues, please contact us. All other applications which have not already been redeployed since the beginning of the incident will be redeployed in the next few hours (to make sure no apps are stuck in a weird state).

Tuesday 15th June 2021

Planned maintenance on Metrics storage backend, scheduled 1 day ago

Planned maintenance of the storage backend of Clever Cloud Metrics (used for access logs as well) will occur on 2021-06-15 at 11:30 UTC.

The maintenance itself should take no more than an hour. During this time, writes will be queued and reads will be partially available.

Once the maintenance is over, queued-up writes will start being ingested and reads will be available again (except for recent data, until the queued-up data points are ingested).

11:36 UTC: Maintenance is starting.

12:04 UTC: Maintenance is over. The ingestion pipeline is running at full speed catching up on the queued-up data.

12:18 UTC: Ingestion is caught up.

Monday 14th June 2021

No incidents reported

Sunday 13th June 2021

Reverse Proxies: Public reverse proxies issues

We are experiencing issues with public reverse proxies.

EDIT 16:58 UTC: we mitigated the issues.

Saturday 12th June 2021

No incidents reported

Friday 11th June 2021

Reverse Proxies PAR: Reverse proxies instability

Reverse proxies on the Paris zone are experiencing instabilities. We are investigating.

EDIT 18:04 UTC+2: One of the reverse proxies stopped accepting new connections. It has been removed from the pool for further investigation. Stability should have been restored as of two minutes ago.

EDIT 18:18 UTC+2: Performance is back to normal. We are going to investigate further why this reverse proxy went into this state without being detected.

MongoDB shared cluster on Paris zone is overloaded

The MongoDB shared cluster on the Paris zone is overloaded. We are investigating; the overload is most likely due to excessive resource usage by some users.

As a reminder, this cluster is only used by free plans labeled "DEV". It is meant for development and testing purposes only, not production.

If you are using a free plan in production, we suggest you migrate to a dedicated plan using the migration tool in the Clever Cloud console.

10:43 UTC: The cluster is working fine now, although it may be slower than usual while a node is out of the cluster; the node will be re-added later.

12:23 UTC: The node mentioned in the last update has been re-added. The incident is over.