Past Incidents

Saturday 18th March 2023

No incidents reported

Friday 17th March 2023

No incidents reported

Thursday 16th March 2023

Reverse Proxies: [PAR] Reverse proxy instabilities

We are currently investigating reverse proxy instabilities in our Paris zone.

EDIT 18:56 UTC: To be more specific about the instabilities: connections were processed more slowly, which increased response times, sometimes drastically. The root cause was found and fixed at 18:42 UTC. Since then, everything has been back to normal. We are continuing to monitor the situation.

EDIT 19:11 UTC: We will investigate further to pinpoint the exact cause of the problem and put measures in place to prevent it from happening again. Sorry for the inconvenience.

Wednesday 15th March 2023

API: Maintenance on main Clever Cloud API

The main Clever Cloud API will undergo maintenance for about 30 minutes, starting at 21:00 UTC.

During these 30 minutes, some deployments may not go through and some API calls may fail.

Everything seems to have gone well. The operation was completed at 21:28 UTC.

EDIT 23:15 UTC: It seems that some application creations are having issues following this change; we are investigating.

EDIT 00:10 UTC: A fix has been implemented and applications are now created correctly. Some users may have received a 200 - OK response from the API for application creation, while subsequent requests for that application returned a 404 - Not Found. Sorry for the inconvenience.
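
For clients that want to guard against the symptom described above (a creation acknowledged with 200 but later answered with 404), one option is to read the resource back before treating it as created. The following is only a sketch under stated assumptions: `create` and `fetch_status` are hypothetical callables standing in for whatever HTTP client is used, and they are not part of the Clever Cloud API or its tooling.

```python
import time
from typing import Callable, Tuple


def create_and_verify(
    create: Callable[[], Tuple[int, str]],
    fetch_status: Callable[[str], int],
    attempts: int = 3,
    delay_s: float = 1.0,
) -> str:
    """Create a resource, then confirm that it can be read back.

    `create` (hypothetical) returns (http_status, resource_id).
    `fetch_status` (hypothetical) returns the HTTP status of a GET
    on that resource.
    """
    status, resource_id = create()
    if status != 200:
        raise RuntimeError(f"creation failed with HTTP {status}")

    # A 200 on creation is not trusted until a read succeeds.
    for attempt in range(attempts):
        if fetch_status(resource_id) == 200:
            return resource_id
        if attempt < attempts - 1:
            time.sleep(delay_s)

    raise RuntimeError(f"{resource_id} was acknowledged but is not readable")
```

The retry count and delay are arbitrary illustrative defaults; the point is only that the read, not the creation response, is what confirms the resource exists.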

Tuesday 14th March 2023

Reverse Proxies: One PAR reverse proxy is not responding

(All times UTC)

At 20:10 one of the 4 reverse proxies on zone PAR stopped responding to some requests. No internal metrics changed and no unusual logs were written; the requests simply timed out. The other three proxies were still running, so request errors occurred at random, depending on which proxy handled the request.
At 20:25 it stopped responding entirely.
At 20:40 our external monitoring tool alerted us. We investigated, found which reverse proxy had failed, and restarted it.
At 20:43 the reverse proxy was back up and traffic returned to normal.
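
Because the failure above was invisible to internal metrics and only one of several proxies was affected, probing each proxy individually from the outside is one way to identify the failing one. The sketch below is an illustration, not the monitoring tool mentioned in this report: the function names, the port, the timeout, and the thresholds are all assumptions, and a plain TCP connect is a simplification (a proxy that accepts connections but times out on requests would need an HTTP-level probe passed in as `probe`).

```python
import socket
from typing import Callable, Dict, Iterable, List


def tcp_probe(address: str, port: int = 443, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection opens within the timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def unresponsive_backends(
    addresses: Iterable[str],
    probe: Callable[[str], bool] = tcp_probe,
    rounds: int = 3,
    min_failures: int = 3,
) -> List[str]:
    """Probe each address `rounds` times; return those with at least
    `min_failures` failed probes, in the order they were given."""
    targets = list(addresses)
    failures: Dict[str, int] = {address: 0 for address in targets}
    for _ in range(rounds):
        for address in targets:
            if not probe(address):
                failures[address] += 1
    return [a for a in targets if failures[a] >= min_failures]
```

Lowering `min_failures` below `rounds` makes the check sensitive to a proxy that fails only some probes, which corresponds to the first phase of the incident, at the cost of more false positives.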

Monday 13th March 2023

No incidents reported

Sunday 12th March 2023

MongoDB shared cluster: Free MongoDB cluster on PAR unreachable

(All times in UTC)

At 16:30 we started seeing alerts about high load on the primary node.
At 17:00 we started getting reports that the cluster was unreachable.
At 18:00, after checking the cluster, we decided to restart the primary node.

Data may have been lost as the node was not writing / replicating correctly. We are still waiting for the primary node to restart. The secondary does not seem to elect itself as primary.

At 19:30 the secondary was finally promoted to primary. We are blocking users making unfair use of the cluster.
At 22:45 we detected that the node we had restarted failed to rejoin the cluster. We decided to remove it entirely and re-create it from scratch.
At 10:00 on 2023-03-13 the node fully reached the "SECONDARY" state. We put it back into production.

Measures have been taken to prevent future unfair use of the cluster.
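
The two conditions that mattered in the timeline above, whether the replica set has a primary and whether every member has reached a usable state, can be read from the document returned by MongoDB's `replSetGetStatus` command. The sketch below only inspects such a document; the helper names are illustrative, and it assumes the document was obtained separately, for example with PyMongo's `client.admin.command("replSetGetStatus")`. It is not a description of the checks actually used during this incident.

```python
from typing import Any, Dict, List, Sequence


def has_primary(status: Dict[str, Any]) -> bool:
    """Return True if any member reports the PRIMARY state."""
    return any(
        member.get("stateStr") == "PRIMARY"
        for member in status.get("members", [])
    )


def members_not_ready(
    status: Dict[str, Any],
    ready_states: Sequence[str] = ("PRIMARY", "SECONDARY"),
) -> List[str]:
    """Return the names of members whose state is not in `ready_states`.

    Members in states such as STARTUP2 or RECOVERING are still syncing
    and are reported here. Add "ARBITER" to `ready_states` if the
    replica set uses arbiters.
    """
    return [
        member["name"]
        for member in status.get("members", [])
        if member.get("stateStr") not in ready_states
    ]
```

A re-created node would typically be reported by `members_not_ready` until its initial sync completes and it reaches the SECONDARY state.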