Tuesday 10th December 2019

Metrics / AccessLogs Metrics: Up-to-date metrics are delayed

Metrics collection is currently experiencing issues, and up-to-date metrics have been unavailable for about 30 minutes. All metrics are still being stored, but recent ones cannot be retrieved for now. We are looking into it.

18:42 UTC: We are still working on it. This is a major issue that we have not encountered before, so we are unable to give an ETA at this time.

22:35 UTC: The issue has been narrowed down and is now being resolved. We will resume restoring the service tomorrow morning. All metrics gathered before this incident are still accessible; only new metrics are not. New metrics are being stored and will be processed once the Metrics cluster is back to normal. More news tomorrow morning.

12:00 UTC: We have been working on this again since 07:30 UTC. Things are looking good, but there are still at least a few hours to go.

13:55 UTC: The issue with the storage platform is finally fixed. Ingestion is now running at full speed and working through the 22 hours of data that accumulated.

15:25 UTC: We are about halfway there.

16:50 UTC: We are 4/5 of the way there. It should be resolved in under an hour.

17:30 UTC: You should now see recent points in your applications' metrics. The delay will be back to normal in less than 30 minutes. Closing this incident.