We are experiencing an issue on the Metrics service, caused by an error while adding capacity to the storage cluster. We are working on it.
10:26 UTC: The ingestion issue is fixed, the system is now catching up.
10:33 UTC: The ingestion delay is almost back to normal.
10:36 UTC: There is still some lag, but it should return to normal within a few minutes. Read performance is still inconsistent but is recovering as well. We will reopen the incident if it does not recover.
11:06 UTC: The ingestion lag is increasing. We are investigating. This may take a while.
11:30 UTC: The cause has been identified and partially fixed.
11:37 UTC: Lag is now <5s; we are working on a more permanent fix.