Thursday 21st March 2024

Logs System: Logs drains are down

(times in UTC)

Around 21:00, part of the logs drains stack broke in a way that our monitoring did not detect right away. The failure caused the disk of the underlying RabbitMQ to start filling up. At 21:37, we were alerted to the lack of disk space on RabbitMQ, and we started investigating around 22:10. At 22:57, the logs drains stack was back up. However, to fix RabbitMQ, we had to drop the pending queues. Logs were still collected by our new logs infrastructure throughout the incident, but all drains lost the logs from between 21:00 and 22:57.