General downtime

Incident Report for Happo

Postmortem

The root cause of the API downtime and slowness was a new database query that we deployed a few days ago. Once it started receiving traffic, we began seeing slow queries originating from it. It took us about an hour to track everything down, after which we quickly reverted the code that added the query. We will continue to monitor things, but as of right now the system is stable.

Posted Jun 23, 2025 - 16:31 UTC

Resolved

We are continuing to monitor an issue that caused most Happo jobs and API calls to fail. We are seeing things recover right now and will keep an eye on this to make sure we don't regress.

The issue started at 15:00 UTC and lasted until 15:43 UTC.

Posted Jun 23, 2025 - 15:59 UTC

This incident affected: API, Web UI and Workers (Chrome, Firefox, Edge, Safari, iOS Safari, iOS Safari (iPad)).