On August 29, 2025, a number of Equiem services experienced intermittent outages over approximately 30 minutes, impacting several of our sites. The incident was caused by a database connection issue within our Segmentation service. We traced the root cause to a recently deployed code change that instantiated a data loader multiple times per request instead of once, which prevented it from batching database queries. The resulting surge in database connections exceeded our database's connection limit and led to temporary service disruption. The system self-healed, and we have since prepared a permanent fix to prevent a recurrence.
The outage was caused by a recently deployed code change in our Segmentation service. The change instantiated multiple DataLoaders rather than reusing a single instance per request. A DataLoader is designed to batch multiple database requests into a single, more efficient query, but it can only batch requests made through the same instance. Because the change created several instances per request, requests that should have been combined ran as separate queries, which opened significantly more database connections and quickly exhausted our database's connection limit. When the limit was reached, new connection requests were denied and all service containers terminated, resulting in the site outages. Our load balancer replaced the failed services and the system self-healed.
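For readers who want to see the mechanism, the following is a minimal sketch of why reusing one loader per request matters. It is not the Segmentation service's code: `TinyLoader` is a hypothetical, simplified stand-in for a DataLoader-style batcher, and `countingBatchFn`, `countQueriesWithLoaderPerCall`, and `countQueriesWithSharedLoader` are illustrative names. The batch function only counts simulated database round trips rather than talking to a real database.

```typescript
// Hypothetical stand-in for a DataLoader-style batcher (an assumption for
// illustration, not the real library and not Equiem's implementation).
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (err: unknown) => void;
}

class TinyLoader<K, V> {
  private queue: Pending<K, V>[] = [];

  constructor(private readonly batchFn: BatchFn<K, V>) {}

  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first load queued on this instance schedules a single flush.
      // Loads queued on the same instance before that flush runs share it.
      if (this.queue.length === 1) {
        void Promise.resolve().then(() => this.flush());
      }
    });
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    try {
      // One call to the batch function covers every key in the batch.
      const values = await this.batchFn(batch.map((item) => item.key));
      batch.forEach((item, i) => item.resolve(values[i]));
    } catch (err) {
      batch.forEach((item) => item.reject(err));
    }
  }
}

// Builds a batch function that counts how many simulated queries it serves.
function countingBatchFn(): { batchFn: BatchFn<number, string>; queries: () => number } {
  let count = 0;
  const batchFn: BatchFn<number, string> = async (ids) => {
    count += 1; // each call represents one database round trip
    return ids.map((id) => `segment-${id}`);
  };
  return { batchFn, queries: () => count };
}

// Pattern described in the incident: a new loader for every lookup,
// so each lookup becomes its own query.
async function countQueriesWithLoaderPerCall(ids: number[]): Promise<number> {
  const { batchFn, queries } = countingBatchFn();
  await Promise.all(ids.map((id) => new TinyLoader(batchFn).load(id)));
  return queries();
}

// Intended pattern: one loader per request, shared by every lookup.
async function countQueriesWithSharedLoader(ids: number[]): Promise<number> {
  const { batchFn, queries } = countingBatchFn();
  const loader = new TinyLoader(batchFn);
  await Promise.all(ids.map((id) => loader.load(id)));
  return queries();
}
```

In this sketch, three lookups made through a new loader each time result in three separate queries, while the same three lookups made through one shared loader result in a single batched query. At production request volumes, that difference multiplies into the kind of connection surge described above.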
We sincerely apologise for any disruption this caused. If you have any questions or concerns, please don't hesitate to reach out to our support team at support@getequiem.com.