Overview
Signagelive Platform Outage (March 6th 2025) - Root Cause Analysis
Brief Summary
On March 6th 2025, between 11:37 GMT and 15:14 GMT, the Signagelive London instance experienced a platform outage that affected users' ability to access Signagelive. During this time, users may have encountered issues such as:
Inability to access the Signagelive User Interface
Media Players unable to connect to our server
Media Players failing to download new content
SSO (Federated Users) unable to login
Message Manager users unable to upload images and/or save feeds
Root Cause Identification
The outage was caused by background work for our AWS transition (24th March 2025). When we switched one of our APIs, we regrettably misconfigured its database connection. This misconfiguration caused errors in some replication targets, which resulted in stale data being returned.
This caused a delay in replication into Amazon Web Services (AWS), as some records had duplicate IDs.
Due to these issues, the Signagelive User Interface was taken down at approximately 12:51 GMT to give the best possible opportunity for resolution.
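As an illustration of the duplicate-ID condition described above, the following is a minimal Python sketch of a pre-replication check that flags records sharing an ID. It is not Signagelive's actual tooling: the record shape, the `id` field name, and the `find_duplicate_ids` helper are all assumptions made for the example.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping


def find_duplicate_ids(
    records: Iterable[Mapping[str, Any]], key: str = "id"
) -> Dict[Any, int]:
    """Return {id: occurrences} for every ID that appears more than once."""
    counts = Counter(record[key] for record in records)
    return {record_id: n for record_id, n in counts.items() if n > 1}


# Hypothetical batch of records waiting to be replicated.
batch = [
    {"id": 101, "name": "player-a"},
    {"id": 102, "name": "player-b"},
    {"id": 101, "name": "player-a-copy"},  # clashes with the first record
]

duplicates = find_duplicate_ids(batch)
if duplicates:
    # Stop before writing to the target rather than letting replication stall.
    print(f"Refusing to replicate: duplicate IDs {duplicates}")
```

Running a check of this kind before each batch is written would surface clashing IDs early, instead of as a replication delay.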
Corrective Actions Taken
To resolve this issue, we rolled back part of our migration to use known good replication targets and scaled up the database cluster.
14:07 GMT - Services began to be restored under monitoring.
15:13 GMT - The Signagelive User Interface and services were restored.
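The staged restoration (background queues first, then Player Connectivity, then the User Interface) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure our Development Team ran: `restore_in_stages`, the queue-depth callback, and the stage names are hypothetical, and the sketch simply waits for the queue to drain before enabling each stage.

```python
import time
from typing import Callable, List, Sequence, Tuple


def restore_in_stages(
    queue_depth: Callable[[], int],
    stages: Sequence[Tuple[str, Callable[[], None]]],
    max_depth: int = 0,
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Enable each stage in order, waiting for the queue to drain first."""
    enabled: List[str] = []
    deadline = time.monotonic() + timeout_seconds
    for name, enable in stages:
        # Hold the stage back while queued jobs are still outstanding.
        while queue_depth() > max_depth:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"queue did not drain before enabling {name}")
            sleep(poll_seconds)
        enable()
        enabled.append(name)
    return enabled


# Simulated queue depths standing in for a real metrics call (assumption).
remaining = [3, 1, 0, 0]


def fake_queue_depth() -> int:
    return remaining.pop(0) if len(remaining) > 1 else remaining[0]


order = restore_in_stages(
    fake_queue_depth,
    [
        ("player_connectivity", lambda: None),  # hypothetical enable hooks
        ("user_interface", lambda: None),
    ],
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(order)
```

Ordering the stages this way keeps user-facing traffic off the platform until the backlog has cleared, at the cost of a longer visible outage.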
Timeline of Events
The following are the key updates provided during the incident, most recent first:
Update: 7th March 06:30 GMT | All services have been restored for some time, per the status updates below. Our monitoring and customer reports indicate that everything has remained generally stable.
If you experience pages not loading in the user interface, please try clearing your internet browser's cache to see if that resolves the problem. As always, please escalate any issues encountered to our Support Team by emailing [email protected]
Thank you for your continued support, patience, and understanding. We sincerely apologise for any inconvenience caused by this service outage. |
Update: 6th March 15:24 GMT | While all services had previously been restored at 15:13 GMT, the Legacy PC Player API was still being examined.
The Legacy PC Player API has now been restored as well. |
Update: 6th March 14:58 GMT | Our background queues are almost caught up, and we've re-enabled Player Connectivity. Assuming Player Connectivity remains fine, the Development Team will re-enable the User Interface(s).
We will confirm here if that occurs within the next 30 minutes. |
Update: 6th March 14:11 GMT | We have isolated the cause of the issue and are currently bringing up background services. Once all queued jobs have been processed, we will re-enable player connectivity and the User Interface. |
Update: 6th March 13:46 GMT | We continue to investigate these issues as the highest priority. Rest assured that our team is working hard to resolve them, and we appreciate your patience and understanding. |
Update: 6th March 12:53 GMT | We have taken down the Signagelive User Interface while investigating these issues. We feel this is the best way to resolve these matters.
When attempting to gain access, users will see the following image: [image not reproduced] |
Update: 6th March 12:45 GMT | We continue to investigate these issues and would like to thank everybody for their continued patience. We are aware of the following issues:
Confirmed Issues
Unconfirmed Reports
We will continue to keep you closely updated. |
Update: 6th March 12:15 GMT | We're currently investigating known issues with the Signagelive platform; here's how this issue might look for you:
Regrettably, these issues stem from the background work for our AWS transition (more information here).
Thank you for your patience while we investigate this matter. We will update this article as soon as we have new information. |
For More Information
For information on the current system status of Signagelive, please check out our system status page. The summary of our investigation will be posted here when the issues have ended.