Incident Report - eClips missing content
Problem:
Several pages were missing from the eClips feed for publication date 2nd May 2013. The problem was first noticed at around 1am, but because fewer than ten pages were missing at the time, it was not treated as a wider issue. Only much later did the full impact become apparent, when more pages were found to be missing. An incident was declared at 9:15am and the problem was resolved by 10:30am. The missing content was then reprocessed and loaded into the eClips database, with all content completed by 4pm.
Cause:
A planned change took place yesterday to enable processing of new content. Unfortunately, one of the configuration files in the change erroneously enabled a dormant part of the NLA infrastructure. Although the change passed through a quality checking and approval process, this subtle error went unnoticed. When the dormant system became active, it moved some content out of the production workflow, resulting in severe processing delays. The error was corrected and the content was returned to the production workflow.
Solution:
To prevent a similar incident in future, the change deployment process has been modified to include specific checks for this type of condition. Additional automated monitoring has been put in place to identify this and similar types of unexpected behaviour. Finally, the dormant parts of the system in question will be removed entirely next month.