Saturday, Oct 18, 2008

NLA eClips Service Incident - Report

Problem:

Some eClips customers had only intermittent access to NLA web and FTP services between 7:35am and 9:00am, and again between 9:19am and 9:33am, on Saturday 18th October 2008.

 

Cause:

The owners of the NLA's London hosting facility were carrying out the first phase of a planned annual power-down exercise on Saturday 18th October. This involved disabling one of the two power feeds that supply the NLA infrastructure. The NLA's infrastructure can usually tolerate the removal of one power feed, as it has a dual-fed, clustered architecture. In this instance, the automated failover of one clustered network component did not complete successfully.

 

Solution:

Engineers intervened manually to complete the failover of the affected network component and confirmed that it had succeeded. They also made some configuration changes to the cluster, which should reduce the risk of a similar event occurring in future.