Tuesday
May 20, 2008

NLA eClips Service Incident: REPORT

 

Problem:

The eClips Service recently experienced two incidents that degraded the performance of the service. The first occurred on Friday 16th May and the second on Monday 19th May. Both occurred during the early morning, which is the peak time for usage of the eClips service.

Cause:

After further investigation and testing, NLA engineers have discovered that these incidents were caused by a recent service release which was deployed at 19:00 Thursday 15th as part of a service enhancement. 

Unfortunately, service testing before the release of this enhancement did not uncover a performance issue. The issue became apparent only when the eClips service experienced peak traffic load on Friday 16th.

Following the first incident on the 16th, NLA engineers were unable to determine that the service enhancement had caused the incident. However, when a similar incident occurred at a similar time of day on the 19th, additional information was obtained that pointed to the service release as the most likely root cause of both incidents.

Solution:

As soon as the service release was determined to be the cause of the incidents, NLA engineers 'rolled back' the changes (Monday 19th at approximately 9:30). Once the change was removed from the service, normal operation returned, and the issue has not occurred again.

NLA service release testing processes have been reviewed following these incidents, and new procedures are now in place to ensure that testing more accurately reflects the level of load that releases must cope with in the live environment.

The release of this most recent service enhancement has now been delayed until further testing and redevelopment can occur.