Tuesday, Sep 23, 2008

NLA eClips Service Incident - Report

Problem:

At approximately 9:20 yesterday, an incident occurred that affected eClips service delivery. The incident was resolved at 10:15. While it lasted, the service was intermittently unavailable, which affected clients' ability to view eClips content.

Cause:

The root cause of the incident is still under investigation by NLA engineers. However, indications are that, when attempting to serve a high number of multi-object requests containing more than the permitted 100 objects, the eClips web application spawned unnecessary additional connections to the eClips database, which in turn degraded performance. Yesterday, we experienced a higher-than-normal number of large multi-object requests.
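
As an illustration only, the permitted limit of 100 objects could be enforced before any database work begins, so that an oversized request is rejected rather than served. This is a minimal sketch, not the eClips code: the function name, the exception class, and the idea of rejecting the request outright are all assumptions made for the example.

```python
# Hypothetical sketch: names and behaviour are illustrative, not taken from eClips.
MAX_OBJECTS_PER_REQUEST = 100  # the permitted limit stated in this report


class TooManyObjectsError(ValueError):
    """Raised when a multi-object request exceeds the permitted size."""


def validate_multi_object_request(object_ids):
    """Return the requested ids as a list, or raise before any database work."""
    ids = list(object_ids)
    if len(ids) > MAX_OBJECTS_PER_REQUEST:
        raise TooManyObjectsError(
            f"request contains {len(ids)} objects; "
            f"limit is {MAX_OBJECTS_PER_REQUEST}"
        )
    return ids
```

Checking the size first means an oversized request costs one length comparison rather than any database connections.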

Solution:

Once NLA engineers understood the cause of the incident, the service was restarted and returned to normal operation immediately. NLA engineers are now putting in place further monitoring, which will provide earlier warning of a potential recurrence of this type of incident. The NLA is also investigating the eClips core code related to this aspect of the service, with the aim of discovering the root cause and redeveloping the code to prevent a recurrence.
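
To give a sense of what an earlier warning might look like, the sketch below classifies the number of open database connections against two thresholds. It is an assumption-laden illustration: the function name, the threshold parameters, and the choice of connection count as the monitored signal are hypothetical and do not describe the monitoring NLA engineers are deploying.

```python
# Hypothetical sketch: the thresholds and the monitored signal are assumptions.
def connection_alert_level(open_connections, warn_at, critical_at):
    """Classify a database connection count as 'ok', 'warning', or 'critical'."""
    if warn_at > critical_at:
        raise ValueError("warn_at must not exceed critical_at")
    if open_connections >= critical_at:
        return "critical"
    if open_connections >= warn_at:
        return "warning"
    return "ok"
```

A "warning" level set well below the point at which performance degrades would alert engineers before clients see the service become unavailable.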

Separately, the NLA engineering team is preparing to deploy a new database architecture that will be more resilient and scalable. This should also help prevent such an incident from recurring.