Friday, Mar 16, 2007

XML delivery delay

We had further production problems last night, caused by a failure in the software that creates the XML feeds. As a result, delivery of XML feeds did not start until shortly after 3am.

The failure was triggered by the increased volumes following the addition of new regionals. Our testing over the previous week had been successful, but last night's data volumes exposed a design fault, which has now been rectified.

We are acutely aware that recent service performance has been poor and are taking all steps we can to revert to our normal standards.

Regards

Andrew Hughes

MD, Digital

IT Report

Issue: DBCTRL was down between 00:01 and 03:10 for XML clients only.

Customer Impact: YES

Cause: Unhandled timeout operation in DBCTRL program

Engineers: DC.

Outline.

DBCTRL was failing to distribute client XML. The error log showed the message "DBOut GetObjectsToDistribute Timeout expired"; DBCTRL showed no other errors. The SQL server was checked and found to be operating normally. Another Delphi error indicated a possible sharing violation, so CIFS was cycled on FAS-B to force any ownership conflicts to drop. However, the error persisted.

We rolled back the version of DBCTRL that was released today. However, the old version also reproduced the error. Further debugging with the tools we have showed nothing untoward happening on any of our systems. All NLA code was found to be operating normally, and all systems were nominal.

Looking at the OBJDISTREADY table (which logs object IDs for distribution against PCA org IDs), I found that it held some 71000 rows. This is an abnormally high figure, so I took the following course of action:

Stop DBCTRL
Remove all INBOUND feed lines from DBCTRL (to stop new data being loaded into the DB)
Create Table DCTEMP
Move 71000 rows from OBJDISTREADY to DCTEMP
Truncate Table OBJDISTREADY
Move rows with ORG_ID = 34 back into OBJDISTREADY (4000 rows)
Start DBCTRL

At this point, the DBOUT error disappeared and all objects for ORG_ID 34 were distributed.
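
The table steps above can be sketched roughly as follows. This is an illustration only, not the statements that were actually run. It uses an in-memory SQLite database as a stand-in for the production SQL server. The two-column layout (OBJ_ID, ORG_ID) and the function name quarantine_backlog are assumptions made for the example. SQLite has no TRUNCATE TABLE, so an unqualified DELETE is used in its place.

```python
import sqlite3


def quarantine_backlog(conn, first_org_id):
    """Park the whole backlog in DCTEMP, then release one org's rows."""
    cur = conn.cursor()
    # Copy every pending row aside (the "Create Table DCTEMP" and "Move" steps).
    cur.execute("CREATE TABLE DCTEMP AS SELECT * FROM OBJDISTREADY")
    # Empty the live table. SQLite lacks TRUNCATE TABLE, so DELETE stands in.
    cur.execute("DELETE FROM OBJDISTREADY")
    # Move a single org's rows back so the distributor sees a small workload.
    cur.execute(
        "INSERT INTO OBJDISTREADY SELECT * FROM DCTEMP WHERE ORG_ID = ?",
        (first_org_id,),
    )
    cur.execute("DELETE FROM DCTEMP WHERE ORG_ID = ?", (first_org_id,))
    conn.commit()


# Toy data: three orgs with five pending objects each (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OBJDISTREADY (OBJ_ID INTEGER, ORG_ID INTEGER)")
rows = [(obj_id, 34 + obj_id // 5) for obj_id in range(15)]
conn.executemany("INSERT INTO OBJDISTREADY VALUES (?, ?)", rows)

quarantine_backlog(conn, 34)

ready = conn.execute("SELECT COUNT(*) FROM OBJDISTREADY").fetchone()[0]
parked = conn.execute("SELECT COUNT(*) FROM DCTEMP").fetchone()[0]
print(ready, parked)
```

In this toy run, the five rows for org 34 end up in OBJDISTREADY and the other ten stay parked in DCTEMP.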

I then worked sequentially down the list of ORG_IDs in the table DCTEMP (select distinct org_id ..... ) and manually pumped each ORG_ID's rows into the OBJDISTREADY table. As each one distributed successfully, I loaded the next candidate ORG_ID. All files were successfully distributed.

I then stopped DBCTRL again, reinstated the inbound client feeds, and started it up. There were no further errors: inbound XML was being loaded and the appropriate outbound XML was being distributed.
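
The org-by-org release described above was done by hand. As a rough sketch, it could be scripted along the following lines. The sketch again uses SQLite in place of the production SQL server and an assumed (OBJ_ID, ORG_ID) layout. The names release_org_by_org and fake_distribute are hypothetical; fake_distribute simply stands in for DBCTRL picking up and sending an org's rows.

```python
import sqlite3


def release_org_by_org(conn, distribute):
    """Feed DCTEMP back into OBJDISTREADY one ORG_ID at a time."""
    org_ids = [
        row[0]
        for row in conn.execute("SELECT DISTINCT ORG_ID FROM DCTEMP ORDER BY ORG_ID")
    ]
    released = []
    for org_id in org_ids:
        conn.execute(
            "INSERT INTO OBJDISTREADY SELECT * FROM DCTEMP WHERE ORG_ID = ?",
            (org_id,),
        )
        conn.execute("DELETE FROM DCTEMP WHERE ORG_ID = ?", (org_id,))
        conn.commit()
        # Load the next org only once this one has distributed cleanly.
        if not distribute(conn, org_id):
            break
        released.append(org_id)
    return released


def fake_distribute(conn, org_id):
    """Hypothetical stand-in for DBCTRL: drain the org's rows, report success."""
    conn.execute("DELETE FROM OBJDISTREADY WHERE ORG_ID = ?", (org_id,))
    conn.commit()
    return True


# Toy data: three parked orgs with four objects each (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OBJDISTREADY (OBJ_ID INTEGER, ORG_ID INTEGER)")
conn.execute("CREATE TABLE DCTEMP (OBJ_ID INTEGER, ORG_ID INTEGER)")
conn.executemany(
    "INSERT INTO DCTEMP VALUES (?, ?)",
    [(obj_id, 35 + obj_id // 4) for obj_id in range(12)],
)
conn.commit()

released = release_org_by_org(conn, fake_distribute)
remaining = conn.execute("SELECT COUNT(*) FROM DCTEMP").fetchone()[0]
print(released, remaining)
```

Stopping at the first org that fails to distribute keeps the remaining rows parked in DCTEMP, which is the property the manual procedure relied on to isolate a problem org.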