[notify] 4/7/08 maintenance work complete, work continues on transaction log full issue

Tisha Stacey tstacey at alcf.anl.gov
Mon Apr 7 18:22:29 CDT 2008


Monday maintenance:
     Today's Monday maintenance period is complete, and Intrepid has
     been returned to service.

Ongoing transaction log full issue:
     We are continuing to work on the transaction log full issue.

     Description:  We are experiencing intermittent transaction log full
                   failures with the BG/P database.  If a transaction log
                   full condition coincides with the start or end of a
                   job, the job will fail with an error similar to the
                   one shown below (the error will appear in the job's
                   stderr file).

                   --CLI ERROR--------------
                   cliRC = -1
                   line  = 277
                   file  = TxObject.cc

                   SQLSTATE          = 57011
                   Native Error Code = -964
                   [IBM][CLI Driver][DB2/LINUXPPC] SQL0964C  The
                   transaction log for the database is full.
                   SQLSTATE=57011
                   -------------------------

     Impact:       If the failure occurs at the start of a job, you will
                   need to resubmit the job.  If it occurs at the end of
                   a job, the job should complete properly, but the
                   updates to the status database will fail.  When this
                   happens, the partition used by the job will not be
                   freed.  Please send email to support at alcf.anl.gov
                   whenever you see a transaction log full error in your
                   job's stderr file.  The administrators may need to
                   take action to clean up the partition.
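
     Checking:     If it is useful, a job's stderr file can be checked for
                   this error with a short script.  The following is only
                   a minimal sketch: the function names are hypothetical,
                   and it assumes the error text matches the sample shown
                   above (message ID SQL0964C or SQLSTATE 57011).

```python
import re

# Markers taken from the sample error in this notice. Assumption: the real
# error text in a job's stderr file uses the same message ID and SQLSTATE.
TXLOG_FULL_PATTERN = re.compile(r"SQL0964C|SQLSTATE\s*=\s*57011")


def has_txlog_full_error(text):
    """Return True if the text contains a transaction log full marker."""
    return TXLOG_FULL_PATTERN.search(text) is not None


def find_affected_files(paths):
    """Return the subset of stderr file paths that contain the error."""
    affected = []
    for path in paths:
        # errors="replace" keeps the scan going if the file has stray bytes.
        with open(path, errors="replace") as handle:
            if has_txlog_full_error(handle.read()):
                affected.append(path)
    return affected
```

                   Calling find_affected_files() with a list of stderr
                   file paths returns the files that should be reported
                   to support at alcf.anl.gov.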

If you have any questions, please feel free to contact us at
support at alcf.anl.gov.

- ALCF Support Team


