[intrepid-notify] Intrepid / Eureka System Status

William Scullin wscullin at alcf.anl.gov
Mon May 24 20:38:06 CDT 2010


Dear users,

Around 6:45 PM our site chiller plant went down and temperatures in
our facility's machine room went up. In response, around 7:30 PM, an
emergency control script killed all running jobs, spun down drives,
and shut down storage controllers. All running jobs and file system
connectivity from all logins have been lost. The
chiller plant is back online and temperatures are coming down. We are
monitoring the situation, and once the temperatures are back under
control we will start bringing everything back up. We expect to be back
in production before midnight US Central Time. We apologize for the inconvenience
this disruption in service may have caused you. Please contact
support at alcf.anl.gov with any questions, comments, or concerns.

Thank you,

The ALCF Blue Gene Support Team
