[intrepid-notify] Intrepid / Eureka System Status
William Scullin
wscullin at alcf.anl.gov
Mon May 24 20:38:06 CDT 2010
Dear users,
At around 6:45 PM, our site chiller plant went down and temperatures
in our facility's machine room rose. In response, at around 7:30 PM,
an emergency control script killed all running jobs,
spun down drives, and shut down storage controllers. All running jobs
and file system connectivity from all logins have been lost. The
chiller plant is back online and temperatures are coming down. We are
monitoring the situation, and once the temperatures are back under control
we will start bringing everything back up. We expect to be back in
production before midnight US Central Time. We apologize for any
inconvenience this disruption in service may have caused you. Please contact
support at alcf.anl.gov with any questions, comments, or concerns.
Thank you,
The ALCF Blue Gene Support Team