[intrepid-notify] ALCF intrepid-fs0 File System Update

Michael E. Papka papka at anl.gov
Wed Dec 5 18:27:08 CST 2012

Dear ALCF Community:

First, I want to say that I am deeply sorry that our system issues continue to impact your science. I can assure you that ALCF is taking this matter very seriously, and I’d like to bring you up to date on the steps we are taking to address it.

While adding storage to Intrepid, we discovered that specific files were corrupted (cross-linked). Analysis so far has confirmed that between 30 and 105 files were affected, and a portion of these files may still be recoverable.

We have been running a series of file system checks (fsck). Doing this on a 5PB file system with 360 million files is non-trivial. We will do at least one more fsck, which we know will take at least one week to complete. If this timetable changes, you will be notified immediately.

We realize that for some of you the 2012 INCITE program is coming to a close. The ALCF staff is working on measures to ensure access to your data and to accommodate the balance of your allocation.

I appreciate your patience. ALCF will continue to send you updates every Monday, Wednesday, and Friday until this issue is resolved.

If you have any questions, please don't hesitate to contact me, your Catalyst, or the ALCF help desk (support at alcf.anl.gov).


Michael E. Papka
Division Director
Argonne Leadership Computing Facility
Argonne National Laboratory
papka at anl.gov
