[dcmf] [PATCH 1/1] Issue 4362: Honor the romio_ds_write and romio_ds_read hints.
Robert Latham
robl at mcs.anl.gov
Fri Feb 8 16:22:17 CST 2008
On Fri, Feb 08, 2008 at 01:56:43PM -0600, Bob Cernohous wrote:
> In this case ... are we talking about modifying BGL open()? Or a higher
> level? Isn't a full solution really to build ad_pvfs and select ad_pvfs
> vs ad_bgl at some higher level (with prefixes/hints/etc or statfs)? I
> don't think we want ad_bgl trying to statfs() and select different file
> system behaviors at our level. Show us a patch and we can comment
> better.
Thanks for the romio_ds_write addition, Bob.
No worries about the PVFS comment. I did have a few questions, though.
Do you know why ad_bgl is based off of ad_nfs and not ad_ufs? The
aggressive locking in ad_nfs is an attempt to flush client-side
caches, but shouldn't be needed for other file systems.
ad_pvfs2 as it exists in ROMIO won't work on BlueGene -- we've tried.
The BG compute node kernels lack a couple important system calls
needed for the PVFS client-side libraries.
PVFS has thus far silently (and incorrectly) let any file locks from
ROMIO succeed. This is the real crux of issue 4362 -- multiple I/O
nodes think they have exclusive write locks on the same region of the
file, resulting in corrupted data. We can bypass the locks in this
situation with a hint, but there could be others we haven't found (or
diagnosed) yet.
The proper long term fix for PVFS is to make sure we have a path
through the MPI-IO library that never attempts to lock the file.
- Start with ADIOI_GEN_WriteStrided
- check fs type (statfs)
- check access type (noncontig/contig in mem/file)
  - noncontig in mem, noncontig in file: naive access
  - contig in mem, noncontig in file: naive access
  - (optimization) noncontig in mem, contig in file: use a
    write-combining approach like that in ad_pvfs/ad_pvfs_write.c.
    Buffer each noncontiguous memory region into a larger contiguous
    region.
  - contig in mem, contig in file: already handled
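
The access-type dispatch and the write-combining case in the list above can be sketched roughly as follows. This is only an illustration in plain POSIX C, not ROMIO code: `struct mem_region`, `enum write_path`, `choose_write_path`, and `write_combined` are hypothetical names, the staging-buffer size is an arbitrary parameter, and the statfs check is left out. The point is that no step takes a file lock.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose pwrite() under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical: one piece of a noncontiguous memory buffer. */
struct mem_region {
    const void *base;
    size_t len;
};

/* Hypothetical labels for the cases in the list above. */
enum write_path {
    PATH_CONTIG_WRITE,   /* contig in mem, contig in file: already handled */
    PATH_WRITE_COMBINE,  /* noncontig in mem, contig in file */
    PATH_NAIVE           /* noncontig in file: one write per piece */
};

enum write_path choose_write_path(int mem_is_contig, int file_is_contig)
{
    if (!file_is_contig)
        return PATH_NAIVE;
    return mem_is_contig ? PATH_CONTIG_WRITE : PATH_WRITE_COMBINE;
}

/* Write len bytes at off, retrying on short writes. Returns 0 or -1. */
static int pwrite_all(int fd, const char *buf, size_t len, off_t off)
{
    while (len > 0) {
        ssize_t n = pwrite(fd, buf, len, off);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
        off += n;
    }
    return 0;
}

/*
 * Write-combining: copy each memory region into a staging buffer and
 * issue one contiguous pwrite() whenever the buffer fills, plus a final
 * one for the remainder. Returns total bytes written, or -1 on error.
 */
ssize_t write_combined(int fd, off_t offset,
                       const struct mem_region *regions, size_t nregions,
                       size_t bufsize)
{
    assert(bufsize > 0);
    char *buf = malloc(bufsize);
    if (buf == NULL)
        return -1;

    size_t used = 0;     /* bytes currently staged in buf */
    ssize_t total = 0;   /* bytes already written to the file */

    for (size_t i = 0; i < nregions; i++) {
        const char *src = regions[i].base;
        size_t left = regions[i].len;
        while (left > 0) {
            size_t n = bufsize - used;
            if (n > left)
                n = left;
            memcpy(buf + used, src, n);
            used += n;
            src += n;
            left -= n;
            if (used == bufsize) {   /* buffer full: flush it */
                if (pwrite_all(fd, buf, used, offset + total) < 0) {
                    free(buf);
                    return -1;
                }
                total += (ssize_t)used;
                used = 0;
            }
        }
    }
    if (used > 0) {                  /* flush the partial last buffer */
        if (pwrite_all(fd, buf, used, offset + total) < 0) {
            free(buf);
            return -1;
        }
        total += (ssize_t)used;
    }
    free(buf);
    return total;
}
```

With three regions holding "abc", "de", and "fghij" and a 4-byte staging buffer, the sketch issues three contiguous writes ("abcd", "efgh", "ij") instead of one write per region.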
Thanks
==rob
--
Rob Latham
Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA B29D F333 664A 4280 315B