[dcmf] [PATCH 2/3] Use naive strided routines for af_ufs.
Robert Latham
robl at mcs.anl.gov
Fri Feb 15 12:48:50 CST 2008
On Fri, Feb 15, 2008 at 12:30:04PM -0600, Bob Cernohous wrote:
> diff --git a/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_hints.c b/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_hints.c
> index 144e722..aa933a3 100644
> --- a/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_hints.c
> +++ b/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_hints.c
> @@ -101,13 +101,24 @@ void ADIOI_BGL_SetInfo(ADIO_File fd, MPI_Info users_info, int *error_code)
> MPI_Info_set(info, "ind_wr_buffer_size", ADIOI_BGL_IND_WR_BUFFER_SIZE_DFLT);
> fd->hints->ind_wr_buffer_size = atoi(ADIOI_BGL_IND_WR_BUFFER_SIZE_DFLT);
>
> - /* default is to let romio automatically decide when to use data
> - * sieving
> - */
> - MPI_Info_set(info, "romio_ds_read", "automatic");
> - fd->hints->ds_read = ADIOI_HINT_AUTO;
> - MPI_Info_set(info, "romio_ds_write", "automatic");
> - fd->hints->ds_write = ADIOI_HINT_AUTO;
> + if(fd->file_system == ADIO_UFS)
> + {
> + /* default for ufs/pvfs is to disable data sieving */
> + MPI_Info_set(info, "romio_ds_read", "disable");
> + fd->hints->ds_read = ADIOI_HINT_DISABLE;
> + MPI_Info_set(info, "romio_ds_write", "disable");
> + fd->hints->ds_write = ADIOI_HINT_DISABLE;
> + }
> + else
> + {
> + /* default is to let romio automatically decide when to use data
> + * sieving
> + */
> + MPI_Info_set(info, "romio_ds_read", "automatic");
> + fd->hints->ds_read = ADIOI_HINT_AUTO;
> + MPI_Info_set(info, "romio_ds_write", "automatic");
> + fd->hints->ds_write = ADIOI_HINT_AUTO;
> + }
>
> fd->hints->initialized = 1;
> }
I see what you're doing here. I like that the hints are being set in
case a caller wants to examine the state of the MPI_Info objects (it
just kills me that "bgl_nodes_pset" is invisible to end users...).

"automatic" means that ROMIO will do independent I/O if the file views
are contiguous and non-overlapping. This is a great heuristic for
Linux clusters, but I'm not so sure about Blue Gene. My gut says that
we should use collective I/O all the time so we can concentrate the
I/O on a few aggregators. Maybe that makes less sense now on BGP
with the I/O proxies.
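
To make the heuristic above concrete, here is a minimal sketch of the kind
of check involved: given each rank's first and last file offset, decide
whether any rank's range starts before the previous rank's range ends.
The helper name ranges_interleave and the array layout are assumptions
for illustration only; this is not ROMIO's actual code.

```c
#include <assert.h>

/* Hypothetical helper (not a ROMIO function).
 * start[i] and end[i] are the first and last file offsets touched by
 * rank i, listed in rank order. Returns 1 if any rank's range begins
 * before the previous rank's range ends, 0 otherwise. */
static int ranges_interleave(const long *start, const long *end, int nprocs)
{
    assert(nprocs > 0);
    for (int i = 1; i < nprocs; i++) {
        /* Skip empty ranges (start > end); they cannot interleave. */
        if (start[i] < end[i - 1] && start[i] <= end[i])
            return 1;
    }
    return 0;
}
```

Under this sketch, a result of 0 corresponds to the case where
independent I/O would be chosen, and 1 to the case where collective I/O
would be used.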
> diff --git a/lib/mpi/mpich2/src/mpi/romio/adio/ad_ufs/ad_ufs.c b/lib/mpi/mpich2/src/mpi/romio/adio/ad_ufs/ad_ufs.c
> old mode 100755
> new mode 100644
> index ce0f6a5..a13ef78
> --- a/lib/mpi/mpich2/src/mpi/romio/adio/ad_ufs/ad_ufs.c
> +++ b/lib/mpi/mpich2/src/mpi/romio/adio/ad_ufs/ad_ufs.c
> @@ -20,8 +20,8 @@ struct ADIOI_Fns_struct ADIO_UFS_operations = {
> ADIOI_GEN_SeekIndividual, /* SeekIndividual */
> ADIOI_GEN_Fcntl, /* Fcntl */
> ADIOI_BGL_SetInfo, /* SetInfo */
> - ADIOI_GEN_ReadStrided, /* ReadStrided */
> - ADIOI_NOLOCK_WriteStrided, /* WriteStrided */
> + ADIOI_GEN_ReadStrided_naive, /*ADIOI_GEN_ReadStrided, * ReadStrided */
> + ADIOI_GEN_WriteStrided_naive, /*ADIOI_NOLOCK_WriteStrided, * WriteStrided */
> ADIOI_BGL_Close, /* Close */
> #ifdef ROMIO_HAVE_WORKING_AIO
> ADIOI_GEN_IreadContig, /* IreadContig */
NOLOCK is a little smarter than naive in the "noncontig in memory,
contig in file" case. In that situation, NOLOCK will perform write
combining and issue fewer writes.
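
As a toy illustration of the difference, the sketch below writes the same
noncontiguous memory segments to a contiguous "file" two ways: one write
per segment, and packing segments into a staging buffer that is flushed
when full. Everything here (struct toy_file, toy_write, write_naive,
write_combined) is hypothetical and exists only to count write calls; it
is not the ROMIO implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One piece of a noncontiguous memory buffer. */
struct mem_seg { const char *base; size_t len; };

/* Toy stand-in for a file: a byte array plus a count of write calls. */
struct toy_file { char data[256]; size_t pos; int nwrites; };

/* Append len bytes at the current position and count the call. */
static void toy_write(struct toy_file *f, const char *buf, size_t len)
{
    assert(f->pos + len <= sizeof f->data);
    memcpy(f->data + f->pos, buf, len);
    f->pos += len;
    f->nwrites++;
}

/* Naive approach: one write call per memory segment. */
static void write_naive(struct toy_file *f, const struct mem_seg *segs, int n)
{
    for (int i = 0; i < n; i++)
        toy_write(f, segs[i].base, segs[i].len);
}

/* Write combining: pack segments into a staging buffer of cap bytes
 * and issue a write only when the buffer fills or the input ends. */
static void write_combined(struct toy_file *f, const struct mem_seg *segs,
                           int n, char *stage, size_t cap)
{
    size_t used = 0;
    assert(cap > 0);
    for (int i = 0; i < n; i++) {
        const char *src = segs[i].base;
        size_t left = segs[i].len;
        while (left > 0) {
            size_t room = cap - used;
            size_t chunk = left < room ? left : room;
            memcpy(stage + used, src, chunk);
            used += chunk;
            src += chunk;
            left -= chunk;
            if (used == cap) {          /* buffer full: flush it */
                toy_write(f, stage, used);
                used = 0;
            }
        }
    }
    if (used > 0)                       /* flush the remainder */
        toy_write(f, stage, used);
}
```

With three 4-byte segments and an 8-byte staging buffer, the naive path
issues three writes and the combining path issues two, and both leave
identical bytes in the file.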
GEN_ReadStrided is safe for PVFS. We're just reading additional data,
not trying to do an atomic read-modify-write.
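
A rough sketch of why the read side is harmless: a data-sieving read
fetches one contiguous extent that covers all requested pieces,
including the holes between them, and then copies out only the wanted
bytes. Nothing is written back, so no lock is needed. The names
struct piece and sieve_read are made up for this example, and it assumes
the whole extent fits in the scratch buffer, which real code would not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One requested region of the file: byte offset and length. */
struct piece { size_t off; size_t len; };

/* Hypothetical sieving read over an in-memory "file".
 * Does a single large copy of the covering extent into scratch, then
 * picks the requested pieces out of it into out, back to back.
 * Returns the number of useful bytes placed in out. */
static size_t sieve_read(const char *file, size_t file_len,
                         const struct piece *p, int n,
                         char *out, char *scratch, size_t scratch_len)
{
    assert(n > 0);
    size_t lo = p[0].off;
    size_t hi = p[0].off + p[0].len;
    for (int i = 1; i < n; i++) {
        if (p[i].off < lo)
            lo = p[i].off;
        if (p[i].off + p[i].len > hi)
            hi = p[i].off + p[i].len;
    }
    assert(hi <= file_len);
    assert(hi - lo <= scratch_len);

    /* The one "file read": extra bytes in the holes are simply ignored. */
    memcpy(scratch, file + lo, hi - lo);

    size_t written = 0;
    for (int i = 0; i < n; i++) {
        memcpy(out + written, scratch + (p[i].off - lo), p[i].len);
        written += p[i].len;
    }
    return written;
}
```

The write-side equivalent would have to read the extent, patch in the new
bytes, and write the whole extent back, which is the read-modify-write
step that needs locking.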
==rob
--
Rob Latham
Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA B29D F333 664A 4280 315B