[dcmf] [PATCH 1/1] Issue 4362: Honor the romio_ds_write and romio_ds_read hints.

Rob Ross rross at mcs.anl.gov
Fri Feb 8 13:02:03 CST 2008


Thanks Bob!

I'm still catching up, so I have some possibly stupid questions.
- When you make a change such as the one below, how/when does it  
appear on our BG/P?
- How do we know that the patch is present (other than checking new  
behavior)?

To make things operate correctly on PVFS without the user needing to  
specify a hint, we would like to use the results of statfs() at open  
time to detect the PVFS volume. I think the minimum-impact change
would be to set romio_ds_write to disable at open time when a
PVFS file system is detected. If we created a patch to accomplish
this, would your team be willing to integrate it into BG/P releases?

In the long run, we (ANL) would really like to eliminate all the
lock/unlock calls in the PVFS case; they're unnecessary tree traffic
that can be avoided. Actually many of the locks aren't needed by GPFS
file systems either, so there's room for optimization on that side as
well. But that can be a separate discussion.

Regards,

Rob

On Feb 8, 2008, at 12:46 PM, Bob Cernohous wrote:

> We received this request:
>
> The ROMIO implementation used on BG/P performs some optimization using
> data sieving for noncontiguous individual I/O. This is done using
> read-modify-write operations that rely on file locking. This fails for
> PVFS because PVFS is lockless. The easiest solution would be to make
> ADIOI_BGL_WriteStrided honor the romio_ds_write hint and fall back to
> ADIOI_GEN_WriteStrided_naive.
>
> We now honor those hints and fall back to the GEN naive write/read  
> routines.
>
> It has only been tested insofar as it does run the GEN code when
> requested. We have not tested, and will not attempt to test, PVFS.
>
> Signed-off-by: Bob Cernohous <bobc at us.ibm.com>
> ---
> .../mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_read.c |   19 ++++++++++++++++++-
> .../src/mpi/romio/adio/ad_bgl/ad_bgl_write.c       |   19 ++++++++++++++++++-
> 2 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_read.c b/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_read.c
> index a3aeffb..41947c9 100644
> --- a/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_read.c
> +++ b/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_read.c
> @@ -29,10 +29,10 @@ void ADIOI_BGL_ReadContig(ADIO_File fd, void *buf, int count,
>     int err=-1, datatype_size, len;
>     static char myname[] = "ADIOI_BGL_READCONTIG";
>
> +#if BGL_PROFILE
> 		/* timing */
> 		double io_time, io_time2;
>
> -#if BGL_PROFILE
> 		if (bglmpio_timing) {
> 		    io_time = MPI_Wtime();
> 		    bglmpio_prof_cr[ BGLMPIO_CIO_DATA_SIZE ] += len;
> @@ -181,6 +181,23 @@ void ADIOI_BGL_ReadStrided(ADIO_File fd, void *buf, int count,
>
>     static char myname[] = "ADIOI_BGL_READSTRIDED";
>
> +    if (fd->hints->ds_read == ADIOI_HINT_DISABLE) {
> +  /* if user has disabled data sieving on reads, use naive
> +	 * approach instead.
> +	 */
> +      /*FPRINTF(stderr, "ADIOI_GEN_ReadStrided_naive(%d):\n", __LINE__);*/
> +      ADIOI_GEN_ReadStrided_naive(fd,
> +				    buf,
> +				    count,
> +				    datatype,
> +				    file_ptr_type,
> +				    offset,
> +				    status,
> +				    error_code);
> +    	return;
> +    }
> +    /*FPRINTF(stderr, "%s(%d):\n",myname, __LINE__);*/
> +
>     ADIOI_Datatype_iscontig(datatype, &buftype_is_contig);
>     ADIOI_Datatype_iscontig(fd->filetype, &filetype_is_contig);
>
> diff --git a/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_write.c b/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_write.c
> index a74d0a9..6fbdb20 100644
> --- a/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_write.c
> +++ b/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_write.c
> @@ -29,10 +29,10 @@ void ADIOI_BGL_WriteContig(ADIO_File fd, void *buf, int count,
>     int err=-1, datatype_size, len;
>     static char myname[] = "ADIOI_BGL_WRITECONTIG";
>
> +#if BGL_PROFILE
> 		/* timing */
> 		double io_time, io_time2;
>
> -#if BGL_PROFILE
> 		if (bglmpio_timing) {
> 		    io_time = MPI_Wtime();
> 		    bglmpio_prof_cw[ BGLMPIO_CIO_DATA_SIZE ] += len;
> @@ -221,6 +221,23 @@ void ADIOI_BGL_WriteStrided(ADIO_File fd, void *buf, int count,
>     int new_bwr_size, new_fwr_size, err_flag=0, info_flag, max_bufsize;
>     static char myname[] = "ADIOI_BGL_WRITESTRIDED";
>
> +    if (fd->hints->ds_write == ADIOI_HINT_DISABLE) {
> +    	/* if user has disabled data sieving on writes, use naive
> +	 * approach instead.
> +	 */
> +      /*FPRINTF(stderr, "ADIOI_GEN_WriteStrided_naive(%d):\n", __LINE__);*/
> +      ADIOI_GEN_WriteStrided_naive(fd,
> +				    buf,
> +				    count,
> +				    datatype,
> +				    file_ptr_type,
> +				    offset,
> +				    status,
> +				    error_code);
> +    	return;
> +    }
> +    /*FPRINTF(stderr, "%s(%d):\n",myname, __LINE__);*/
> +
>     ADIOI_Datatype_iscontig(datatype, &buftype_is_contig);
>     ADIOI_Datatype_iscontig(fd->filetype, &filetype_is_contig);
>
> -- 
> 1.5.3.7