[dcmf] [PATCH 1/1] Issue 4362: Honor the romio_ds_write and romio_ds_read hints.

Joseph Ratterman jratt at us.ibm.com
Fri Feb 8 13:57:40 CST 2008


Rob,

This patch can be applied to the git repository on the wiki if you would 
like to begin using the fix now.  It is also in our internal BGP Release 2 
repository, and it will be part of that release.

Thanks,
Joe Ratterman
jratt at us.ibm.com





Rob Ross <rross at mcs.anl.gov> 
Sent by: dcmf-bounces at lists.anl-external.org
02/08/08 01:02 PM

To
Bob Cernohous/Rochester/IBM at IBMUS
cc
DCMF <dcmf at lists.anl-external.org>
Subject
Re: [dcmf] [PATCH 1/1] Issue 4362: Honor the romio_ds_write and romio_ds_read hints.






Thanks Bob!

I'm still catching up, so I have some possibly stupid questions.
- When you make a change such as the one below, how/when does it 
appear on our BG/P?
- How do we know that the patch is present (other than checking new 
behavior)?

To make things operate correctly on PVFS without the user needing to 
specify a hint, we would like to use the results of statfs() at open 
time to detect the PVFS volume. I think perhaps the minimum impact 
change would be to set romio_ds_write to disable at open time when a 
PVFS file system is detected. If we created a patch to accomplish 
this, would your team be willing to integrate it into BG/P releases?
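
A minimal sketch of the detection step described above, assuming a Linux
statfs() and assuming the PVFS2 superblock magic is 0x20030528 (check this
against the headers on the target system). The helper names are
hypothetical and are not part of ROMIO:

```c
#include <stdio.h>
#include <sys/vfs.h>   /* statfs(), Linux-specific */

/* Assumed value of the PVFS2 superblock magic number; verify it against
 * the headers on the target system before relying on it. */
#define SKETCH_PVFS2_SUPER_MAGIC 0x20030528L

/* Hypothetical helper: returns 1 if f_type matches the assumed PVFS2 magic. */
static int sketch_is_pvfs_type(long f_type)
{
    return f_type == SKETCH_PVFS2_SUPER_MAGIC;
}

/* Hypothetical helper: returns 1 if path is on PVFS, 0 if it is not,
 * and -1 if statfs() itself fails (for example, the path does not exist). */
static int sketch_path_is_pvfs(const char *path)
{
    struct statfs fsbuf;

    if (statfs(path, &fsbuf) != 0)
        return -1;
    return sketch_is_pvfs_type((long) fsbuf.f_type);
}
```

An open routine could call something like sketch_path_is_pvfs() and, on a
result of 1, set the data sieving write hint to disable. For a file that
does not exist yet, the check would presumably have to target the parent
directory instead.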

In the long run, we (ANL) would really like to eliminate all the
lock/unlock calls in the PVFS case; they're unnecessary tree traffic that
can be avoided. Actually many of the locks aren't needed by GPFS file
systems either, so there's room for optimization on that side as well.
But that can be a separate discussion.

Regards,

Rob

On Feb 8, 2008, at 12:46 PM, Bob Cernohous wrote:

> We received this request:
>
> The ROMIO implementation used on BG/P performs some optimization using
> data sieving for noncontiguous individual I/O. This is done using
> read-modify-write operations that rely on file locking. This fails for
> PVFS because PVFS is lockless. The easiest solution would be to make
> ADIOI_BGL_WriteStrided honor the romio_ds_write hint and fall back to
> ADIOI_GEN_WriteStrided_naive.
>
> We now honor those hints and fall back to the GEN naive write/read
> routines.
>
> It has only been tested insofar as it does run the GEN code when
> requested. We have not tested, and will not attempt to test, PVFS.
>
> Signed-off-by: Bob Cernohous <bobc at us.ibm.com>
> ---
> .../mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_read.c |   19 ++++++++++++++++++-
> .../src/mpi/romio/adio/ad_bgl/ad_bgl_write.c       |   19 ++++++++++++++++++-
> 2 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_read.c b/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_read.c
> index a3aeffb..41947c9 100644
> --- a/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_read.c
> +++ b/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_read.c
> @@ -29,10 +29,10 @@ void ADIOI_BGL_ReadContig(ADIO_File fd, void *buf, int count,
>     int err=-1, datatype_size, len;
>     static char myname[] = "ADIOI_BGL_READCONTIG";
>
> +#if BGL_PROFILE
>                                /* timing */
>                                double io_time, io_time2;
>
> -#if BGL_PROFILE
>                                if (bglmpio_timing) {
>                                    io_time = MPI_Wtime();
>                                    bglmpio_prof_cr[ BGLMPIO_CIO_DATA_SIZE ] += len;
> @@ -181,6 +181,23 @@ void ADIOI_BGL_ReadStrided(ADIO_File fd, void *buf, int count,
>
>     static char myname[] = "ADIOI_BGL_READSTRIDED";
>
> +    if (fd->hints->ds_read == ADIOI_HINT_DISABLE) {
> +        /* if user has disabled data sieving on reads, use naive
> +         * approach instead.
> +         */
> +        /*FPRINTF(stderr, "ADIOI_GEN_ReadStrided_naive(%d):\n", __LINE__);*/
> +        ADIOI_GEN_ReadStrided_naive(fd,
> +                                    buf,
> +                                    count,
> +                                    datatype,
> +                                    file_ptr_type,
> +                                    offset,
> +                                    status,
> +                                    error_code);
> +        return;
> +    }
> +    /*FPRINTF(stderr, "%s(%d):\n",myname, __LINE__);*/
> +
>     ADIOI_Datatype_iscontig(datatype, &buftype_is_contig);
>     ADIOI_Datatype_iscontig(fd->filetype, &filetype_is_contig);
>
> diff --git a/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_write.c b/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_write.c
> index a74d0a9..6fbdb20 100644
> --- a/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_write.c
> +++ b/lib/mpi/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_write.c
> @@ -29,10 +29,10 @@ void ADIOI_BGL_WriteContig(ADIO_File fd, void *buf, int count,
>     int err=-1, datatype_size, len;
>     static char myname[] = "ADIOI_BGL_WRITECONTIG";
>
> +#if BGL_PROFILE
>                                /* timing */
>                                double io_time, io_time2;
>
> -#if BGL_PROFILE
>                                if (bglmpio_timing) {
>                                    io_time = MPI_Wtime();
>                                    bglmpio_prof_cw[ BGLMPIO_CIO_DATA_SIZE ] += len;
> @@ -221,6 +221,23 @@ void ADIOI_BGL_WriteStrided(ADIO_File fd, void *buf, int count,
>     int new_bwr_size, new_fwr_size, err_flag=0, info_flag, max_bufsize;
>     static char myname[] = "ADIOI_BGL_WRITESTRIDED";
>
> +    if (fd->hints->ds_write == ADIOI_HINT_DISABLE) {
> +        /* if user has disabled data sieving on writes, use naive
> +         * approach instead.
> +         */
> +        /*FPRINTF(stderr, "ADIOI_GEN_WriteStrided_naive(%d):\n", __LINE__);*/
> +        ADIOI_GEN_WriteStrided_naive(fd,
> +                                     buf,
> +                                     count,
> +                                     datatype,
> +                                     file_ptr_type,
> +                                     offset,
> +                                     status,
> +                                     error_code);
> +        return;
> +    }
> +    /*FPRINTF(stderr, "%s(%d):\n",myname, __LINE__);*/
> +
>     ADIOI_Datatype_iscontig(datatype, &buftype_is_contig);
>     ADIOI_Datatype_iscontig(fd->filetype, &filetype_is_contig);
>
> -- 
> 1.5.3.7
>
> _______________________________________________
> dcmf mailing list
> dcmf at lists.anl-external.org
> http://lists.anl-external.org/cgi-bin/mailman/listinfo/dcmf
> http://dcmf.anl-external.org/wiki
>
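
The dispatch that the quoted patch adds can be modeled on its own, outside
ROMIO. The sketch below is only a stand-in: the enum and function names are
hypothetical and are not ROMIO's definitions. It assumes the hint strings
"enable", "disable", and "automatic", and treats anything else as
automatic. Only an explicit "disable" selects the naive path, mirroring the
check against ADIOI_HINT_DISABLE in the patch:

```c
#include <string.h>

/* Hypothetical stand-ins for the hint states; not ROMIO's definitions. */
enum sketch_hint { SKETCH_HINT_AUTO, SKETCH_HINT_ENABLE, SKETCH_HINT_DISABLE };

/* Which strided I/O routine would be used. */
enum sketch_path { SKETCH_PATH_SIEVING, SKETCH_PATH_NAIVE };

/* Map a hint string to a state. A missing or unrecognized value is
 * treated as automatic (an assumption made for this sketch). */
static enum sketch_hint sketch_parse_hint(const char *value)
{
    if (value == NULL)
        return SKETCH_HINT_AUTO;
    if (strcmp(value, "enable") == 0)
        return SKETCH_HINT_ENABLE;
    if (strcmp(value, "disable") == 0)
        return SKETCH_HINT_DISABLE;
    return SKETCH_HINT_AUTO;
}

/* Mirror the early return added by the patch: fall back to the naive
 * routine only when data sieving has been explicitly disabled. */
static enum sketch_path sketch_choose_strided_path(enum sketch_hint ds)
{
    return (ds == SKETCH_HINT_DISABLE) ? SKETCH_PATH_NAIVE
                                       : SKETCH_PATH_SIEVING;
}
```

With this model, a file opened with romio_ds_write set to "disable" takes
the naive path, while "enable", "automatic", or no hint at all keeps the
existing data sieving path.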
