[dcmf] Re: [PATCH] This test will repeatedly call the low-level critical-section functions for performance testing.

Pavan Balaji balaji at mcs.anl.gov
Tue Feb 5 12:23:42 CST 2008


Each request takes the lock once when it is posted with Isend or Irecv.
For the Waitall, any number of requests can be completed under a single
lock acquisition, so it's not clear how many acquisitions actually happen
in the test. For the PROC_NULL case, I'd guess all the pending requests
are completed under one lock.

  -- Pavan

On 02/05/2008 12:02 PM, Sameer Kumar wrote:
> Joe,
>       If the lock is called once per message transaction, the rate will be
> 4.7 MMPS; if it's called twice, it will be 2.3 MMPS; and four times, 1.2
> MMPS.  So we need to investigate how many times the lock is called per
> PROC_NULL message send and recv.  Maybe someone at Argonne can answer that.
> 
> 
>                   sameer.
> 
> 
> 
> 
> From: Joseph Ratterman/Rochester/IBM at IBMUS
> Sent by: dcmf-bounces at lists.anl-external.org
> Date: 02/05/2008 12:23 PM
> To: Joseph Ratterman/Rochester/IBM at IBMUS
> cc: DCMF <dcmf at lists.anl-external.org>
> Subject: [dcmf] Re: [PATCH] This test will repeatedly call the low-level critical-section functions for performance testing.
>
> Here are the results from running this test:
> 
> $ mpirun -np 1 -nofree -mode SMP ./build-tests/perf/dcmf/CS.cnk
> DCMF_THREAD_SINGLE: Called Enter/Exit 10000 times at 78.0386 cycles each.
> DCMF_THREAD_FUNNELED: Called Enter/Exit 10000 times at 78.0054 cycles each.
> DCMF_THREAD_SERIALIZED: Called Enter/Exit 10000 times at 78.0029 cycles each.
> DCMF_THREAD_MULTIPLE: Called Enter/Exit 10000 times at 180.063 cycles each.
> 
> $ mpirun -np 1 -nofree -mode DUAL ./build-tests/perf/dcmf/CS.cnk
> DCMF_THREAD_SINGLE: Called Enter/Exit 10000 times at 78.0378 cycles each.
> DCMF_THREAD_FUNNELED: Called Enter/Exit 10000 times at 78.0032 cycles each.
> DCMF_THREAD_SERIALIZED: Called Enter/Exit 10000 times at 78.0019 cycles each.
> DCMF_THREAD_MULTIPLE: Called Enter/Exit 10000 times at 196.044 cycles each.
> 
> 
> While this more than doubles the time it takes to lock/unlock, that alone
> wouldn't drop the single-process, single-thread performance from 4.47 to
> 1 MMPS.  We will look into it more after we get the benchmark.
> 
> 
> Thanks,
> Joe Ratterman
> 
> 
> 
> From: Joseph Ratterman/Rochester/IBM at IBMUS
> Date: 02/05/08 11:19 AM
> To: DCMF <dcmf at lists.anl-external.org>
> cc: Joseph Ratterman/Rochester/IBM at IBMUS
> Subject: [PATCH] This test will repeatedly call the low-level critical-section functions for performance testing.
>
> This is helpful when trying to understand performance degradations in
> MPI_THREAD_MULTIPLE.
> 
> Signed-off-by: Joe Ratterman <jratt at us.ibm.com>
> ---
> sys/tests/perf/Makefile.in            |    2 +-
> sys/tests/perf/dcmf/CS.c              |   64 +++++++++++++++++++++++++++++++++
> sys/tests/perf/{ => dcmf}/Makefile.in |    4 +-
> 3 files changed, 67 insertions(+), 3 deletions(-)
> create mode 100644 sys/tests/perf/dcmf/CS.c
> copy sys/tests/perf/{ => dcmf}/Makefile.in (96%)
> 
> diff --git a/sys/tests/perf/Makefile.in b/sys/tests/perf/Makefile.in
> index 22a7ac7..4266989 100644
> --- a/sys/tests/perf/Makefile.in
> +++ b/sys/tests/perf/Makefile.in
> @@ -12,6 +12,6 @@
> # end_generated_IBM_copyright_prolog                               #
> 
> VPATH                                  = @abs_srcdir@
> -SUBDIRS                                  = mpi spi mpid
> +SUBDIRS                                  = mpi spi mpid dcmf
> TESTS                                  =
> include @abs_top_builddir@/Make.rules
> diff --git a/sys/tests/perf/dcmf/CS.c b/sys/tests/perf/dcmf/CS.c
> new file mode 100644
> index 0000000..080f5df
> --- /dev/null
> +++ b/sys/tests/perf/dcmf/CS.c
> @@ -0,0 +1,64 @@
> +/* begin_generated_IBM_copyright_prolog                             */
> +/*                                                                  */
> +/* ---------------------------------------------------------------- */
> +/* (C)Copyright IBM Corp.  2007, 2008                               */
> +/* IBM CPL License                                                  */
> +/* ---------------------------------------------------------------- */
> +/*                                                                  */
> +/* end_generated_IBM_copyright_prolog                               */
> +/**
> + * \file perf/dcmf/CS.c
> + * \brief Test the performance of the low-level critical-section functions
> + */
> +
> +
> +#include <tests.h>
> +#define NUM 10000
> +DCMF_Configure_t config;
> +
> +
> +double time_CS(uint32_t x)
> +{
> +  uint64_t start, stop;
> +  uint32_t i;
> +
> +  start = DCMF_Timebase();
> +  for (i=0; i<x; ++i) {
> +    DCMF_CriticalSection_enter(0);
> +    DCMF_CriticalSection_exit(0);
> +  }
> +  stop  = DCMF_Timebase();
> +
> +  return (double)(stop-start)/(double)x;
> +}
> +
> +
> +#define time_run(c) time_run_long(c, #c)
> +void time_run_long(DCMF_Thread thread_level, char* thread_string)
> +{
> +  double time;
> +  DCMF_Result rc;
> +
> +  config.thread_level = thread_level;
> +  rc = DCMF_Messager_configure (&config, &config);
> +  assert(rc == DCMF_SUCCESS);
> +  assert(config.thread_level == thread_level);
> +  time = time_CS(NUM);
> +  printf("%s: Called Enter/Exit %u times at %g cycles each.\n", thread_string, NUM, time);
> +}
> +
> +
> +int main()
> +{
> +  config.interrupts = DCMF_INTERRUPTS_OFF;
> +
> +  MPI_INIT;
> +
> +  time_run(DCMF_THREAD_SINGLE);
> +  time_run(DCMF_THREAD_FUNNELED);
> +  time_run(DCMF_THREAD_SERIALIZED);
> +  time_run(DCMF_THREAD_MULTIPLE);
> +
> +  MPI_FINALIZE;
> +  return (0);
> +}
> diff --git a/sys/tests/perf/Makefile.in b/sys/tests/perf/dcmf/Makefile.in
> similarity index 96%
> copy from sys/tests/perf/Makefile.in
> copy to sys/tests/perf/dcmf/Makefile.in
> index 22a7ac7..4c474b6 100644
> --- a/sys/tests/perf/Makefile.in
> +++ b/sys/tests/perf/dcmf/Makefile.in
> @@ -12,6 +12,6 @@
> # end_generated_IBM_copyright_prolog                               #
> 
> VPATH                                  = @abs_srcdir@
> -SUBDIRS                                  = mpi spi mpid
> -TESTS                                  =
> +SUBDIRS                                  =
> +TESTS                                  = CS.c
> include @abs_top_builddir@/Make.rules
> --
> 1.5.4
> 
> _______________________________________________
> dcmf mailing list
> dcmf at lists.anl-external.org
> http://lists.anl-external.org/cgi-bin/mailman/listinfo/dcmf
> 

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji


