[Llvm-bgq-discuss] Details behind MPI wrapper for bgclang++

Hal Finkel hfinkel at anl.gov
Fri Mar 1 00:15:23 CST 2013


----- Original Message -----
> From: "Jack Poulson" <jack.poulson at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Jeff Hammond" <jhammond at alcf.anl.gov>, llvm-bgq-discuss at lists.alcf.anl.gov
> Sent: Thursday, February 28, 2013 9:42:26 PM
> Subject: Re: [Llvm-bgq-discuss] Details behind MPI wrapper for bgclang++
> 
> On Thu, Feb 28, 2013 at 7:36 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> 
> 
> 
> 
> ----- Original Message -----
> > From: "Jack Poulson" < jack.poulson at gmail.com >
> > To: "Hal Finkel" < hfinkel at anl.gov >
> > Cc: "Jeff Hammond" < jhammond at alcf.anl.gov >,
> > llvm-bgq-discuss at lists.alcf.anl.gov
> 
> 
> > Sent: Thursday, February 28, 2013 7:40:19 PM
> > Subject: Re: [Llvm-bgq-discuss] Details behind MPI wrapper for
> > bgclang++
> > 
> > On Thu, Feb 28, 2013 at 5:24 PM, Hal Finkel < hfinkel at anl.gov >
> > wrote:
> > 
> > 
> > 
> > 
> > ----- Original Message -----
> > > From: "Jack Poulson" < jack.poulson at gmail.com >
> > > To: "Hal Finkel" < hfinkel at anl.gov >
> > > Cc: "Jeff Hammond" < jhammond at alcf.anl.gov >,
> > > llvm-bgq-discuss at lists.alcf.anl.gov
> > 
> > > Sent: Thursday, February 28, 2013 6:51:08 PM
> > > Subject: Re: [Llvm-bgq-discuss] Details behind MPI wrapper for
> > > bgclang++
> > > 
> > 
> > > Thanks Hal, that helps clear things up a bit. I guess I should be
> > > a little more clear about what exactly I'm doing. Since CMake's
> > > FindMPI module seems to attempt to link to MPICH incorrectly if I
> > > directly specify the mpic++11, I have to instead manually specify
> > > its behaviour (and thankfully Jeff took care of this a month or
> > > two ago). One of my link commands generated by CMake looks like this:
> > > 
> > > 
> > > 
> > > /home/projects/llvm/bin/bgclang++ -Wall -std=c++11 -O3
> > > -stdlib=libc++
> > > -L/bgsys/drivers/ppcfloor/comm/gcc/lib
> > > -L/bgsys/drivers/ppcfloor/comm/sys/lib
> > > -L/bgsys/drivers/ppcfloor/spi/lib
> > > CMakeFiles/HypRadon-2d.dir/test/transform/HypRadon-2d.cpp.o -o
> > > bin/transform/HypRadon-2d -rdynamic libcmake-dummy-lib.a
> > > -L/soft/libraries/alcf/current/gcc/LAPACK/lib -llapack
> > > -L/soft/libraries/essl/current/essl/5.1/lib64 -lesslbg
> > > -L/soft/compilers/ibmcmp-nov2012/xlf/bg/14.1/bglib64 -lxlf90_r
> > > -L/soft/compilers/ibmcmp-nov2012/xlsmp/bg/3.1/bglib64 -lxlomp_ser
> > > -L/soft/compilers/ibmcmp-nov2012/xlmass/bg/7.3/bglib64 -lmassv
> > > -lmass -lxlopt -lxlfmath -lxl -lgfortran -lm -lpthread -ldl
> > > -Wl,--allow-multiple-definition -lcxxmpich -lmpich -lopa -lmpl -ldl
> > > -lpami -lSPI -lSPI_cnk -lpthread -lrt -lstdc++
> > > 
> > > 
> > > Upon typing this, I noticed that libc++ and libstdc++ are both
> > > being used (which I assume is bad).
> > 
> > Actually, this is okay (although this is not obvious). When you
> > specify -stdlib=libc++ the clang driver automatically rewrites
> > -lstdc++ to some system-specific set of libraries necessary to link
> > with libc++. In our case, this is -lc++ (and -lrt -lpthread -lstdc++
> > when statically linking). As a result, using -lstdc++ is fine here.
> > You can verify this by passing -v and examining the linking command
> > line.
> > 
> > I should also note that we have a *special* libc++ install which is
> > partially based on libstdc++ so that we can link statically with
> > PAMI without symbol definition conflicts. Everything *should* work,
> > but I've only done some limited testing.
> > 
> > So using the command-line above succeeds but produces an executable
> > that crashes?
> > 
> > 
> > 
> > 
> > 
> > 
> > Hi Hal,
> > 
> > 
> > Thanks, that makes a lot of sense! The executable Backproj-2d in
> > /home/poulson/dist-butterfly/build/clang/bin/transform produced
> > core.{0,6,8,10,11}. Please let me know if you do not have access for
> > some reason, and thank you again for your help.
> 
> Interesting; so the backtrace is messed up, but the problem seems
> legitimate. The code is ending up in steady_clock::now(), which uses
> clock_gettime(CLOCK_MONOTONIC, ...), which is not supported on the
> BG/Q compute nodes. I'll need to change it to use CLOCK_REALTIME
> instead; I'll let you know when to retry.
> 
> Thanks again,
> Hal
> 
> 
> 
> 
> Thanks Hal! I was thrown off by the fact that the trace contained
> MPI_Gather and the program had failed before it should have reached
> any MPI_Gather calls. I very much appreciate the help.

Not a problem! Thanks for being a beta tester :) I've updated the installed libc++ libraries to use CLOCK_REALTIME instead of CLOCK_MONOTONIC. Please try again.
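
For anyone who wants to check which clocks a given node supports, here is a minimal sketch. It is not the change made to the installed libc++ (that change simply swapped the clock id); it shows a different technique, a run-time fallback from CLOCK_MONOTONIC to CLOCK_REALTIME. The helper names clock_is_supported, to_nanoseconds, and now_ns_with_fallback are hypothetical and exist only for illustration; the only system call used is POSIX clock_gettime.

```cpp
#include <cassert>
#include <cstdint>
#include <ctime>

// Hypothetical helper: true if clock_gettime() accepts this clock id
// on the machine where the program is running.
bool clock_is_supported(clockid_t id) {
    timespec ts{};
    return clock_gettime(id, &ts) == 0;
}

// Hypothetical helper: convert a normalized timespec to nanoseconds.
std::int64_t to_nanoseconds(const timespec& ts) {
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Hypothetical helper: try CLOCK_MONOTONIC first and fall back to
// CLOCK_REALTIME if the monotonic clock is rejected.
// Returns -1 if neither clock can be read.
std::int64_t now_ns_with_fallback() {
    timespec ts{};
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0 &&
        clock_gettime(CLOCK_REALTIME, &ts) != 0) {
        return -1;
    }
    return to_nanoseconds(ts);
}
```

Calling clock_is_supported(CLOCK_MONOTONIC) on a compute node should show whether the failing path is reachable there. Keep in mind that CLOCK_REALTIME can jump if the system time is adjusted, so a clock built on it is not strictly monotonic.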

 -Hal

> 
> 
> Jack

