[Llvm-bgq-discuss] Details behind MPI wrapper for bgclang++

Hal Finkel hfinkel at anl.gov
Fri Mar 1 15:57:06 CST 2013


----- Original Message -----
> From: "Jeff Hammond" <jeff.science at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Jack Poulson" <jack.poulson at gmail.com>, "Jeff Hammond" <jhammond at alcf.anl.gov>,
> llvm-bgq-discuss at lists.alcf.anl.gov
> Sent: Friday, March 1, 2013 3:50:01 PM
> Subject: Re: [Llvm-bgq-discuss] Details behind MPI wrapper for bgclang++
> 
> Should be deterministic. I've tested that before (a year ago maybe)
> because of symmetric heap implementations. However, that's with
> static linkage.

For some reason I thought that he was linking dynamically, but that may not have been the case. At some point he deleted the executable, but before he did, I ran ldd on it, and it did not report that it was dynamically linked. Also, the bgclang wrapper script links statically by default.

Jack, is the problem itself deterministic?

 -Hal
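
As a rough cross-check of the ldd result mentioned above, one could inspect the ELF program headers directly: an executable that needs the dynamic loader carries a PT_INTERP segment. The following is a minimal Python sketch under that assumption, not something used in this thread. The function name requests_dynamic_loader is made up for illustration, and only 64-bit ELF images are handled.

```python
import struct

PT_INTERP = 3  # program header type naming the dynamic loader


def requests_dynamic_loader(elf_bytes):
    """Return True if a 64-bit ELF image contains a PT_INTERP segment.

    Hypothetical helper for illustration; it reads only the fields it needs.
    """
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if elf_bytes[4] != 2:  # EI_CLASS: 2 means ELFCLASS64
        raise ValueError("only 64-bit ELF is handled in this sketch")
    # EI_DATA: 1 means little-endian, 2 means big-endian
    endian = "<" if elf_bytes[5] == 1 else ">"
    # In the ELF64 header, e_phoff sits at byte 32,
    # e_phentsize at byte 54 and e_phnum at byte 56.
    (e_phoff,) = struct.unpack_from(endian + "Q", elf_bytes, 32)
    e_phentsize, e_phnum = struct.unpack_from(endian + "HH", elf_bytes, 54)
    for i in range(e_phnum):
        # p_type is the first 4-byte field of every program header
        (p_type,) = struct.unpack_from(
            endian + "I", elf_bytes, e_phoff + i * e_phentsize
        )
        if p_type == PT_INTERP:
            return True
    return False
```

Reading the executable with open(path, "rb").read() and passing the bytes to this function should give False for a statically linked binary, which is the case ldd reports as not being a dynamic executable.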

> 
> Sent from my iPhone
> 
> On Mar 1, 2013, at 4:47 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > ----- Original Message -----
> >> From: "Jeff Hammond" <jeff.science at gmail.com>
> >> To: "Hal Finkel" <hfinkel at anl.gov>
> >> Cc: "Jack Poulson" <jack.poulson at gmail.com>, "Jeff Hammond"
> >> <jhammond at alcf.anl.gov>,
> >> llvm-bgq-discuss at lists.alcf.anl.gov
> >> Sent: Friday, March 1, 2013 3:43:04 PM
> >> Subject: Re: [Llvm-bgq-discuss] Details behind MPI wrapper for
> >> bgclang++
> >>
> >> I can't think of a PAMI issue here but I have no experience with
> >> dynamic linking on BGQ.
> >
> > I was just wondering how reproducible addresses returned from
> > malloc() are when running the same problem.
> >
> > -Hal
> >
> >>
> >> Sent from my iPhone
> >>
> >> On Mar 1, 2013, at 4:39 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> >>
> >>> ----- Original Message -----
> >>>> From: "Jack Poulson" <jack.poulson at gmail.com>
> >>>> To: "Hal Finkel" <hfinkel at anl.gov>
> >>>> Cc: "Jeff Hammond" <jhammond at alcf.anl.gov>,
> >>>> llvm-bgq-discuss at lists.alcf.anl.gov
> >>>> Sent: Friday, March 1, 2013 3:07:17 PM
> >>>> Subject: Re: [Llvm-bgq-discuss] Details behind MPI wrapper for
> >>>> bgclang++
> >>>>
> >>>> On Fri, Mar 1, 2013 at 1:03 PM, Jack Poulson <jack.poulson at gmail.com> wrote:
> >>>>
> >>>> On Fri, Mar 1, 2013 at 12:43 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> >>>>
> >>>> The lightweight core files are really text files. I looked at the
> >>>> line:
> >>>> While executing instruction at..........0x000000000100c7c4
> >>>>
> >>>> Then I ran powerpc64-bgq-linux-objdump -C -d Backproj-2d and looked
> >>>> at the assembly around address 100c7c4 (if you search for it in the
> >>>> file, note that objdump may omit the leading 0s in the address).
> >>>>
> >>>> Thanks!
> >>>>
> >>>> Can you try compiling/linking with
> >>>> /home/projects/llvm/r175919-20130222/bin/bgclang++ instead of the
> >>>> default one? This is a newer build, and I'd like to see if it still
> >>>> has whatever bug is yielding this miscompile.
> >>>>
> >>>>
> >>>> Strangely enough, my executable ran correctly with the new version
> >>>> of LLVM (and passed my accuracy tests). I'm rerunning it right now
> >>>> to help rule out that it was a fluke.
> >>>>
> >>>> Any ideas as to what might have been the major change in the new
> >>>> release?
> >>>>
> >>>>
> >>>>
> >>>> Sigh. It was a fluke.
> >>>
> >>> Hrmm... this being something that sometimes works is interesting.
> >>> Are there any sources of non-determinism here? [I suppose that
> >>> running on different partitions could cause PAMI to malloc memory
> >>> differently; Jeff?]
> >>>
> >>>>
> >>>> Perhaps we should take this offline to avoid spamming everyone else
> >>>> on the list? I will try linking statically next.
> >>>
> >>> Let's see if static linking "fixes" it; the list would certainly
> >>> like to know that ;)
> >>>
> >>> -Hal
> >>>
> >>>>
> >>>> Jack
> >>>>
> >>>>
> >>
> 
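
The address lookup described in the quoted thread above (take the "While executing instruction at" line from the lightweight core file, then search the objdump -d output for that address without its leading zeros) could be sketched in Python roughly as follows. This is only an illustration: the function names are hypothetical, and the regular expression assumes the core file line has exactly the shape quoted in this thread.

```python
import re

# Assumption: the core file line looks like the one quoted in this thread,
# e.g. "While executing instruction at..........0x000000000100c7c4".
_INSTR_RE = re.compile(r"While executing instruction at\.*\s*0x([0-9a-fA-F]+)")


def faulting_address(core_text):
    """Return the faulting instruction address as an int, or None if absent."""
    match = _INSTR_RE.search(core_text)
    return int(match.group(1), 16) if match else None


def objdump_label(address):
    """Format an address the way objdump -d labels instruction lines:
    lowercase hex without leading zeros, followed by a colon."""
    return format(address, "x") + ":"


def lines_around(disassembly, address, context=3):
    """Return the disassembly lines surrounding the given address.

    `disassembly` is the text produced by objdump -d; an empty list is
    returned when no instruction line carries that address.
    """
    lines = disassembly.splitlines()
    label = objdump_label(address)
    for i, line in enumerate(lines):
        if line.strip().startswith(label):
            return lines[max(0, i - context): i + context + 1]
    return []


if __name__ == "__main__":
    core = "While executing instruction at..........0x000000000100c7c4\n"
    print(objdump_label(faulting_address(core)))
```

Converting the address to an integer and back is what drops the leading zeros, so the resulting label can be matched directly against the start of each instruction line in the disassembly.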

