[Llvm-bgq-discuss] Details behind MPI wrapper for bgclang++

Hal Finkel hfinkel at anl.gov
Fri Mar 1 15:47:10 CST 2013


----- Original Message -----
> From: "Jeff Hammond" <jeff.science at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Jack Poulson" <jack.poulson at gmail.com>, "Jeff Hammond" <jhammond at alcf.anl.gov>,
> llvm-bgq-discuss at lists.alcf.anl.gov
> Sent: Friday, March 1, 2013 3:43:04 PM
> Subject: Re: [Llvm-bgq-discuss] Details behind MPI wrapper for bgclang++
> 
> I can't think of a PAMI issue here but I have no experience with
> dynamic linking on BGQ.

I was just wondering how reproducible addresses returned from malloc() are when running the same problem.
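
A rough way to check this, sketched here for an ordinary Linux host rather than for BG/Q itself: allocate a few blocks, print the addresses, and compare the output of two runs of the same job. The function name malloc_addresses is hypothetical, and loading the C library with ctypes.CDLL(None) is an assumption that holds on Linux-style systems.

```python
import ctypes


def malloc_addresses(sizes):
    """Allocate one block per size with the C library's malloc and
    return the block addresses as hex strings (hypothetical helper)."""
    # Assumption: CDLL(None) exposes the already-loaded C library (Linux-style).
    libc = ctypes.CDLL(None)
    libc.malloc.restype = ctypes.c_void_p
    libc.malloc.argtypes = [ctypes.c_size_t]
    libc.free.restype = None
    libc.free.argtypes = [ctypes.c_void_p]

    pointers = []
    try:
        for size in sizes:
            pointer = libc.malloc(size)
            if pointer is None:  # ctypes maps a NULL pointer to None
                raise MemoryError("malloc(%d) failed" % size)
            pointers.append(pointer)
        return [hex(pointer) for pointer in pointers]
    finally:
        # Free only after all blocks were live, so the addresses are distinct.
        for pointer in pointers:
            libc.free(pointer)


if __name__ == "__main__":
    # Run the script twice and diff the output to compare the addresses.
    for address in malloc_addresses([64, 4096, 1 << 20]):
        print(address)
```

If the printed addresses differ between two otherwise identical runs, heap layout is one source of non-determinism worth keeping in mind.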

 -Hal

> 
> On Mar 1, 2013, at 4:39 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > ----- Original Message -----
> >> From: "Jack Poulson" <jack.poulson at gmail.com>
> >> To: "Hal Finkel" <hfinkel at anl.gov>
> >> Cc: "Jeff Hammond" <jhammond at alcf.anl.gov>,
> >> llvm-bgq-discuss at lists.alcf.anl.gov
> >> Sent: Friday, March 1, 2013 3:07:17 PM
> >> Subject: Re: [Llvm-bgq-discuss] Details behind MPI wrapper for
> >> bgclang++
> >>
> >> On Fri, Mar 1, 2013 at 1:03 PM, Jack Poulson < jack.poulson at gmail.com >
> >> wrote:
> >>
> >>
> >>
> >>
> >> On Fri, Mar 1, 2013 at 12:43 PM, Hal Finkel < hfinkel at anl.gov >
> >> wrote:
> >>
> >>
> >>
> >>
> >> The lightweight core files are really text files; I looked at the
> >> line:
> >> While executing instruction at..........0x000000000100c7c4
> >>
> >> Then I ran powerpc64-bgq-linux-objdump -C -d Backproj-2d and looked
> >> at the assembly around address 100c7c4 (if you search for it in the
> >> file, note that objdump may omit the leading 0s in the address).
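
The lookup described above can be sketched as a small script, assuming the disassembly has been saved to a text file first. The helper names objdump_label and find_instruction are hypothetical; the only facts taken from the thread are the core-file line format and that objdump may drop the leading zeros.

```python
import re


def objdump_label(core_line):
    """Extract the faulting address from a core-file line and strip
    leading zeros, so it can be matched against objdump's labels."""
    match = re.search(r"0x([0-9a-fA-F]+)", core_line)
    if match is None:
        raise ValueError("no hexadecimal address found in: %r" % core_line)
    return format(int(match.group(1), 16), "x")


def find_instruction(disassembly, address, context=2):
    """Return the disassembly line labelled with `address` plus up to
    `context` lines on either side, or an empty list if it is absent."""
    lines = disassembly.splitlines()
    # Accept labels with or without leading zeros, e.g. " 100c7c4:".
    pattern = re.compile(r"^\s*0*%s:" % re.escape(address))
    for index, line in enumerate(lines):
        if pattern.match(line):
            return lines[max(0, index - context): index + context + 1]
    return []


if __name__ == "__main__":
    core_line = "While executing instruction at..........0x000000000100c7c4"
    print(objdump_label(core_line))
```

With the output of the objdump command saved to a file, passing that file's contents and the label to find_instruction returns the instructions surrounding the faulting address.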
> >>
> >>
> >>
> >>
> >> Thanks!
> >>
> >>
> >>
> >>
> >> Can you try compiling/linking with
> >> /home/projects/llvm/r175919-20130222/bin/bgclang++ instead of the
> >> default one; this is a newer build and I'd like to see if it still
> >> has whatever bug is yielding this miscompile.
> >>
> >>
> >> Strangely enough, my executable ran correctly with the new version
> >> of LLVM (and passed my accuracy tests). I'm rerunning it right now
> >> to check whether that was a fluke.
> >>
> >> Any ideas as to what might have been the major change in the new
> >> release?
> >>
> >>
> >>
> >> Sigh. It was a fluke.
> >
> > Hrmm... this being something that sometimes works is interesting.
> > Are there any sources of non-determinism here? [I suppose that
> > running on different partitions could cause PAMI to malloc memory
> > differently; Jeff?]
> >
> >>
> >> Perhaps we should take this offline to avoid spamming everyone else
> >> on the list? I will try linking statically next.
> >
> > Let's see if static linking "fixes" it; the list would certainly
> > like to know that ;)
> >
> > -Hal
> >
> >>
> >> Jack
> >>
> >>
> 

