[Llvm-bgq-discuss] New bgclang nighty builds (and other updates)

Sun May 10 10:22:50 CDT 2015

Hi Erik,

If you could send this trivial piece of code, that would be greatly appreciated.

Thanks again,
Hal

----- Original Message -----
> From: "Erik Lindahl" <erik.lindahl at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Mark Abraham" <mark.j.abraham at gmail.com>
> Cc: llvm-bgq-discuss at lists.alcf.anl.gov
> Sent: Sunday, May 10, 2015 9:25:34 AM
> Subject: Re: [Llvm-bgq-discuss] New bgclang nighty builds (and other updates)
> 
> 
> Hi,
> 
> 
> Super-brief summary:
> 
> 
> The bug occurs at high optimization (-O3) with the vec_ld() or
> vec_st() functions when calling the (overloaded) version with
> single-precision arguments (i.e., float pointers).
> 
> 
> This is reproducible even with a trivial piece of code that tries to
> load from single precision memory and then write out the contents of
> the (double precision) vector4doubles by storing to a double
> variable.
> 
> 
> - Everything works fine with the double-precision functions,
> regardless of optimization
> - It works fine with clang in single precision without optimization
> - It works fine with xlc.
> 
> 
> Cheers,
> 
> 
> Erik
> 
> From: Mark Abraham <mark.j.abraham at gmail.com>
> Reply: Mark Abraham <mark.j.abraham at gmail.com>
> Date: 10 May 2015 at 16:04:51
> To: Hal Finkel <hfinkel at anl.gov>
> Cc: llvm-bgq-discuss at lists.alcf.anl.gov, Erik Lindahl
> <erik.lindahl at gmail.com>
> Subject: Re: [Llvm-bgq-discuss] New bgclang nighty builds (and other
> updates)
> 
> Hi,
> 
> Bump. Erik Lindahl has also observed this on bgclang on JUQUEEN. Is
> it helpful if I produce a small piece of code that reproduces some
> of the issues on bgclang 3.6?
> 
> 
> 
> Mark
> 
> On Wed, Mar 25, 2015 at 4:16 PM Mark Abraham <
> mark.j.abraham at gmail.com > wrote:
> 
> On Tue, Mar 24, 2015 at 7:29 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> 
> 
> ----- Original Message -----
> > From: "Mark Abraham" < mark.j.abraham at gmail.com >
> > To: "Hal Finkel" < hfinkel at anl.gov >
> > Cc: llvm-bgq-discuss at lists.alcf.anl.gov
> > Sent: Tuesday, March 24, 2015 12:35:20 PM
> > Subject: Re: [Llvm-bgq-discuss] New bgclang nighty builds (and
> > other updates)
> > 
> > 
> > Hi Hal,
> > 
> > 
> > Thanks very much for the update & effort.
> > 
> 
> You're very welcome.
> 
> > 
> > I tried out the default bgclang 3.6.0 on vesta, but found a bunch of
> > the GROMACS SIMD-layer unit tests failing. These need correct QPX
> > vector intrinsics available. From memory, things worked fine with
> > bgclang in ~August last year, but I no longer have those results. Is
> > there a simple way I can compile on vesta with older bgclang to see
> > where a problem might lie? Otherwise / depending what I learn, I'll
> > break out a debugger.
> 
> The old builds are still all installed. Just use the MPI wrappers
> from:
> 
> /home/projects/llvm/<whatever>/mpi/bgclang/bin -- I don't know
> exactly which build you were using in August of last year,
> r209570-20140527 maybe?
> 
> Yes, that was it, looking at some cruft in my former build script. In
> any case, the latest GROMACS SIMD single-precision unit tests pass
> on that old compiler version (3.5.0), and many of them fail in a
> release build on the default 3.6.0 bgclang on vesta. For some of the
> cases, it looks like some junk memory gets loaded, somehow. A few
> tests pass, but no theme for passing or failing tests leaps out at
> me. Double precision unit tests are OK, though.
> 
> 
> Naturally, in debug mode most of the tests pass. However, the two I
> looked at in ddt were fine up until they called vec_extract(x, 0)
> e.g. on lines
> https://github.com/gromacs/gromacs/blob/master/src/gromacs/simd/impl_ibm_qpx/impl_ibm_qpx.h#L315
> (where a SIMD vector of 1 2 3 1 was summed to 3, rather than 7) and
> https://github.com/gromacs/gromacs/blob/master/src/gromacs/simd/impl_ibm_qpx/impl_ibm_qpx.h#L462
> (where a SIMD vector dot product of the first 3 lanes returns a
> garbage answer). So maybe that is a productive lead?
> 
> 
> I couldn't inspect the disassembly in ddt, so I'm not sure where we
> can take this from here, Hal. Tarball of test results attached.
> 
> 
> Thanks,
> 
> 
> Mark
> 
> -Hal
> 
> 
> 
> > 
> > 
> > Thanks,
> > 
> > 
> > Mark
> > 
> > 
> > 
> > 
> > On Fri, Mar 20, 2015 at 12:15 AM, Hal Finkel < hfinkel at anl.gov >
> > wrote:
> > 
> > 
> > Hello everyone,
> > 
> > First, let me apologize to everyone, this is a few months late...
> > but, hopefully, this will never be a problem again...
> > 
> > I now have a system set up that automatically pulls in upstream
> > changes and tries to merge those with the bgclang-specific patches,
> > and then builds the resulting suite of bgclang RPMs. When this
> > succeeds, the RPMs should be posted automatically to:
> > 
> > http://www.mcs.anl.gov/~hfinkel/bgclang/
> > (note that installing a build from here now also requires both the
> > 'stage1' and 'stage2' RPMs as well)
> > 
> > The first such nightly build, r232720-20150319, has been posted to
> > that page.
> > 
> > And, for the curious, the local repositories used for version control
> > are now mirrored to GitHub:
> > 
> > https://github.com/hfinkel/clang-bgq
> > https://github.com/hfinkel/llvm-bgq
> > https://github.com/hfinkel/bgclang-aux
> > https://github.com/hfinkel/compiler-rt-bgq
> > https://github.com/hfinkel/libcxx-bgq
> > https://github.com/hfinkel/openmp-bgq
> > https://github.com/hfinkel/sleef-bgq
> > 
> > Compared to the latest "released" version (r220548-20141024), the
> > most-recent nightly build does show some performance regressions,
> > and there are a few things I've not even tested yet (LTO, ASan,
> > etc.), but it also contains a number of bug fixes and improvements,
> > so feel free to test on your applications.
> > 
> > One particularly noteworthy improvement is that our OpenMP runtime
> > library now has affinity support enabled. This means that all of the
> > OpenMP 4 affinity features should work, and also that the default
> > thread<->core bindings are now sensible.
> > 
> > The bgclang wrapper script no longer disables 'fast-isel' instruction
> > selection at -O0, so your debug builds should now be faster too.
> > Also, the automated vectorization of math functions using our SLEEF
> > library adaptation is controlled using the new -fveclib flag (so the
> > wrapper script contains -fveclib=SLEEF, and you can add
> > -fveclib=none to turn it off if desired for whatever reason).
> > 
> > Also, the core QPX support has been contributed upstream (although
> > not yet the Clang-level intrinsics support); so if you're using LLVM
> > as a library, and want to just build from upstream sources instead
> > of depending on the bgclang builds, that is now possible.
> > 
> > Thanks again everyone, and please let me know if you experience any
> > difficulties,
> > Hal
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> > _______________________________________________
> > llvm-bgq-discuss mailing list
> > llvm-bgq-discuss at lists.alcf.anl.gov
> > https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
> > 
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
> 
> --
> Erik Lindahl < erik.lindahl at gmail.com >
> Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm
> University
> Professor of Theoretical biophysics, Dept. Theoretical Physics, Royal
> Inst. Technology
> Science for Life Laboratory, Box 1031, 17121 Solna, Sweden

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory