[Llvm-bgq-discuss] clang fatal error compiling GROMACS for BlueGene/Q

Mark Abraham mark.abraham at scilifelab.se
Sun Sep 29 14:16:51 CDT 2013


Hi,

Thanks, Hal. The good news is that I succeeded in dropping the debug version
of that object file into the release build. Compile times were extremely
pleasing. make mdrun -j 8 took the following numbers of seconds at -O3:

Compiler  Build    Seconds
XLC       debug    99
XLC       release  253
clang     debug    46
clang     release  57
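
For a rough sense of scale, the ratios implied by those numbers can be worked
out as below. This is only a minimal sketch: the dictionary layout and the
variable names are illustrative assumptions, and the only real data are the
four timings reported above.

```python
# Wall-clock seconds for "make mdrun -j 8", copied from the timings above.
# The dictionary structure and names are illustrative, not part of any build.
build_seconds = {
    ("XLC", "debug"): 99,
    ("XLC", "release"): 253,
    ("clang", "debug"): 46,
    ("clang", "release"): 57,
}

# How many times faster clang built than XLC, per build type.
speedup = {
    build: build_seconds[("XLC", build)] / build_seconds[("clang", build)]
    for build in ("debug", "release")
}

for build, ratio in speedup.items():
    print(f"{build}: clang built about {ratio:.1f}x faster than XLC")
```

So on these single measurements clang built roughly twice as fast as XLC in
debug mode and a bit over four times as fast in release mode. These are
one-off timings, so the ratios are indicative only.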

Now, off to look at some more important timing measurements ;-)

Mark


On Sun, Sep 29, 2013 at 8:24 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
> >
> >
> > Hi all,
> >
> >
> > I'm the development manager for GROMACS, which will offer new SIMD
> > support for BlueGene/Q in its impending 4.6.4 release. Following
> > some off-list discussion with Jeff Hammond and Hal Finkel, I was
> > happy to explore compiling with clang for BlueGene/Q. Today I tried
> > the version installed on JUQUEEN (r190771-20130914), as I had
> > trouble logging into Vesta (support request lodged).
> >
> >
> > In debug mode, everything went great. clang even warned about some
> > MPI_Alltoall calls that could have had some explicit pointer casts
> > to reassure the reader, which I've now patched.
> >
> >
> > I even used qpxmath.h for a small handful of SIMD trig functions we'd
> > want - that worked perfectly.
> >
> >
> > In release mode, there was a fatal error from clang when compiling
> > the "plain C" version of the code for which I've now written SIMD
> > kernels. This kernel is compiled and built into mdrun as a fallback.
> > My guess would be that auto-vectorization is choking, but hopefully
> > you guys are better judges of that than me! I'm happy to pass this
> > upstream to LLVM if that's the correct place for this report. The .c
> > and .sh files to reproduce the issue can be found at
>
> Thanks for the bug report! This is an error in the backend (although it
> certainly could be the autovectorization that is exposing it). I'll fix
> this soon.
>
> >
> >
> >
> > https://docs.google.com/file/d/0B0H2SbsMc3_qTnVvcTI1OTNFMFE/edit?usp=sharing
> >
> > https://docs.google.com/file/d/0B0H2SbsMc3_qenZBX05KSEg1TnM/edit?usp=sharing
> >
> >
> > The crash trace follows:
> >
> >
> >
> > clang:
> > /gpfs/vesta-home/hfinkel/rpmbuild/BUILD/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:630:
> > llvm::SDValue<unnamed>::DAGCombiner::CombineTo(llvm::SDNode*, const
> > llvm::SDValue*, unsigned int, bool): Assertion `N->getNumValues() ==
> > NumTo && "Broken CombineTo call!"' failed.
> > 0 libLLVM-3.4svn.so 0x00000fff7ec34a9c
> > llvm::sys::PrintStackTrace(_IO_FILE*) + 4281424836
> > 1 libLLVM-3.4svn.so 0x00000fff7ec34d00
> > 2 libLLVM-3.4svn.so 0x00000fff7ec35ba4
> > 3 0x00000fff7f980418 __kernel_sigtramp_rt64 + 0
> > 4 libc.so.6 0x00000080c3766ef8 abort + 4293479848
> > 5 libc.so.6 0x00000080c375b98c
> > 6 libc.so.6 0x00000080c375baa4 __assert_fail + 4293437492
> > 7 libLLVM-3.4svn.so 0x00000fff7ea0a94c
> > 8 libLLVM-3.4svn.so 0x00000fff7ea0adfc
> > 9 libLLVM-3.4svn.so 0x00000fff7ea2de20
> > 10 libLLVM-3.4svn.so 0x00000fff7ea43554
> > 11 libLLVM-3.4svn.so 0x00000fff7ea46ecc
> > 12 libLLVM-3.4svn.so 0x00000fff7ea49c70
> > llvm::SelectionDAG::Combine(llvm::CombineLevel,
> > llvm::AliasAnalysis&, llvm::CodeGenOpt::Level) + 4279456680
> > 13 libLLVM-3.4svn.so 0x00000fff7eb8fba8
> > llvm::SelectionDAGISel::CodeGenAndEmitDAG() + 4280770368
> > 14 libLLVM-3.4svn.so 0x00000fff7eb909f8
> > llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::Instruction
> > const>, llvm::ilist_iterator<llvm::Instruction const>, bool&) +
> > 4280774016
> > 15 libLLVM-3.4svn.so 0x00000fff7eb92dec
> > llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&)
> > + 4280783188
> > 16 libLLVM-3.4svn.so 0x00000fff7eb93fbc
> > llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&)
> > + 4280787732
> > 17 libLLVM-3.4svn.so 0x00000fff7e84d9c8
> > 18 libLLVM-3.4svn.so 0x00000fff7e1e26cc
> > llvm::MachineFunctionPass::runOnFunction(llvm::Function&) +
> > 4270867196
> > 19 libLLVM-3.4svn.so 0x00000fff7e4529b8
> > llvm::FPPassManager::runOnFunction(llvm::Function&) + 4273352968
> > 20 libLLVM-3.4svn.so 0x00000fff7e452afc
> > llvm::FPPassManager::runOnModule(llvm::Module&) + 4273353276
> > 21 libLLVM-3.4svn.so 0x00000fff7e4522bc
> > llvm::MPPassManager::runOnModule(llvm::Module&) + 4273351228
> > 22 libLLVM-3.4svn.so 0x00000fff7e4525e4
> > llvm::PassManagerImpl::run(llvm::Module&) + 4273352020
> > 23 libLLVM-3.4svn.so 0x00000fff7e4526f4
> > llvm::PassManager::run(llvm::Module&) + 4273352276
> > 24 clang 0x00000000103ae874
> > 25 clang 0x00000000103af7f8
> > clang::EmitBackendOutput(clang::DiagnosticsEngine&,
> > clang::CodeGenOptions const&, clang::TargetOptions const&,
> > clang::LangOptions const&, llvm::Module*, clang::BackendAction,
> > llvm::raw_ostream*) + 4272665128
> >
> > 26 clang 0x00000000103ab4a4
> > 27 clang 0x000000001059f230 clang::ParseAST(clang::Sema&, bool, bool)
> > + 4274649152
> > 28 clang 0x00000000101e4b64 clang::ASTFrontendAction::ExecuteAction()
> > + 4270836484
> > 29 clang 0x00000000103a9b00 clang::CodeGenAction::ExecuteAction() +
> > 4272641808
> > 30 clang 0x00000000101e4fb4 clang::FrontendAction::Execute() +
> > 4270837524
> > 31 clang 0x00000000101be154
> > clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) +
> > 4270679924
> > 32 clang 0x000000001019f894
> > clang::ExecuteCompilerInvocation(clang::CompilerInstance*) +
> > 4270560804
> > 33 clang 0x00000000101959d8 cc1_main(char const**, char const**, char
> > const*, void*) + 4270520648
> > 34 clang 0x000000001019d540 main + 4270551792
> > 35 libc.so.6 0x00000080c374bcf8
> > 36 libc.so.6 0x00000080c374bef0 __libc_start_main + 4293374496
> > Stack dump:
> > 0. Program arguments:
> > /usr/local/bg_soft/clang/llvm.r190771/r190771-20130914/bin/clang
> > -cc1 -fopenmp -triple powerpc64-bgq-linux -S -disable-free
> > -main-file-name nbnxn_kernel_ref.c -static-define -mrelocation-model
> > static -mdisable-fp-elim -ffp-contract=fast -mconstructor-aliases
> > -target-cpu a2q -target-linker-version 2.20.51.0.2 -coverage-file
> > /tmp/nbnxn_kernel_ref-bb4750.s -resource-dir
> > /usr/local/bg_soft/clang/llvm.r190771/r190771-20130914/bin/../lib/clang/3.4
> > -D __bgclang__=1 -D __bgclang_version__="r000000-00000000" -D
> > HAVE_CONFIG_H -D md_EXPORTS -D NDEBUG -I
> > /bgsys/local/clang/llvm.r190771/r190771-20130914/sleef/include -I
> > /bgsys/local/clang/llvm.r190771/r190771-20130914/omp/include -I
> > /bgsys/drivers/V1R2M1/ppc64/comm/include -I
> > /bgsys/drivers/V1R2M1/ppc64/comm/lib/gnu -I
> > /bgsys/drivers/V1R2M1/ppc64 -I
> > /bgsys/drivers/V1R2M1/ppc64/comm/sys/include -I
> > /bgsys/drivers/V1R2M1/ppc64/spi/include -I
> > /bgsys/drivers/V1R2M1/ppc64/spi/include/kernel/cnk -I
> > /homeb/zdv518/zdv518/git/bluegene-dev-r46/build-cmake-clang/src -I
> > /homeb/zdv518/zdv518/git/bluegene-dev-r46/build-cmake-clang/include
> > -I /homeb/zdv518/zdv518/git/bluegene-dev-r46/include -I
> > /homeb/zdv518/zdv518/progs/bgsys-clang/include -I
> > /bgsys/drivers/V1R2M1/ppc64/comm/include -internal-isystem
> > /usr/local/include -internal-isystem
> > /usr/local/bg_soft/clang/llvm.r190771/r190771-20130914/bin/../lib/clang/3.4/include
> > -internal-externc-isystem /include -internal-externc-isystem
> > /usr/include -O3 -Wall -Wno-unused -Wunused-value
> > -fno-dwarf-directory-asm -fdebug-compilation-dir
> > /homeb/zdv518/zdv518/git/bluegene-dev-r46/build-cmake-clang/src/mdlib
> > -ferror-limit 19 -fmessage-length 108 -mstackrealign
> > -fno-signed-char -fobjc-runtime=gcc
> > -fobjc-default-synthesize-properties -fdiagnostics-show-option
> > -fcolor-diagnostics -vectorize-loops -vectorize-slp -isystem
> > /bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/sys-include
> > -mllvm -optimize-regalloc -mllvm -fast-isel=0 -o
> > /tmp/nbnxn_kernel_ref-bb4750.s -x c
> > /homeb/zdv518/zdv518/git/bluegene-dev-r46/src/mdlib/nbnxn_kernels/nbnxn_kernel_ref.c
> > 1. <eof> parser at end of file
> > 2. Code generation
> > 3. Running pass 'Function Pass Manager' on module
> > '/homeb/zdv518/zdv518/git/bluegene-dev-r46/src/mdlib/nbnxn_kernels/nbnxn_kernel_ref.c'.
> > 4. Running pass 'PowerPC DAG->DAG Pattern Instruction Selection' on
> > function '@nbnxn_kernel_ref_rf_noener'
> >
> > clang: error: unable to execute command: Aborted (core dumped)
> > clang: error: clang frontend command failed due to signal (use -v to
> > see invocation)
> > clang version 3.4 (trunk)
> > Target: powerpc64-bgq-linux
> > Thread model: posix
> > clang: note: diagnostic msg: PLEASE submit a bug report to
> > http://llvm.org/bugs/ and include the crash backtrace, preprocessed
> > source, and associated run script.
> > clang: note: diagnostic msg:
> > ********************
> >
> >
> > PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
> > Preprocessed source(s) and associated run script(s) are located at:
> > clang: note: diagnostic msg: /tmp/nbnxn_kernel_ref-96ac7c.c
> > clang: note: diagnostic msg: /tmp/nbnxn_kernel_ref-96ac7c.sh
> > clang: note: diagnostic msg:
> >
> >
> > ********************
> >
> >
> > I tried to check that the .sh file would reproduce the above, but it
> > failed with
> >
> >
> >
> > In file included from <built-in>:167:
> > <command line>:6:10: fatal error: 'qpxintrin.h' file not found
> > #include "qpxintrin.h"
>
> Ah, I keep forgetting to add this to my TODO list to fix. Thanks for
> reminding me :)
>
> >
> > Hope that is useful - do let me know if I can be of further help!
>
> Quite useful.
>
>  -Hal
>
> >
> >
> > Cheers,
> >
> >
> > Mark
> > _______________________________________________
> > llvm-bgq-discuss mailing list
> > llvm-bgq-discuss at lists.alcf.anl.gov
> > https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>