[Llvm-bgq-discuss] clang fatal error compiling GROMACS for BlueGene/Q

Hal Finkel hfinkel at anl.gov
Sun Sep 29 13:24:25 CDT 2013


----- Original Message -----
> 
> 
> Hi all,
> 
> 
> I'm the development manager for GROMACS, which will offer new SIMD
> support for BlueGene/Q in its impending 4.6.4 release. Following
> some off-list discussion with Jeff Hammond and Hal Finkel, I was
> happy to explore compiling with clang for BlueGene/Q. Today I tried
> the version installed on JUQUEEN (r190771-20130914), as I had
> trouble logging into Vesta (support request lodged).
> 
> 
> In debug mode, everything went great. clang even warned about some
> MPI_Alltoall calls that could have had some explicit pointer casts
> to reassure the reader, which I've now patched.
> 
> 
> I even used qpxmath.h for a small handful of SIMD trig functions we'd
> want - that worked perfectly.
> 
> 
> In release mode, there was a fatal error from clang when compiling
> the "plain C" version of the code for which I've now written SIMD
> kernels. This kernel is compiled and built into mdrun as a fallback.
> My guess would be that auto-vectorization is choking, but hopefully
> you guys are better judges of that than me! I'm happy to pass this
> upstream to LLVM if that's the correct place for this report. The .c
> and .sh files to reproduce the issue can be found at

Thanks for the bug report! The assertion fires in DAGCombiner::CombineTo, so this is an error in the backend (although it certainly could be the autovectorization that is exposing it). I'll fix this soon.
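
In the meantime, one way to test the autovectorization theory is to rerun the generated cc1 command without the -vectorize-loops and -vectorize-slp switches that appear in the program arguments of the crash report. A minimal Python sketch, assuming the command has been copied out of the .sh file as a single string; strip_vectorizer_flags is a hypothetical helper name, not something clang provides:

```python
import shlex

# cc1-level switches that enable the loop and SLP vectorizers; both appear
# in the "Program arguments" line of the crash report.
VECTORIZER_FLAGS = {"-vectorize-loops", "-vectorize-slp"}


def strip_vectorizer_flags(command):
    """Return the cc1 command line with the vectorizer switches removed."""
    args = shlex.split(command)
    kept = [arg for arg in args if arg not in VECTORIZER_FLAGS]
    # Re-quote each argument so the result can be pasted back into a shell.
    return " ".join(shlex.quote(arg) for arg in kept)


if __name__ == "__main__":
    # Shortened, illustrative command; the real one is in the generated .sh file.
    original = (
        "clang -cc1 -triple powerpc64-bgq-linux -S -target-cpu a2q -O3 "
        "-vectorize-loops -vectorize-slp -x c nbnxn_kernel_ref.c"
    )
    print(strip_vectorizer_flags(original))
```

If the assertion no longer fires with those two switches removed, the vectorizers are what expose the problem. The underlying backend bug would still need fixing either way.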

> 
> 
> https://docs.google.com/file/d/0B0H2SbsMc3_qTnVvcTI1OTNFMFE/edit?usp=sharing
> https://docs.google.com/file/d/0B0H2SbsMc3_qenZBX05KSEg1TnM/edit?usp=sharing
> 
> 
> The crash trace follows:
> 
> 
> 
> clang:
> /gpfs/vesta-home/hfinkel/rpmbuild/BUILD/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:630:
> llvm::SDValue<unnamed>::DAGCombiner::CombineTo(llvm::SDNode*, const
> llvm::SDValue*, unsigned int, bool): Assertion `N->getNumValues() ==
> NumTo && "Broken CombineTo call!"' failed.
> 0 libLLVM-3.4svn.so 0x00000fff7ec34a9c
> llvm::sys::PrintStackTrace(_IO_FILE*) + 4281424836
> 1 libLLVM-3.4svn.so 0x00000fff7ec34d00
> 2 libLLVM-3.4svn.so 0x00000fff7ec35ba4
> 3 0x00000fff7f980418 __kernel_sigtramp_rt64 + 0
> 4 libc.so.6 0x00000080c3766ef8 abort + 4293479848
> 5 libc.so.6 0x00000080c375b98c
> 6 libc.so.6 0x00000080c375baa4 __assert_fail + 4293437492
> 7 libLLVM-3.4svn.so 0x00000fff7ea0a94c
> 8 libLLVM-3.4svn.so 0x00000fff7ea0adfc
> 9 libLLVM-3.4svn.so 0x00000fff7ea2de20
> 10 libLLVM-3.4svn.so 0x00000fff7ea43554
> 11 libLLVM-3.4svn.so 0x00000fff7ea46ecc
> 12 libLLVM-3.4svn.so 0x00000fff7ea49c70
> llvm::SelectionDAG::Combine(llvm::CombineLevel,
> llvm::AliasAnalysis&, llvm::CodeGenOpt::Level) + 4279456680
> 13 libLLVM-3.4svn.so 0x00000fff7eb8fba8
> llvm::SelectionDAGISel::CodeGenAndEmitDAG() + 4280770368
> 14 libLLVM-3.4svn.so 0x00000fff7eb909f8
> llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::Instruction
> const>, llvm::ilist_iterator<llvm::Instruction const>, bool&) +
> 4280774016
> 15 libLLVM-3.4svn.so 0x00000fff7eb92dec
> llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&)
> + 4280783188
> 16 libLLVM-3.4svn.so 0x00000fff7eb93fbc
> llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&)
> + 4280787732
> 17 libLLVM-3.4svn.so 0x00000fff7e84d9c8
> 18 libLLVM-3.4svn.so 0x00000fff7e1e26cc
> llvm::MachineFunctionPass::runOnFunction(llvm::Function&) +
> 4270867196
> 19 libLLVM-3.4svn.so 0x00000fff7e4529b8
> llvm::FPPassManager::runOnFunction(llvm::Function&) + 4273352968
> 20 libLLVM-3.4svn.so 0x00000fff7e452afc
> llvm::FPPassManager::runOnModule(llvm::Module&) + 4273353276
> 21 libLLVM-3.4svn.so 0x00000fff7e4522bc
> llvm::MPPassManager::runOnModule(llvm::Module&) + 4273351228
> 22 libLLVM-3.4svn.so 0x00000fff7e4525e4
> llvm::PassManagerImpl::run(llvm::Module&) + 4273352020
> 23 libLLVM-3.4svn.so 0x00000fff7e4526f4
> llvm::PassManager::run(llvm::Module&) + 4273352276
> 24 clang 0x00000000103ae874
> 25 clang 0x00000000103af7f8
> clang::EmitBackendOutput(clang::DiagnosticsEngine&,
> clang::CodeGenOptions const&, clang::TargetOptions const&,
> clang::LangOptions const&, llvm::Module*, clang::BackendAction,
> llvm::raw_ostream*) + 4272665128
> 
> 26 clang 0x00000000103ab4a4
> 27 clang 0x000000001059f230 clang::ParseAST(clang::Sema&, bool, bool)
> + 4274649152
> 28 clang 0x00000000101e4b64 clang::ASTFrontendAction::ExecuteAction()
> + 4270836484
> 29 clang 0x00000000103a9b00 clang::CodeGenAction::ExecuteAction() +
> 4272641808
> 30 clang 0x00000000101e4fb4 clang::FrontendAction::Execute() +
> 4270837524
> 31 clang 0x00000000101be154
> clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) +
> 4270679924
> 32 clang 0x000000001019f894
> clang::ExecuteCompilerInvocation(clang::CompilerInstance*) +
> 4270560804
> 33 clang 0x00000000101959d8 cc1_main(char const**, char const**, char
> const*, void*) + 4270520648
> 34 clang 0x000000001019d540 main + 4270551792
> 35 libc.so.6 0x00000080c374bcf8
> 36 libc.so.6 0x00000080c374bef0 __libc_start_main + 4293374496
> Stack dump:
> 0. Program arguments:
> /usr/local/bg_soft/clang/llvm.r190771/r190771-20130914/bin/clang
> -cc1 -fopenmp -triple powerpc64-bgq-linux -S -disable-free
> -main-file-name nbnxn_kernel_ref.c -static-define -mrelocation-model
> static -mdisable-fp-elim -ffp-contract=fast -mconstructor-aliases
> -target-cpu a2q -target-linker-version 2.20.51.0.2 -coverage-file
> /tmp/nbnxn_kernel_ref-bb4750.s -resource-dir
> /usr/local/bg_soft/clang/llvm.r190771/r190771-20130914/bin/../lib/clang/3.4
> -D __bgclang__=1 -D __bgclang_version__="r000000-00000000" -D
> HAVE_CONFIG_H -D md_EXPORTS -D NDEBUG -I
> /bgsys/local/clang/llvm.r190771/r190771-20130914/sleef/include -I
> /bgsys/local/clang/llvm.r190771/r190771-20130914/omp/include -I
> /bgsys/drivers/V1R2M1/ppc64/comm/include -I
> /bgsys/drivers/V1R2M1/ppc64/comm/lib/gnu -I
> /bgsys/drivers/V1R2M1/ppc64 -I
> /bgsys/drivers/V1R2M1/ppc64/comm/sys/include -I
> /bgsys/drivers/V1R2M1/ppc64/spi/include -I
> /bgsys/drivers/V1R2M1/ppc64/spi/include/kernel/cnk -I
> /homeb/zdv518/zdv518/git/bluegene-dev-r46/build-cmake-clang/src -I
> /homeb/zdv518/zdv518/git/bluegene-dev-r46/build-cmake-clang/include
> -I /homeb/zdv518/zdv518/git/bluegene-dev-r46/include -I
> /homeb/zdv518/zdv518/progs/bgsys-clang/include -I
> /bgsys/drivers/V1R2M1/ppc64/comm/include -internal-isystem
> /usr/local/include -internal-isystem
> /usr/local/bg_soft/clang/llvm.r190771/r190771-20130914/bin/../lib/clang/3.4/include
> -internal-externc-isystem /include -internal-externc-isystem
> /usr/include -O3 -Wall -Wno-unused -Wunused-value
> -fno-dwarf-directory-asm -fdebug-compilation-dir
> /homeb/zdv518/zdv518/git/bluegene-dev-r46/build-cmake-clang/src/mdlib
> -ferror-limit 19 -fmessage-length 108 -mstackrealign
> -fno-signed-char -fobjc-runtime=gcc
> -fobjc-default-synthesize-properties -fdiagnostics-show-option
> -fcolor-diagnostics -vectorize-loops -vectorize-slp -isystem
> /bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/sys-include
> -mllvm -optimize-regalloc -mllvm -fast-isel=0 -o
> /tmp/nbnxn_kernel_ref-bb4750.s -x c
> /homeb/zdv518/zdv518/git/bluegene-dev-r46/src/mdlib/nbnxn_kernels/nbnxn_kernel_ref.c
> 1. <eof> parser at end of file
> 2. Code generation
> 3. Running pass 'Function Pass Manager' on module
> '/homeb/zdv518/zdv518/git/bluegene-dev-r46/src/mdlib/nbnxn_kernels/nbnxn_kernel_ref.c'.
> 4. Running pass 'PowerPC DAG->DAG Pattern Instruction Selection' on
> function '@nbnxn_kernel_ref_rf_noener'
> 
> clang: error: unable to execute command: Aborted (core dumped)
> clang: error: clang frontend command failed due to signal (use -v to
> see invocation)
> clang version 3.4 (trunk)
> Target: powerpc64-bgq-linux
> Thread model: posix
> clang: note: diagnostic msg: PLEASE submit a bug report to
> http://llvm.org/bugs/ and include the crash backtrace, preprocessed
> source, and associated run script.
> clang: note: diagnostic msg:
> ********************
> 
> 
> PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
> Preprocessed source(s) and associated run script(s) are located at:
> clang: note: diagnostic msg: /tmp/nbnxn_kernel_ref-96ac7c.c
> clang: note: diagnostic msg: /tmp/nbnxn_kernel_ref-96ac7c.sh
> clang: note: diagnostic msg:
> 
> 
> ********************
> 
> 
> I tried to check that the .sh file would reproduce the above, but it
> failed with
> 
> 
> 
> In file included from <built-in>:167:
> <command line>:6:10: fatal error: 'qpxintrin.h' file not found
> #include "qpxintrin.h"

Ah, I keep forgetting to put this on my TODO list. Thanks for reminding me :)

> 
> Hope that is useful - do let me know if I can be of further help!

Quite useful.

 -Hal

> 
> 
> Cheers,
> 
> 
> Mark
> _______________________________________________
> llvm-bgq-discuss mailing list
> llvm-bgq-discuss at lists.alcf.anl.gov
> https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

