[Llvm-bgq-discuss] clang fatal error compiling GROMACS for BlueGene/Q

Mark Abraham mark.abraham at scilifelab.se
Sun Sep 29 13:05:48 CDT 2013


Hi all,

I'm the development manager for GROMACS, which will offer new SIMD support
for BlueGene/Q in its impending 4.6.4 release. Following some off-list
discussion with Jeff Hammond and Hal Finkel, I was happy to explore
compiling with clang for BlueGene/Q. Today I tried the version installed on
JUQUEEN (r190771-20130914), as I had trouble logging into Vesta (support
request lodged).

In debug mode, everything went great. clang even warned about some
MPI_Alltoall calls whose arguments would have benefited from explicit
pointer casts to reassure the reader; I've now patched those.

I even used qpxmath.h for a small handful of SIMD trig functions we'd want
- that worked perfectly.

In release mode, there was a fatal error from clang when compiling the
"plain C" version of the code for which I've now written SIMD kernels. This
kernel is compiled and built into mdrun as a fallback. My guess would be
that auto-vectorization is choking, but hopefully you guys are better
judges of that than me! I'm happy to pass this upstream to LLVM if that's
the correct place for this report. The .c and .sh files to reproduce the
issue can be found at

https://docs.google.com/file/d/0B0H2SbsMc3_qTnVvcTI1OTNFMFE/edit?usp=sharing
https://docs.google.com/file/d/0B0H2SbsMc3_qenZBX05KSEg1TnM/edit?usp=sharing

The crash trace follows:

clang:
/gpfs/vesta-home/hfinkel/rpmbuild/BUILD/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:630:
llvm::SDValue<unnamed>::DAGCombiner::CombineTo(llvm::SDNode*, const
llvm::SDValue*, unsigned int, bool): Assertion `N->getNumValues() == NumTo
&& "Broken CombineTo call!"' failed.
0  libLLVM-3.4svn.so 0x00000fff7ec34a9c
llvm::sys::PrintStackTrace(_IO_FILE*) + 4281424836
1  libLLVM-3.4svn.so 0x00000fff7ec34d00
2  libLLVM-3.4svn.so 0x00000fff7ec35ba4
3                    0x00000fff7f980418 __kernel_sigtramp_rt64 + 0
4  libc.so.6         0x00000080c3766ef8 abort + 4293479848
5  libc.so.6         0x00000080c375b98c
6  libc.so.6         0x00000080c375baa4 __assert_fail + 4293437492
7  libLLVM-3.4svn.so 0x00000fff7ea0a94c
8  libLLVM-3.4svn.so 0x00000fff7ea0adfc
9  libLLVM-3.4svn.so 0x00000fff7ea2de20
10 libLLVM-3.4svn.so 0x00000fff7ea43554
11 libLLVM-3.4svn.so 0x00000fff7ea46ecc
12 libLLVM-3.4svn.so 0x00000fff7ea49c70
llvm::SelectionDAG::Combine(llvm::CombineLevel, llvm::AliasAnalysis&,
llvm::CodeGenOpt::Level) + 4279456680
13 libLLVM-3.4svn.so 0x00000fff7eb8fba8
llvm::SelectionDAGISel::CodeGenAndEmitDAG() + 4280770368
14 libLLVM-3.4svn.so 0x00000fff7eb909f8
llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::Instruction
const>, llvm::ilist_iterator<llvm::Instruction const>, bool&) + 4280774016
15 libLLVM-3.4svn.so 0x00000fff7eb92dec
llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) +
4280783188
16 libLLVM-3.4svn.so 0x00000fff7eb93fbc
llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) +
4280787732
17 libLLVM-3.4svn.so 0x00000fff7e84d9c8
18 libLLVM-3.4svn.so 0x00000fff7e1e26cc
llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 4270867196
19 libLLVM-3.4svn.so 0x00000fff7e4529b8
llvm::FPPassManager::runOnFunction(llvm::Function&) + 4273352968
20 libLLVM-3.4svn.so 0x00000fff7e452afc
llvm::FPPassManager::runOnModule(llvm::Module&) + 4273353276
21 libLLVM-3.4svn.so 0x00000fff7e4522bc
llvm::MPPassManager::runOnModule(llvm::Module&) + 4273351228
22 libLLVM-3.4svn.so 0x00000fff7e4525e4
llvm::PassManagerImpl::run(llvm::Module&) + 4273352020
23 libLLVM-3.4svn.so 0x00000fff7e4526f4
llvm::PassManager::run(llvm::Module&) + 4273352276
24 clang             0x00000000103ae874
25 clang             0x00000000103af7f8
clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::CodeGenOptions
const&, clang::TargetOptions const&, clang::LangOptions const&,
llvm::Module*, clang::BackendAction, llvm::raw_ostream*) + 4272665128
26 clang             0x00000000103ab4a4
27 clang             0x000000001059f230 clang::ParseAST(clang::Sema&, bool,
bool) + 4274649152
28 clang             0x00000000101e4b64
clang::ASTFrontendAction::ExecuteAction() + 4270836484
29 clang             0x00000000103a9b00
clang::CodeGenAction::ExecuteAction() + 4272641808
30 clang             0x00000000101e4fb4 clang::FrontendAction::Execute() +
4270837524
31 clang             0x00000000101be154
clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 4270679924
32 clang             0x000000001019f894
clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 4270560804
33 clang             0x00000000101959d8 cc1_main(char const**, char
const**, char const*, void*) + 4270520648
34 clang             0x000000001019d540 main + 4270551792
35 libc.so.6         0x00000080c374bcf8
36 libc.so.6         0x00000080c374bef0 __libc_start_main + 4293374496
Stack dump:
0. Program arguments:
/usr/local/bg_soft/clang/llvm.r190771/r190771-20130914/bin/clang -cc1
-fopenmp -triple powerpc64-bgq-linux -S -disable-free -main-file-name
nbnxn_kernel_ref.c -static-define -mrelocation-model static
-mdisable-fp-elim -ffp-contract=fast -mconstructor-aliases -target-cpu a2q
-target-linker-version 2.20.51.0.2 -coverage-file
/tmp/nbnxn_kernel_ref-bb4750.s -resource-dir
/usr/local/bg_soft/clang/llvm.r190771/r190771-20130914/bin/../lib/clang/3.4
-D __bgclang__=1 -D __bgclang_version__="r000000-00000000" -D HAVE_CONFIG_H
-D md_EXPORTS -D NDEBUG -I
/bgsys/local/clang/llvm.r190771/r190771-20130914/sleef/include -I
/bgsys/local/clang/llvm.r190771/r190771-20130914/omp/include -I
/bgsys/drivers/V1R2M1/ppc64/comm/include -I
/bgsys/drivers/V1R2M1/ppc64/comm/lib/gnu -I /bgsys/drivers/V1R2M1/ppc64 -I
/bgsys/drivers/V1R2M1/ppc64/comm/sys/include -I
/bgsys/drivers/V1R2M1/ppc64/spi/include -I
/bgsys/drivers/V1R2M1/ppc64/spi/include/kernel/cnk -I
/homeb/zdv518/zdv518/git/bluegene-dev-r46/build-cmake-clang/src -I
/homeb/zdv518/zdv518/git/bluegene-dev-r46/build-cmake-clang/include -I
/homeb/zdv518/zdv518/git/bluegene-dev-r46/include -I
/homeb/zdv518/zdv518/progs/bgsys-clang/include -I
/bgsys/drivers/V1R2M1/ppc64/comm/include -internal-isystem
/usr/local/include -internal-isystem
/usr/local/bg_soft/clang/llvm.r190771/r190771-20130914/bin/../lib/clang/3.4/include
-internal-externc-isystem /include -internal-externc-isystem /usr/include
-O3 -Wall -Wno-unused -Wunused-value -fno-dwarf-directory-asm
-fdebug-compilation-dir
/homeb/zdv518/zdv518/git/bluegene-dev-r46/build-cmake-clang/src/mdlib
-ferror-limit 19 -fmessage-length 108 -mstackrealign -fno-signed-char
-fobjc-runtime=gcc -fobjc-default-synthesize-properties
-fdiagnostics-show-option -fcolor-diagnostics -vectorize-loops
-vectorize-slp -isystem
/bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/sys-include
-mllvm -optimize-regalloc -mllvm -fast-isel=0 -o
/tmp/nbnxn_kernel_ref-bb4750.s -x c
/homeb/zdv518/zdv518/git/bluegene-dev-r46/src/mdlib/nbnxn_kernels/nbnxn_kernel_ref.c
1. <eof> parser at end of file
2. Code generation
3. Running pass 'Function Pass Manager' on module
'/homeb/zdv518/zdv518/git/bluegene-dev-r46/src/mdlib/nbnxn_kernels/nbnxn_kernel_ref.c'.
4. Running pass 'PowerPC DAG->DAG Pattern Instruction Selection' on
function '@nbnxn_kernel_ref_rf_noener'
clang: error: unable to execute command: Aborted (core dumped)
clang: error: clang frontend command failed due to signal (use -v to see
invocation)
clang version 3.4 (trunk)
Target: powerpc64-bgq-linux
Thread model: posix
clang: note: diagnostic msg: PLEASE submit a bug report to
http://llvm.org/bugs/ and include the crash backtrace, preprocessed source,
and associated run script.
clang: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/nbnxn_kernel_ref-96ac7c.c
clang: note: diagnostic msg: /tmp/nbnxn_kernel_ref-96ac7c.sh
clang: note: diagnostic msg:

********************

I tried to check that the .sh file reproduces the crash above, but it
stopped earlier, with a missing-header error:

In file included from <built-in>:167:
<command line>:6:10: fatal error: 'qpxintrin.h' file not found
#include "qpxintrin.h"
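
Once the missing header is sorted out, one way to test my auto-vectorization guess would be to rerun the same command without the two vectorizer flags that appear in the program arguments above (-vectorize-loops and -vectorize-slp). The following is only a sketch: the helper name strip_vectorizer_flags and the output file name are made up for illustration, and it assumes the generated .sh keeps the whole -cc1 invocation on one line. If the assertion still fires with both flags removed, the vectorizers are probably not needed to trigger it, which would fit with the trace pointing at DAGCombiner during instruction selection.

```shell
# Hypothetical helper: print a copy of a clang crash-reproduction script
# with the cc1 vectorizer flags removed. Each flag is matched only as a
# whole word (followed by a space or end of line).
strip_vectorizer_flags() {
    sed -E \
        -e 's/ -vectorize-loops( |$)/\1/g' \
        -e 's/ -vectorize-slp( |$)/\1/g' \
        "$1"
}

# Example use (file names are illustrative):
#   strip_vectorizer_flags /tmp/nbnxn_kernel_ref-96ac7c.sh > novec.sh
#   sh novec.sh
```

Comparing the result of the original script against the stripped one (same -O3, same -target-cpu a2q) should show whether either vectorizer is involved.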

Hope that is useful - do let me know if I can be of further help!

Cheers,

Mark