[Llvm-bgq-discuss] trouble with latest clang install

Biddiscombe, John A. biddisco at cscs.ch
Fri Feb 14 15:49:26 CST 2014


Thomas

Great detective work. On my setup the code still crashes, but now in a
different place; I’ll look into it shortly.

I’ll post this here, but I’ll keep follow-ups off the llvm list, as I think
we can clear clang of any blame.

I’m really impressed with the help I’m getting.

Thanks everyone

JB

#0  __cxxabiv1::__cxa_throw (obj=0x31f71b20, tinfo=0xfffb78e1300, dest=@0x31e19fe8: 0x31c80680 <hpx::exception::~exception()>)
    at /bgsys/drivers/V1R2M1/ppc64/toolchain/gnu/gcc-4.4.6/libstdc++-v3/libsupc++/eh_throw.cc:71
#1  0x00000fffb661903c in hpx::error_code::error_code (this=0xfffaf5dce78, err=29, e=...)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/hpx/exception.hpp:1453
#2  0x00000fffb6618f4c in hpx::exception::get_error_code (this=0xfffaf5dd6b0)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/hpx/exception.hpp:503
#3  0x00000fffb65a0e10 in hpx::detail::is_of_lightweight_hpx_category (e=...)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/src/exception.cpp:190
#4  0x00000fffb65a04d4 in hpx::detail::get_exception<hpx::exception> (e=..., func=..., file=..., line=245)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/src/exception.cpp:198
#5  0x00000fffb65a18e0 in hpx::detail::throw_exception<hpx::exception> (e=..., func=..., file=..., line=245)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/hpx/exception.hpp:244
#6  0x00000fffaf34fd28 in ._ZN20performance_counters4sine11get_startupERN3hpx4util15function_nonserIFvvEEERb ()
   from /gpfs/bbp.cscs.ch/home/biddisco/bgas/build/hpx/lib/hpx/libsine.so.0.9.8
#7  0x00000fffacf8c09c in invoke_r<bool, bool, hpx::util::function_nonser<void ()> &, bool &, hpx::util::function_nonser<void ()> &, bool &> (f=0xfffaf5dd8b8, a0=..., a1=@0xfffaf5ddcff)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/hpx/util/preprocessed/invoke_5.hpp:187
#8  hpx::util::detail::vtable<true>::type<bool (*)(hpx::util::function_nonser<void ()> &, bool &), bool (hpx::util::function_nonser<void ()> &, bool &), void, void>::invoke(void **, hpx::util::function_nonser<void ()> &, bool &) (f=0xfffaf5dd8b8, a0=..., a1=@0xfffaf5ddcff)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/hpx/util/detail/preprocessed/vtable_5.hpp:112
#9  0x00000fffa6feb9d4 in operator() (startup_func=..., pre_startup=@0xfffaf5ddcff)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/hpx/util/detail/preprocessed/function_template_5.hpp:723
#10 hpx::components::startup_shutdown_provider::sined_startup(hpx::util::function_nonser<void ()> &, bool &) (startup_func=..., pre_startup=@0xfffaf5ddcff)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/examples/performance_counters/sine/sine.cpp:262
#11 0x00000fffa6fecfb4 in hpx::components::component_startup_shutdown<startup_shutdown_provider::sined_startup, startup_shutdown_provider::sined_shutdown>::get_startup_function(hpx::util::function_nonser<void ()> &, bool &) (this=0x31f6c1f8, startup_=..., pre_startup_=@0xfffaf5ddcff)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/hpx/runtime/components/component_startup_shutdown.hpp:37
#12 0x00000fffb6ac9bd4 in hpx::components::server::runtime_support::load_startup_shutdown_functions (this=0x31e795c0, d=..., ec=...)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/src/runtime/components/server/runtime_support_server.cpp:1077
#13 0x00000fffb6ac92f0 in hpx::components::server::runtime_support::load_component (this=0x31e795c0, d=..., ini=..., instance=..., component=..., lib=..., prefix=..., agas_client=..., isdefault=false, isenabled=true, options=..., startup_handled=...)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/src/runtime/components/server/runtime_support_server.cpp:1227
#14 0x00000fffb6ac36c8 in hpx::components::server::runtime_support::load_components (this=0x31e795c0, ini=..., prefix=..., agas_client=...)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/src/runtime/components/server/runtime_support_server.cpp:955
#15 0x00000fffb6abd60c in hpx::components::server::runtime_support::load_components (this=0x31e795c0)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/src/runtime/components/server/runtime_support_server.cpp:683
#16 0x00000fffb6b30af4 in operator()<hpx::components::server::runtime_support> (f=0x31f1a488, a0=<unknown type in /gpfs/bbp.cscs.ch/home/biddisco/bgas/build/hpx/lib/hpx/libhpxd.so.0, CU 0x1ecf399, DIE 0x2083640>)
    at /gpfs/bbp.cscs.ch/home/biddisco/src/hpx/hpx/runtime/actions/preprocessed/construct_continuation_functions_5.hpp:105
#17 invoke_r<hpx::threads::thread_state_enum, hpx::actions::action<hpx::components::server::runtime_support, bool, hpx::util:




On 14/02/14 21:02, "Thomas Heller" <thom.heller at gmail.com> wrote:

>Hi all,
>
>Ok, I think I tracked it down.
>If my suspicions are correct, the segfault isn't caused by bgclang or hpx
>directly. It looks like parts of boost can't deal with locales correctly
>on John's system. Here is how it happens:
>On a regular BGQ compute node, you don't have interactive access and, I
>think, no locale information is available. However, John's scenario is
>slightly different:
>1) He uses SLURM to get on the nodes (interactively or through batch jobs)
>2) He uses the BGAS nodes directly
>Now, 1) matters because of a SLURM feature: the bash it spawns once the
>job has enough resources inherits all the environment variables that were
>set at job submission (this includes LANG and LC_*). It looks like some
>flavors of Linux (especially in the embedded world) have a problem with
>this. I ran into a similar problem when porting HPX to the Xeon Phi.
>Everything was working nicely on our local machine (no job control,
>direct access through ssh etc.). I then moved on to Stampede; when
>logging into one of the Phis directly, everything still worked great, but
>only until I stopped using an interactive mode and started submitting
>jobs through the batch system. That led to problems similar to the ones
>John is running into right now ...
>About 2) ... I am not exactly sure how this is related to the problem at
>hand ...
>
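A quick way to see which locale settings a job actually inherited is to print them from inside the job. The sketch below only illustrates that check. The function name list_locale_env is made up for this example (it is not part of SLURM or HPX), and it assumes a POSIX shell with env and grep available on the node.

```shell
# Hypothetical helper (not part of SLURM or HPX): print every locale-related
# variable present in the current environment, one NAME=value per line.
# Exit status is 0 if at least one such variable is set, non-zero otherwise.
list_locale_env() {
    env | grep -E '^(LANG|LANGUAGE|LC_[A-Z_]+)='
}
```

Running list_locale_env once in the login shell and once inside the job should show whether LANG and the LC_* variables were carried over.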
>Anyway, I was able to reproduce the problem on one of the CNK-based
>compute nodes on JUQUEEN by using this jobscript:
># @ job_name = HPX_Hello_World
># @ comment = "HPX Hello World testrun"
># @ error = $(job_name).$(jobid).err
># @ output = $(job_name).$(jobid).out
># @ environment = COPY_ALL
># @ wall_clock_limit = 00:30:00
># @ notification = error
># @ notify_user = thom.heller at gmail.com
># @ job_type = bluegene
># @ bg_size = 32
># @ queue
>
>APP="$HOME/build/hpx/debug/bin/hello_world"
>
>ENVS="LANG=en_US LC_CTYPE=\"en_US\" LC_NUMERIC=\"en_US\" LC_TIME=en_GB
>LC_COLLATE=\"en_US\" LC_MONETARY=\"en_US\" LC_MESSAGES=\"en_US\"
>LC_PAPER=\"en_US\" LC_NAME=\"en_US\" LC_ADDRESS=\"en_US\"
>LC_TELEPHONE=\"en_US\" LC_MEASUREMENT=\"en_US\"
>LC_IDENTIFICATION=\"en_US\"
>LC_ALL=\"en_US\""
>
>runjob --ranks-per-node 1 --exe $APP --args "-t1" --envs $ENVS
>
>This led to the exact same error. What I am unsure about, though, is
>whose fault it is. The stack trace John posted earlier comes out of the
>static section of the binary, which initializes some globals from the
>boost filesystem library. So we have three candidates: 1) Boost.Filesystem,
>2) libc++, 3) the libc/posix on the BGAS node.
>
>The solution to this problem is, by the way, to unset all those
>environment variables.
>I committed a fix to HPX that works around this problem, so it should no
>longer be necessary to unset those environment variables manually
>(https://github.com/STEllAR-GROUP/hpx/commit/65ce125466ae43e68e19e89b3e50ece0721786de).
>Thanks for the patience.
>
>Regards,
>Thomas
>
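For builds that do not yet contain that fix, the manual workaround Thomas describes (unsetting the locale variables) can be wrapped in a small shell function. This is a minimal sketch under assumptions: the name run_without_locale is invented here, and it covers only the case where the binary is started directly from a shell, as on the BGAS nodes. It is not what the HPX commit does.

```shell
# Hypothetical wrapper (name chosen for this sketch): run a command with all
# locale-related variables removed from its environment. The unsetting happens
# in a subshell, so the calling shell keeps its own settings.
run_without_locale() {
    (
        for name in $(env | grep -E '^(LANG|LANGUAGE|LC_[A-Z_]+)=' | cut -d= -f1); do
            unset "$name"
        done
        exec "$@"
    )
}

# Example (binary path taken from the session quoted in this thread):
#   run_without_locale bin/hello_world
```

Doing the unset in a subshell keeps the interactive session's own locale intact, which matters if other tools on the login node rely on it.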
>On Friday, February 14, 2014 12:52:24 Biddiscombe, John A. wrote:
>> Hal
>> 
>> Apologies, I didn’t realize I was using the wrong wrapper.
>> 
>> I recompiled using the bgclang++11 wrapper and things work much better.
>> I first compiled boost ok, but had trouble linking to it - I ran into
>> the cxxABI link error with boost program_options:: __1 etc etc
>> 
>> A bit of googling around explained the std c++ lib issues to me, so
>> I had another go using the following settings …
>> 
>> export CC=/gpfs/bbp.cscs.ch/home/biddisco/bgas/apps/clang/bin/bgclang
>> export CXX=/gpfs/bbp.cscs.ch/home/biddisco/bgas/apps/clang/bin/bgclang++11
>> export PATH=/gpfs/bbp.cscs.ch/home/biddisco/bgas/apps/clang/bin:$PATH
>> 
>> I found some info about building boost with clang and followed the
>> instructions here
>> http://stackoverflow.com/questions/11081818/linking-troubles-with-boostprogram-options-on-osx-using-llvm?lq=1
>> I modified tools/build/v2/user-config.jam to include the clang-11 option
>> 
>> using clang : 11
>>     : "/gpfs/bbp.cscs.ch/home/biddisco/bgas/apps/clang/bin/bgclang++11"
>>     : <cxxflags>"-std=c++11 -stdlib=libc++ -ftemplate-depth=512"
>>       <linkflags>"-stdlib=libc++"
>>     ;
>> 
>> 
>> And then proceeded to build boost using the following commands
>> ./bootstrap.sh --with-toolset=clang-11
>> ./b2 -j 16 toolset=clang-11 cxxflags="-fPIC" --threading=multi
>> --without-mpi --without-python
>> --prefix=/gpfs/bbp.cscs.ch/home/biddisco/apps/clang/boost_1_54_0
>> 
>> And boost compiles fine.
>> "The Boost C++ Libraries were successfully built!"
>> 
>> To test, I compiled the boost serialisation demo from this page
>> http://www.boost.org/doc/libs/1_42_0/libs/serialization/example/demo.cpp
>> and also a simple boost::program_options demo and boost::filesystem demo;
>> they all run fine.
>> 
>> Thank you very much for the help and all the work you’ve put in getting
>> the clang stuff running.
>> 
>> But…
>> 
>> when I run simple demos from the HPX library
>> 
>> bbpbg2:~/bgas/build/hpx$ bin/hello_world
>> terminate called after throwing an instance of 'std::__1::runtime_error'
>>   what():  collate_byname<char>::collate_byname failed to construct for
>> Aborted (core dumped)
>> 
>> 
>> gdb shows me a trace …
>> (gdb) where
>> #0  0x00000fffb3458c5c in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:67
>> #1  0x00000fffb345abd4 in abort () at abort.c:92
>> #2  0x00000fffb3aa7b00 in __gnu_cxx::__verbose_terminate_handler ()
>>     at /bgsys/drivers/V1R2M1/ppc64/toolchain/gnu/gcc-4.4.6/libstdc++-v3/libsupc++/vterminate.cc:93
>> #3  0x00000fffb3aa4d74 in __cxxabiv1::__terminate (handler=<value optimized out>)
>>     at /bgsys/drivers/V1R2M1/ppc64/toolchain/gnu/gcc-4.4.6/libstdc++-v3/libsupc++/eh_terminate.cc:38
>> #4  0x00000fffb3aa4db8 in std::terminate ()
>>     at /bgsys/drivers/V1R2M1/ppc64/toolchain/gnu/gcc-4.4.6/libstdc++-v3/libsupc++/eh_terminate.cc:48
>> #5  0x00000fffb47b1c14 in .__clang_call_terminate ()
>>    from /gpfs/bbp.cscs.ch/home/biddisco/apps/clang/boost_1_54_0/lib/libboost_filesystem.so.1.54.0
>> #6  0x00000fffb47b48a0 in ._ZNK5boost10filesystem4path7compareERKS1_ ()
>>    from /gpfs/bbp.cscs.ch/home/biddisco/apps/clang/boost_1_54_0/lib/libboost_filesystem.so.1.54.0
>> Backtrace stopped: frame did not save the PC
>> 
>> 
>> It looks very suspicious as there are some libstdc++ appearances in
>> there.
>> 
>> Does anything here give you any idea of what might have gone wrong? I’ve
>> tried a number of rebuilds and the error persists, whilst simple demos
>> run ok. I’m not sure where to look to diagnose what’s up (I’ve contacted
>> the HPX people as well). One question is why the shared clang libc++
>> links to the libstdc++ one. If I do an
>> 
>> bbpbg2:~/bgas/build/c++test$ ldd /gpfs/bbp.cscs.ch/home/biddisco/bgas/apps/clang/libc++/lib/libc++.so.1.0
>> 	linux-vdso64.so.1 =>  (0x00000fff9ad40000)
>> 	libpthread.so.0 => /bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/lib/libpthread.so.0 (0x00000fff9ab00000)
>> 	librt.so.1 => /bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/lib/librt.so.1 (0x00000fff9a9d0000)
>> 	libc.so.6 => /bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/lib/libc.so.6 (0x00000fff9a790000)
>> 	libstdc++.so.6 => /bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/lib/libstdc++.so.6 (0x00000fff9a550000)
>> 	/lib64/ld64.so.1 (0x0000000032420000)
>> 	libm.so.6 => /bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/lib/libm.so.6 (0x00000fff9a430000)
>> 	libgcc_s.so.1 => /bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/lib/libgcc_s.so.1 (0x00000fff9a320000)
>> 
>> 
>> It seems odd. Could this be causing the trouble? (the demos run fine
>> though, so I guess not).
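The ldd check above can be scripted when several libraries need inspecting. The sketch below is only an illustration: the function name depends_on_libstdcxx is invented here, and it simply scans ldd output for a libstdc++ entry. The dependency by itself need not indicate a fault. libc++ can be built on top of the GCC runtime (libsupc++/libstdc++) for low-level ABI support such as exception handling, and the libsupc++ frames in the backtrace would be consistent with that, though this is an inference and not something confirmed in this thread.

```shell
# Hypothetical helper: read `ldd` output on standard input and succeed (exit
# status 0) if a libstdc++ shared object is listed among the dependencies.
depends_on_libstdcxx() {
    grep -qF 'libstdc++.so'
}

# Example, using the library path from the message above:
#   ldd /gpfs/bbp.cscs.ch/home/biddisco/bgas/apps/clang/libc++/lib/libc++.so.1.0 \
#       | depends_on_libstdcxx && echo "links libstdc++"
```

Reading from standard input keeps the helper usable on saved ldd output as well as on a live system.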
>> 
>> Anyway, I’ll keep poking around; if anything comes to mind, I’d be
>> grateful for help.
>> 
>> Thanks
>> 
>> JB
>>   
>> 
>> _______________________________________________
>> llvm-bgq-discuss mailing list
>> llvm-bgq-discuss at lists.alcf.anl.gov
>> https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
>


