[Llvm-bgq-discuss] Dynamic linking failure

Thomas Gooding tgooding at us.ibm.com
Fri May 9 10:36:32 CDT 2014


I'm not sure how much faith to put in the ldd output.  It would probably be
better to see the following output:
*  objdump -p <filename>
*  run with --strace 0 enabled.  (or the slurm/cobalt equivalent)
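
For the objdump step, a minimal Python sketch along these lines could pull the
interesting entries (NEEDED, RPATH, RUNPATH) out of the "Dynamic Section" that
'objdump -p' prints.  The function names are hypothetical, and the parsing
assumes the usual binutils layout of one tag and one value per line:

```python
import subprocess


def parse_dynamic_section(objdump_text):
    """Collect NEEDED, RPATH and RUNPATH entries from `objdump -p` output."""
    wanted = ("NEEDED", "RPATH", "RUNPATH")
    entries = {key: [] for key in wanted}
    for line in objdump_text.splitlines():
        # Assumed layout: "  NEEDED               libc.so.6"
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0] in wanted:
            entries[fields[0]].append(fields[1].strip())
    return entries


def dynamic_entries(path):
    """Run `objdump -p` on `path` (needs binutils installed) and parse it."""
    result = subprocess.run(["objdump", "-p", path], check=True,
                            capture_output=True, text=True)
    return parse_dynamic_section(result.stdout)
```

Calling dynamic_entries("<filename>") on the front-end node would then show
which library names and embedded search paths the loader has to work with,
independent of what ldd reports.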

The 'ldd' output doesn't precisely reflect how dynamic libraries are loaded
on CNK.  There are some subtle differences:
1) There's no vdso layer on CNK, so the "linux-vdso64.so.1 =>
(0x00000fff9ae40000)" line is incorrect.
2) 'ldd' doesn't use the same ld.so cache file.  CNK uses
'/etc/ld.so.bgq.cache', whereas 'ldd' appears to use '/etc/ld.so.cache'.
3) File system paths and permissions may differ between the ionode and FEN
(although I don't suspect that here).
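
To make the first difference concrete, here is a small Python sketch
(parse_ldd is a hypothetical helper, not part of any BG/Q tooling) that turns
ldd output into a name-to-path map and drops the vdso entry, since that entry
has no counterpart on CNK.  The paths it returns are still only what ldd
resolved on the front-end node, so they remain subject to differences 2 and 3:

```python
def parse_ldd(ldd_text):
    """Map each library name in `ldd` output to the path ldd resolved.

    The linux-vdso entry is skipped because CNK has no vdso layer.
    """
    resolved = {}
    for raw in ldd_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("linux-vdso"):
            continue
        if "=>" in line:
            name, _, rest = line.partition("=>")
            tokens = rest.split()
            # After "=>" comes either "path (address)" or just "(address)".
            has_path = bool(tokens) and not tokens[0].startswith("(")
            resolved[name.strip()] = tokens[0] if has_path else None
        else:
            # A bare "path (address)" line, typically the loader itself.
            path = line.split()[0]
            resolved[path] = path
    return resolved
```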

Compare the ldd output for a simple dynamically-linked hello world with a run
on CNK with strace enabled.  You'll see they pick up different paths:

% ldd dynhelloworld.elf
	linux-vdso64.so.1 =>  (0x00000fffab9a0000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x0000008078a70000)
	libc.so.6 => /lib64/libc.so.6 (0x0000008078880000)
	/lib64/bgq/ld64.so.1 => /lib64/ld64.so.1 (0x0000000033c70000)

whereas:

% runjob --block R00-M0-N00 --np 1 --strace 0 : dynhelloworld.elf
...
(I) sc_open[0.0:1]: pathname="dynhelloworld.elf", flags=0x00000000 mode=
(I) sc_read[0.0:1]: fd=3, buf=0x1dbfffb130, cnt=832
(I) sc_open[0.0:1]: pathname="/etc/ld.so.bgq.cache", flags=0x00000000 mode=
(I) sc_open[0.0:1]: pathname="/usr/lib64/bgq/tls/libpthread.so.0", flags=0x00000000 mode=
(I) sc_open[0.0:1]: pathname="/usr/lib64/bgq/libpthread.so.0", flags=0x00000000 mode=
(I) sc_open[0.0:1]: pathname="/lib64/bgq/tls/libpthread.so.0", flags=0x00000000 mode=
(I) sc_open[0.0:1]: pathname="/lib64/bgq/libpthread.so.0", flags=0x00000000 mode=
(I) sc_read[0.0:1]: fd=3, buf=0x1dbfffab70, cnt=832
(I) sc_open[0.0:1]: pathname="/usr/lib64/bgq/libc.so.6", flags=0x00000000 mode=
(I) sc_open[0.0:1]: pathname="/lib64/bgq/libc.so.6", flags=0x00000000 mode=
(I) sc_read[0.0:1]: fd=3, buf=0x1dbfffab30, cnt=832

ld.so is very search-y when looking for libraries.  But it found:
	libpthread.so.0 	at /lib64/bgq/libpthread.so.0	(not /lib64/libpthread.so.0)
	libc.so.6 	at /lib64/bgq/libc.so.6		(not /lib64/libc.so.6)

The files in /lib64/bgq/* were compiled for CNK, whereas the files in
/lib64/* are Linux libraries.
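
If you want to pull the "found" paths out of a longer --strace log
mechanically, a rough Python sketch follows.  It rests on a heuristic that is
my assumption, not something the trace states: the log lines above show no
return codes, so an sc_open that is immediately followed by an sc_read is
treated as the open that succeeded.  A file that is opened but never read
would be missed.  The function name is hypothetical:

```python
import re

# Matches the syscall lines in the trace, e.g. "(I) sc_open[0.0:1]: ..."
CALL = re.compile(r'\(I\) sc_(open|read)\[')
PATHNAME = re.compile(r'pathname="([^"]+)"')


def opened_paths(trace_text):
    """Return paths whose sc_open is directly followed by an sc_read."""
    calls = []
    for line in trace_text.splitlines():
        match = CALL.search(line)
        if not match:
            # Wrapped continuation lines ("flags=... mode=") land here.
            continue
        path = PATHNAME.search(line)
        calls.append((match.group(1), path.group(1) if path else None))
    found = []
    for (kind, path), (next_kind, _) in zip(calls, calls[1:]):
        if kind == "open" and next_kind == "read" and path:
            found.append(path)
    return found
```

Fed the trace lines from the hello world run, this would report
dynhelloworld.elf, /lib64/bgq/libpthread.so.0 and /lib64/bgq/libc.so.6.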

Hope this helps,
Tom

Tom Gooding
Senior Engineer / Blue Gene SW Lead / CAPI
tgooding at us.ibm.com   507-253-0747


llvm-bgq-discuss-bounces at lists.alcf.anl.gov wrote on 05/09/2014 08:24:04
AM:

> From: Hal Finkel <hfinkel at anl.gov>
> To: Phil Miller <mille121 at illinois.edu>
> Cc: llvm-bgq-discuss at lists.alcf.anl.gov
> Date: 05/09/2014 08:25 AM
> Subject: Re: [Llvm-bgq-discuss] Dynamic linking failure
> Sent by: llvm-bgq-discuss-bounces at lists.alcf.anl.gov
>
> ----- Original Message -----
> > From: "Phil Miller" <mille121 at illinois.edu>
> > To: "Hal Finkel" <hfinkel at anl.gov>
> > Cc: llvm-bgq-discuss at lists.alcf.anl.gov
> > Sent: Thursday, May 8, 2014 7:33:23 PM
> > Subject: Re: [Llvm-bgq-discuss] Dynamic linking failure
> >
> >
> > So, when I forced the matter by setting
> >
> > LD_LIBRARY_PATH=/bgsys/drivers/toolchain/V1R2M1_base_4.7.2/gnu-linux-4.7.2/powerpc64-bgq-linux/lib/:/bgsys/drivers/ppcfloor/comm/lib/
> >
> > in my execution, it was successful. I'm not sure whether that should
> > have worked automatically or not.
> >
>
> Yes, I think so. If it works with ldd on the login then it
> definitely should work without any special LD_LIBRARY_PATH.
>
>  -Hal
>
> >
> >
> >
> > On Thu, May 8, 2014 at 7:21 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> >
> >
> >
> >
> > ----- Original Message -----
> > > From: "Phil Miller" < mille121 at illinois.edu >
> > > To: llvm-bgq-discuss at lists.alcf.anl.gov
> > > Sent: Thursday, May 8, 2014 1:53:10 PM
> > > Subject: [Llvm-bgq-discuss] Dynamic linking failure
> > >
> > >
> > >
> > >
> > > I've compiled my application using bgclang/bgclang++ on Vesta, and
> > > the process goes smoothly. When I use a static linked build of the
> > > system, it runs cleanly.
> > >
> > > I want to try out Address Sanitizer (aka 'asan', activated with
> > > '-fsanitize=address'), which requires dynamic linking. Sadly, that
> > > lets me compile and link, but fails to run. Here's what I'm seeing,
> > > again on Vesta:
> > >
> > > ===============
> > > $ file check
> > > > check: ELF 64-bit MSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.4.21, not stripped
> > >
> > > $ echo $LD_LIBRARY_PATH | tr : '\n'
> > > /bgsys/drivers/ppcfloor/comm/lib
> > > /bgsys/drivers/ppcfloor/comm/gcc/lib
> > > /soft/compilers/ibmcmp-feb2014/vac/bg/12.1/bglib64
> > > /soft/compilers/ibmcmp-feb2014/vacpp/bg/12.1/bglib64
> > > /soft/compilers/ibmcmp-feb2014/xlf/bg/14.1/bglib64
> > > /soft/compilers/ibmcmp-feb2014/xlmass/bg/7.3/bglib64
> > > /soft/compilers/ibmcmp-feb2014/xlsmp/bg/3.1/bglib64
> > > /dbhome/db2cat/sqllib/lib64
> > > /dbhome/db2cat/sqllib/lib32
> > >
> > > $ ldd check
> > > linux-vdso64.so.1 => (0x00000fff9ae40000)
> > > > libdl.so.2 => /bgsys/drivers/toolchain/V1R2M1_base_4.7.2/gnu-linux-4.7.2/powerpc64-bgq-linux/lib/libdl.so.2 (0x00000fff9ad20000)
> > > > libpami-gcc.so => /bgsys/drivers/ppcfloor/comm/lib/libpami-gcc.so (0x00000fff9a7b0000)
> > > > libpthread.so.0 => /bgsys/drivers/toolchain/V1R2M1_base_4.7.2/gnu-linux-4.7.2/powerpc64-bgq-linux/lib/libpthread.so.0 (0x00000fff9a690000)
> > > > librt.so.1 => /bgsys/drivers/toolchain/V1R2M1_base_4.7.2/gnu-linux-4.7.2/powerpc64-bgq-linux/lib/librt.so.1 (0x00000fff9a560000)
> > > > libm.so.6 => /bgsys/drivers/toolchain/V1R2M1_base_4.7.2/gnu-linux-4.7.2/powerpc64-bgq-linux/lib/libm.so.6 (0x00000fff9a440000)
> > > > libstdc++.so.6 => /bgsys/drivers/toolchain/V1R2M1_base_4.7.2/gnu-linux-4.7.2/powerpc64-bgq-linux/lib/libstdc++.so.6 (0x00000fff9a210000)
> > > > libgcc_s.so.1 => /bgsys/drivers/toolchain/V1R2M1_base_4.7.2/gnu-linux-4.7.2/powerpc64-bgq-linux/lib/libgcc_s.so.1 (0x00000fff9a100000)
> > > > libc.so.6 => /bgsys/drivers/toolchain/V1R2M1_base_4.7.2/gnu-linux-4.7.2/powerpc64-bgq-linux/lib/libc.so.6 (0x00000fff99ed0000)
> > > /lib64/ld64.so.1 (0x000000003cc80000)
> > >
> > > $ qsub -t 10 -A PARTS -n 1 --mode c1 ./check
> > > 190745
> > >
> > > $ cat 190745.error
> > > 2014-05-08 18:45:16.828 (INFO ) [0x40000a3bc20]
> > > 27642:tatu.runjob.client: scheduler job id is 190745
> > > 2014-05-08 18:45:16.829 (DEBUG) [0x40000a3bc20]
> > > 27642:tatu.runjob.client: the environment variable COBALT_RESID did
> > > not contain a Cobalt reservation id
> > > 2014-05-08 18:45:16.844 (INFO ) [0x400004034d0]
> > > 27642:tatu.runjob.monitor: monitor started
> > > 2014-05-08 18:45:16.855 (INFO ) [0x40000a3bc20]
> > > 27642:ibm.runjob.AbstractOptions: using properties file
> > > /bgsys/local/etc/bg.properties
> > > 2014-05-08 18:45:16.856 (INFO ) [0x40000a3bc20]
> > > 27642:ibm.runjob.AbstractOptions: max open file descriptors: 65536
> > > 2014-05-08 18:45:16.856 (INFO ) [0x40000a3bc20]
> > > 27642:ibm.runjob.AbstractOptions: core file limit:
> > > 18446744073709551615
> > > 2014-05-08 18:45:16.977 (INFO ) [0x400004034d0]
> > > 27642:tatu.runjob.monitor: task record 645048 created
> > > 2014-05-08 18:45:16.978 (INFO ) [0x40000a3bc20]
> > > VST-20420-31531-32:27642:ibm.runjob.client.options.Parser: set
> > > local
> > > socket to runjob_mux from properties file
> > > 2014-05-08 18:45:17.782 (INFO ) [0x400004034d0]
> > > 27642:tatu.runjob.monitor: tracklib completed
> > > 2014-05-08 18:45:19.162 (INFO ) [0x40000a3bc20]
> > > VST-20420-31531-32:848093:ibm.runjob.client.Job: job 848093 started
> > > /gpfs/vesta-home/phil/charm-6.6/pamilrts-bluegeneq-asan-clang/
> tests/util/./check:
> > > error while loading shared libraries: libpami-gcc.so: cannot open
> > > shared object file: No such file or directory
> > > 2014-05-08 18:45:20.952 (INFO ) [0x40000a3bc20]
> > > VST-20420-31531-32:848093:ibm.runjob.client.Job: exited with status
> > > 127
> > > 2014-05-08 18:45:20.952 (WARN ) [0x40000a3bc20]
> > > VST-20420-31531-32:848093:ibm.runjob.client.Job: normal termination
> > > with status 127 from rank 0
> > > 2014-05-08 18:45:20.952 (INFO ) [0x40000a3bc20] tatu.runjob.client:
> > > task exited with status 127
> > > 2014-05-08 18:45:20.952 (INFO ) [0x400004034d0]
> > > 27642:tatu.runjob.monitor: monitor terminating
> > > 2014-05-08 18:45:20.956 (INFO ) [0x40000a3bc20] tatu.runjob.client:
> > > monitor completed
> > > =========
> > >
> > > > Have I missed a step in running dynamically linked binaries on
> > > > BG/Q?
> >
> > No, I don't think so. I see no reason why this should not work. Can
> > you e-mail support about this?
> >
> > -Hal
> >
> > >
> > > _______________________________________________
> > > llvm-bgq-discuss mailing list
> > > llvm-bgq-discuss at lists.alcf.anl.gov
> > > https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
> > >
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> >
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>