[Llvm-bgq-discuss] [alcf-support #325179] Opening application executable failed, errno 2 No such file or directory
Hal Finkel
hfinkel at anl.gov
Fri Feb 3 14:42:04 CST 2017
Hi Jozsef,
[-support; cc'ing support and this mailing list is going to be confusing
because not all of the messages will appear on the mailing list]
Can you provide the backtrace? I don't recall running into a problem in
this specific place, but I have seen problems with streams in the past
for various reasons (e.g., features such as basic locale support that
BG/Q does not provide).
-Hal
On 02/01/2017 12:29 PM, Jozsef Bakosi wrote:
> Hi Ramesh and Tim,
>
> Thanks for your help. I recompiled with debug info, ran using a single core, and
> used the coreprocessor to find that I get the segfault from the standard
> library, libc++:
>
> Location: /soft/compilers/bgclang/r284961-stable/libc++/include/c++/v1/sstream:246:
>
> 241 template <class _CharT, class _Traits, class _Allocator>
> 242 basic_stringbuf<_CharT, _Traits, _Allocator>::basic_stringbuf(ios_base::openmode __wch)
> 243 : __hm_(0),
> 244 __mode_(__wch)
> 245 {
> 246 str(string_type());
> 247 }
>
> I'm CCing the bgclang list. Has anyone ever seen this basic_stringbuf
> constructor segfaulting at this location? Is there another libc++ version I can
> try?
>
> In the meantime, I will probably try using GNU libstdc++ instead of libc++.
>
> Thanks for all your help,
> Jozsef
>
> On 01.31.2017 22:16, Balakrishnan, Ramesh wrote:
>> We have a Perl-based tool called [1]coreprocessor.pl. Make sure you
>> compile your code with the -g flag (in addition to the others that you
>> use) and use this tool to look at the core files (assuming that you are
>> getting core files). If you are not getting core files, you may want to
>> force the job to produce core files by using [2]--env
>> BG_COREDUMPONEXIT=1 in your qsub invocation.
>>
>> Hope this helps.
>>
>> Ramesh
>>
>> On Jan 31, 2017, at 3:56 PM, Jozsef Bakosi <jbakosi at lanl.gov> wrote:
>>
>> Hi Ramesh,
>> I have built the executable using mpic++11. Is there a way to get more
>> information than the following?
>> 2017-01-31 21:41:37.936 (WARN ) [0x4000122bde0] CET-02400-13731-128:1911876:ibm.runjob.client.Job: terminated by signal 11
>> 2017-01-31 21:41:37.936 (WARN ) [0x4000122bde0] CET-02400-13731-128:1911876:ibm.runjob.client.Job: abnormal termination by signal 11 from rank 16
>> Thanks,
>> Jozsef
>> On 01.31.2017 21:32, Balakrishnan, Ramesh wrote:
>>
>> Jozsef,
>> I am not sure how you are building your code, but I noticed in your
>> earlier email that you are using bgclang++11. bgclang++11 is fine for
>> non-MPI builds, but you will need to pull in a long list of libraries
>> if you want to use bgclang++11 for building MPI code, and this route
>> can lead to runtime errors. Instead, can you try building your MPI
>> code with mpiclang++11 as opposed to bgclang++11? The mpiclang++11
>> wrapper around the bgclang++11 compiler will pull in all of the
>> libraries necessary for your MPI code.
>> Ramesh
>> On Jan 31, 2017, at 2:00 PM, Jozsef Bakosi <[1]jbakosi at lanl.gov> wrote:
>> Hi Ramesh,
>> Based on your qsub line I tried this:
>> $ qsub -t 10 -n 1 --mode c16 /home/jbakosi/code/quinoa/build/clang/Main/unittest -v
>> and besides 16 core files, I get the following in the job error file:
>> 2017-01-31 19:51:26.031 (INFO ) [0x4000122bde0] CET-40000-51331-128:1911641:ibm.runjob.client.Job: job 1911641 started
>> 2017-01-31 19:51:31.066 (INFO ) [0x40000c334e0] 15824:tatu.runjob.monitor: tracklib completed
>> 2017-01-31 19:51:43.674 (WARN ) [0x4000122bde0] CET-40000-51331-128:1911641:ibm.runjob.client.Job: terminated by signal 11
>> 2017-01-31 19:51:43.675 (WARN ) [0x4000122bde0] CET-40000-51331-128:1911641:ibm.runjob.client.Job: abnormal termination by signal 11 from rank 4
>> 2017-01-31 19:51:43.675 (INFO ) [0x4000122bde0] tatu.runjob.client: task terminated by signal 11
>> I guess it started fine, but it segfaults right away?
>> How can I get more detailed output from my application? My job output
>> file is zero length.
>> Jozsef
>> References
>> 1. mailto:jbakosi at lanl.gov
>>
>> References
>>
>> 1. http://www.alcf.anl.gov/user-guides/coreprocessor
>> 2. https://www.alcf.anl.gov/user-guides/core-file-settings
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory