[dcmf] running my own mpich

Kazutomo Yoshii kazutomo at mcs.anl.gov
Tue Feb 12 16:58:57 CST 2008


Jeff,

I saw your e-mail only after I had posted.
Thank you for explaining it again, and for your quick work on the SPI library.
I linked against your libSPI.a and tested my simple mpich sample.
It worked.


I have a general request: a compile-time
library version check, something like
#if LIBXXX_VERSION <  10
#error "requires libxxx 10 or later"
#endif
or an equivalent.

It would be nice to have something like this in the future.
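
A minimal sketch of what such a check could look like for the SPI library.
All names here (LIBSPI_VERSION_MAJOR, LIBSPI_VERSION_MINOR,
LIBSPI_MAKE_VERSION, libspi_runtime_version, libspi_version_ok) and the
version numbers are hypothetical; nothing like this exists in libSPI today
as far as I know. The run-time part is included because a header check
alone cannot detect an old binary libSPI.a.

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical: a real libSPI header would define these. They are spelled
 * out here only so that the sketch compiles on its own. */
#define LIBSPI_VERSION_MAJOR 1
#define LIBSPI_VERSION_MINOR 2

/* Pack major/minor into one integer that the preprocessor can compare. */
#define LIBSPI_MAKE_VERSION(maj, min) ((maj) * 1000 + (min))
#define LIBSPI_VERSION \
    LIBSPI_MAKE_VERSION(LIBSPI_VERSION_MAJOR, LIBSPI_VERSION_MINOR)

/* Compile-time check: refuse to build against headers that are too old. */
#if LIBSPI_VERSION < LIBSPI_MAKE_VERSION(1, 1)
#error "requires libSPI 1.1 or later"
#endif

/* The "something equivalent" in C11: same condition as a static assertion. */
static_assert(LIBSPI_VERSION >= LIBSPI_MAKE_VERSION(1, 1),
              "requires libSPI 1.1 or later");

/* Hypothetical: in a real library this function would be compiled into
 * libSPI.a, so it would report the version the binary was built from. */
int libspi_runtime_version(void)
{
    return LIBSPI_VERSION;
}

/* Run-time check: returns 1 if the linked library is at least as new as
 * the version the caller was compiled against, 0 otherwise. */
int libspi_version_ok(int header_version)
{
    return libspi_runtime_version() >= header_version;
}
```

A caller would evaluate libspi_version_ok(LIBSPI_VERSION) once at startup
and abort with a clear message if it returns 0.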

- kaz


Jeff Parker wrote:
> From my previous post to this topic (which you may not have seen)...
> 
> We have not posted source to the wiki for building libSPI.cna.a.  The
> "runtime source tarball" currently on the wiki contains the DMA SPI source,
> but other things are needed to build libSPI.cna.a that were not ready yet.
> Shortly, we will post a new runtime source tarball that will build
> libSPI.cna.a so you can build it yourself.
> 
> As of today, we are not yet ready to post the source for the other .c
> files.  Stay tuned.
> 
> Jeff Parker
> IBM Blue Gene Messaging
> 
> 
> 
> From:    Kazutomo Yoshii <kazutomo at mcs.anl.gov>
> Sent by: dcmf-bounces at lists.anl-external.org
> Date:    02/12/2008 04:34 PM
> To:      Robert Latham <robl at mcs.anl.gov>
> cc:      dcmf at lists.anl-external.org
> Subject: Re: [dcmf] running my own mpich
> 
> 
> 
> 
> 
> I found that we need a new SPI library to fix this problem.
> 
> I actually managed to build a new SPI library (DMA part only)
> by mixing ppcfloor's binary libSPI.a with objects compiled from
> BGP_DMA_runtime.tar.gz. I just wanted to check whether
> the SPI library was the cause. With the new hacked SPI library,
> my dcmf worked without a segv.
> 
> Could someone upload all of the SPI source code to the wiki page?
> The runtime SPI library probably contains the following files:
> 
> bgp_cna_SPI.c
> spi_collective.c
> UPC.c
> DMA_Counter.c
> DMA_Descriptors.c
> DMA_InjFifo.c
> DMA_RecFifo.c
> 
> We have DMA_*.c now.
> 
> - kaz
> 
> 
> 
>> I've made some changes to ROMIO and would like to test them out.  I've
>> built an mpich library with the 'make mpich' rule, and that goes
>> just fine: I've got an install/bin/mpicc which links in
>> install/lib/libdcmfcoll.cnk.a, install/lib/libdcmf.cnk.a and
>> install/lib/libmpich.cnk.a.
>>
>> So far, everything looks normal.
>>
>> When I try to run the resulting program, I get a segfault.  Here's the
>> output after running the stack dump from one of the lightweight core
>> files through addr2line:
>>
>> 0x010fa338
>> DMA_InjFifoRgetFifoFullInit
>> ??:0
>> 0x01304834
>> ??
>> ??:0
>> 0x010cd56c
>> DCMF::DMA::Device::initGroups()
>> /home/robl/src/bgp.comm/sys/build-dcmf/../messaging/devices/prod/dma/Init.cc:308
>> 0x010cd91c
>> DCMF::DMA::Device::initDMADevice()
>> /home/robl/src/bgp.comm/sys/build-dcmf/../messaging/devices/prod/dma/Init.cc:397
>> 0x010bc2dc
>> BGPMessager
>> /home/robl/src/bgp.comm/sys/build-dcmf/../messaging/messager/prod/bgp/msgr.h:81
>> 0x010b2cec
>> DCMF::BGPMessager::generate()
>> /home/robl/src/bgp.comm/sys/build-dcmf/../messaging/messager/prod/bgp/msgr.h:105
>> 0x0102563c
>> MPID_Init
>> /gpfs/home/robl/src/bgp.comm/lib/mpi/mpich2/src/mpid/dcmf/src/misc/mpid_init.c:63
>> 0x0100cf58
>> MPIR_Init_thread
>> /gpfs/home/robl/src/bgp.comm/lib/mpi/mpich2/src/mpi/init/initthread.c:236
>> 0x0100cd1c
>> PMPI_Init
>> /gpfs/home/robl/src/bgp.comm/lib/mpi/mpich2/src/mpi/init/init.c:93
>> 0x010013a4
>> main
>> /home/robl/src/darray-io.c:51
>> 0x011004c0
>> generic_start_main
>> ../csu/libc-start.c:231
>> 0x01100734
>> __libc_start_main
>> ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:137
>> 0xfffffffc
>> ??
>> ??:0
>>
>>
>> If I had to guess, I'd say that the libdcmf in the development tree is
>> incompatible with Argonne's V1R1M2_500_2007-071213P driver.  What's
>> the best way to test out my ROMIO changes?
>>
>> Thanks
>> ==rob
>>
> 
> _______________________________________________
> dcmf mailing list
> dcmf at lists.anl-external.org
> http://lists.anl-external.org/cgi-bin/mailman/listinfo/dcmf
> http://dcmf.anl-external.org/wiki
> 
> 



