[dcmf] running my own mpich

Jeff Parker jjparker at us.ibm.com
Tue Feb 12 16:52:59 CST 2008


From my previous post to this topic (which you may not have seen):

We have not posted source to the wiki for building libSPI.cna.a.  The
"runtime source tarball" currently on the wiki contains the DMA SPI source,
but other pieces needed to build libSPI.cna.a were not ready yet.
We will shortly post a new runtime source tarball that builds
libSPI.cna.a, so you can build it yourself.

As of today, we are not yet ready to post the source for the other .c
files.  Stay tuned.

Jeff Parker
IBM Blue Gene Messaging



                                                                           
From:    Kazutomo Yoshii <kazutomo at mcs.anl.gov>
Sent by: dcmf-bounces at lists.anl-external.org
Date:    02/12/2008 04:34 PM
To:      Robert Latham <robl at mcs.anl.gov>
cc:      dcmf at lists.anl-external.org
Subject: Re: [dcmf] running my own mpich
                                                                           





I found that we need a new SPI library to fix this problem.

I managed to build a new SPI library (DMA part only) by combining
ppcfloor's binary libSPI.a with objects compiled from
BGP_DMA_runtime.tar.gz. I just wanted to check whether the SPI
library was the cause. With the new hacked SPI library,
my dcmf ran without a segfault.

Could someone upload all of the SPI source code to the wiki page?
The runtime SPI library probably contains the following files:

bgp_cna_SPI.c
spi_collective.c
UPC.c
DMA_Counter.c
DMA_Descriptors.c
DMA_InjFifo.c
DMA_RecFifo.c

We already have the DMA_*.c files.
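
The gap between the files listed above and the DMA_*.c sources that are
already available can be worked out mechanically. A minimal sketch in
Python, assuming the seven names above are the complete list (kaz only
describes it as probable); the variable names are illustrative:

```python
from fnmatch import fnmatchcase

# File names as listed in the message; the list itself is kaz's guess.
SPI_SOURCES = [
    "bgp_cna_SPI.c",
    "spi_collective.c",
    "UPC.c",
    "DMA_Counter.c",
    "DMA_Descriptors.c",
    "DMA_InjFifo.c",
    "DMA_RecFifo.c",
]

# Pattern for the sources said to be available already.
AVAILABLE_PATTERN = "DMA_*.c"

# Anything that does not match the pattern is still missing.
missing = [name for name in SPI_SOURCES
           if not fnmatchcase(name, AVAILABLE_PATTERN)]

for name in missing:
    print(name)
```

Under that assumption, the files still needed are the three non-DMA ones.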

- kaz



> I've made some changes to ROMIO and would like to test them out.  I've
> built an mpich library with the 'make mpich' rule, and that goes
> just fine: I've got an install/bin/mpicc which links in
> install/lib/libdcmfcoll.cnk.a, install/lib/libdcmf.cnk.a, and
> install/lib/libmpich.cnk.a
>
> So far, everything looks normal.
>
> When I try to run the resulting program, I get a segfault.  Here's the
> output after running the stack dump in one of the lightweight core
> files through addr2line:
>
> 0x010fa338
> DMA_InjFifoRgetFifoFullInit
> ??:0
> 0x01304834
> ??
> ??:0
> 0x010cd56c
> DCMF::DMA::Device::initGroups()
> /home/robl/src/bgp.comm/sys/build-dcmf/../messaging/devices/prod/dma/Init.cc:308
> 0x010cd91c
> DCMF::DMA::Device::initDMADevice()
> /home/robl/src/bgp.comm/sys/build-dcmf/../messaging/devices/prod/dma/Init.cc:397
> 0x010bc2dc
> BGPMessager
> /home/robl/src/bgp.comm/sys/build-dcmf/../messaging/messager/prod/bgp/msgr.h:81
> 0x010b2cec
> DCMF::BGPMessager::generate()
> /home/robl/src/bgp.comm/sys/build-dcmf/../messaging/messager/prod/bgp/msgr.h:105
> 0x0102563c
> MPID_Init
> /gpfs/home/robl/src/bgp.comm/lib/mpi/mpich2/src/mpid/dcmf/src/misc/mpid_init.c:63
> 0x0100cf58
> MPIR_Init_thread
> /gpfs/home/robl/src/bgp.comm/lib/mpi/mpich2/src/mpi/init/initthread.c:236
> 0x0100cd1c
> PMPI_Init
> /gpfs/home/robl/src/bgp.comm/lib/mpi/mpich2/src/mpi/init/init.c:93
> 0x010013a4
> main
> /home/robl/src/darray-io.c:51
> 0x011004c0
> generic_start_main
> ../csu/libc-start.c:231
> 0x01100734
> __libc_start_main
> ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:137
> 0xfffffffc
> ??
> ??:0
>
>
> If I had to guess, I'd say that the libdcmf in the development tree is
> incompatible with Argonne's V1R1M2_500_2007-071213P driver.  What's
> the best way to test out my ROMIO changes?
>
> Thanks
> ==rob
>
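
When reading a trace like the one quoted above, it can help to group the
addr2line output into frames and pick out those with no source location.
A minimal sketch in Python, assuming the output comes as address /
function / file:line triples as in Rob's dump. The names parse_frames and
unresolved are made up for illustration, and SAMPLE is a shortened excerpt
of the quoted trace:

```python
def parse_frames(text):
    """Group addr2line-style output into (address, function, location) frames."""
    fields = []
    for line in text.splitlines():
        # Drop mail quote markers and surrounding whitespace; skip empty lines.
        cleaned = line.lstrip("> ").strip()
        if cleaned:
            fields.append(cleaned)
    if len(fields) % 3 != 0:
        raise ValueError("expected address/function/location triples")
    return [tuple(fields[i:i + 3]) for i in range(0, len(fields), 3)]


def unresolved(frames):
    """Return the frames for which no source location was reported."""
    return [frame for frame in frames if frame[2].startswith("??")]


# Shortened excerpt of the quoted trace, including one mail-wrapped path.
SAMPLE = """\
> 0x010fa338
> DMA_InjFifoRgetFifoFullInit
> ??:0
> 0x010cd56c
> DCMF::DMA::Device::initGroups()
>
/home/robl/src/bgp.comm/sys/build-dcmf/../messaging/devices/prod/dma/Init.cc:308

> 0x0102563c
> MPID_Init
> /gpfs/home/robl/src/bgp.comm/lib/mpi/mpich2/src/mpid/dcmf/src/misc/mpid_init.c:63
"""

frames = parse_frames(SAMPLE)
for address, function, location in unresolved(frames):
    print(address, function, location)
```

In this excerpt the only frame without a source location is
DMA_InjFifoRgetFifoFullInit. That fits a function coming from a library
built without debug information, though on its own it does not show where
the fault lies.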

_______________________________________________
dcmf mailing list
dcmf at lists.anl-external.org
http://lists.anl-external.org/cgi-bin/mailman/listinfo/dcmf
http://dcmf.anl-external.org/wiki




