[dcmf] Question about DCMF Library from a new user

Thu Feb 7 14:12:46 CST 2008

Rajesh,

These are excellent questions! We need to provide better documentation on 
how the callback flow works ... barring that, I'll answer your questions as 
they come and maybe we can pull together a document after answering these 
questions on the mailing list.

There are two types of callbacks - think of them as "local completion" and 
"remote notification" callbacks.

The local completion callbacks are specified not at registration time, but 
when an individual operation is started (DCMF_Send(), for example). This 
callback is invoked by the dcmf runtime, when the local node calls the 
DCMF_Messager_advance() function, after the source buffer has been 
completely sent.  Once the local completion callback is invoked, all 
buffers associated with the operation may be reused or deallocated.
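
To make the local completion flow concrete, here is a small toy model in C. 
It is only a sketch: every name in it (toy_send, toy_advance, 
toy_callback_t, TOY_PACKET_BYTES, and so on) is hypothetical and is not 
the DCMF API or its signatures. It assumes one in-flight send and a made-up 
4-byte packet size, and it shows only the ordering described above: 
starting the send fires nothing, and the completion callback runs from 
inside the advance call once the source buffer has been consumed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, NOT the DCMF API. */
typedef struct {
    void (*function)(void *clientdata);  /* invoked on local completion */
    void *clientdata;
} toy_callback_t;

typedef struct {
    const char    *src;        /* caller-owned source buffer */
    size_t         remaining;  /* bytes not yet handed to the "network" */
    toy_callback_t cb_done;    /* supplied per operation, not at registration */
} toy_request_t;

enum { TOY_PACKET_BYTES = 4 };             /* assumed packet size for the model */

static toy_request_t *toy_pending = NULL;  /* a single in-flight send, for brevity */

/* Start a send. Nothing is transmitted and no callback fires here. */
static void toy_send(toy_request_t *req, toy_callback_t cb_done,
                     const char *src, size_t bytes)
{
    assert(toy_pending == NULL);
    req->src = src;
    req->remaining = bytes;
    req->cb_done = cb_done;
    toy_pending = req;
}

/* Make progress: consume one packet of the source buffer, and invoke the
 * completion callback once the whole buffer has been consumed.
 * Returns 1 if there was work to do, 0 otherwise. */
static unsigned toy_advance(void)
{
    toy_request_t *req = toy_pending;
    if (req == NULL)
        return 0;

    size_t n = req->remaining < TOY_PACKET_BYTES ? req->remaining
                                                 : TOY_PACKET_BYTES;
    req->src += n;
    req->remaining -= n;

    if (req->remaining == 0) {
        toy_pending = NULL;
        if (req->cb_done.function)
            req->cb_done.function(req->cb_done.clientdata);
    }
    return 1;
}

/* Completion callback: sets the int flag that clientdata points to. */
static void toy_mark_done(void *clientdata)
{
    *(int *)clientdata = 1;
}

/* Send a buffer and poll until it is safe to reuse.
 * Returns how many advance calls were needed. */
static unsigned toy_send_and_wait(const char *buf, size_t bytes)
{
    int done = 0;
    toy_request_t req;
    toy_callback_t cb = { toy_mark_done, &done };
    unsigned polls = 0;

    toy_send(&req, cb, buf, bytes);
    while (!done) {
        toy_advance();
        ++polls;
    }
    return polls;
}
```

Calling toy_send_and_wait() on a 10-byte buffer takes three advance calls 
in this model (4 + 4 + 2 bytes). The point is that the buffer stays 
off-limits until the flag is set, and the flag can only be set while the 
caller is polling.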

The callbacks that are registered for DCMF_Send (and DCMF_Control, etc.) 
with the DCMF_Send_register() function are invoked by the dcmf runtime on 
the remote node when that node calls the DCMF_Messager_advance() function. 
Typically all nodes in the system will periodically poll with 
DCMF_Messager_advance() to make progress; however, the BGP messager can be 
configured to fire an interrupt when a core receives a packet.  In this 
interrupt mode active polling is not required, although you do take a 
performance hit because of the overhead of processing the interrupts.

The remote callbacks for DCMF_Send are invoked before any data has been 
written to its final destination on the remote node.  There are two 
callback types and each has a slightly different use by the application 
programmer.

The DCMF_RecvSendShort ("short") callbacks are invoked when the entire 
message has been received by the remote node into a temporary location (on 
BGP this is a single packet of data that has been received by the DMA into 
a memory fifo).  The application's responsibility is to copy the data out 
of the temporary buffer and into the final destination buffer. This 
callback type was created specifically to allow the dcmf implementation to 
optimize the performance for small messages.
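
As a rough illustration of the "short" path, the sketch below models a 
runtime that hands the handler a pointer into a temporary slot and then 
recycles that slot. All names here (toy_recv_short_fn, toy_on_short, 
toy_deliver_short, toy_mailbox_t) and the 64-byte slot size are 
hypothetical, and the handler signature is not the real DCMF_RecvSendShort 
signature. The model also assumes the temporary buffer is no longer valid 
once the handler returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler shape, NOT the real DCMF_RecvSendShort signature. */
typedef void (*toy_recv_short_fn)(void *clientdata, size_t peer,
                                  const char *src, size_t bytes);

/* Application-owned final destination for short messages. */
typedef struct {
    char   data[64];
    size_t bytes;
    size_t from;
} toy_mailbox_t;

/* "Short" handler: the whole message is already in a temporary buffer,
 * so the only job is to copy it out before returning. */
static void toy_on_short(void *clientdata, size_t peer,
                         const char *src, size_t bytes)
{
    toy_mailbox_t *box = clientdata;
    assert(bytes <= sizeof box->data);
    memcpy(box->data, src, bytes);  /* assumed: src is valid only during this call */
    box->bytes = bytes;
    box->from = peer;
}

/* Stand-in for the runtime: the packet lands in a temporary slot, the
 * handler is invoked, and then the slot is overwritten for reuse. */
static void toy_deliver_short(toy_recv_short_fn handler, void *clientdata,
                              size_t peer, const char *payload, size_t bytes)
{
    char fifo_slot[64];
    assert(bytes <= sizeof fifo_slot);
    memcpy(fifo_slot, payload, bytes);
    handler(clientdata, peer, fifo_slot, bytes);
    memset(fifo_slot, 0xEE, sizeof fifo_slot);  /* old contents are gone */
}
```

Because the slot is overwritten after the handler returns, a handler that 
stored the src pointer instead of copying the bytes would read garbage 
later. That is the reason the copy is the application's responsibility.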

The DCMF_RecvSend ("long" or "asynchronous") callbacks are invoked when 
the control information has been received by the remote node into a 
temporary location (on BGP the control information will be contained in a 
single packet).  The application's responsibility is to allocate memory 
(DCMF_Request_t) for the dcmf runtime to use to receive the rest of the 
data, as well as specify the destination buffer and length and a 
("recv_done") callback. This "recv_done" callback is invoked by the dcmf 
runtime when the data has been completely received and written to the 
destination buffer.  Typically applications will free/deallocate the 
previously allocated DCMF_Request_t memory in this "recv_done" callback.
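
The "long" flow can be sketched the same way. This is again a toy model 
with hypothetical names (toy_recv_fn, toy_recv_plan_t, toy_on_long, 
toy_deliver_long, toy_inbox_t); the real DCMF_RecvSend signature and the 
layout of DCMF_Request_t are not reproduced here. It shows the division of 
labor described above: the handler sees only control information, supplies 
request memory, a destination buffer, and a recv_done callback, and the 
runtime writes the payload and then invokes recv_done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical names, NOT the DCMF API. */
typedef struct {
    void (*function)(void *clientdata);
    void *clientdata;
} toy_callback_t;

typedef struct { char opaque[32]; } toy_request_t;  /* runtime scratch state */

/* What the handler hands back to the runtime. */
typedef struct {
    toy_request_t *request;    /* memory the runtime uses during the receive */
    char          *dst;        /* final destination buffer */
    size_t         dst_bytes;  /* how much the runtime may write */
    toy_callback_t recv_done;  /* invoked after the last byte is written */
} toy_recv_plan_t;

typedef void (*toy_recv_fn)(void *clientdata, size_t peer, size_t sndlen,
                            toy_recv_plan_t *plan);

/* Application-side state for one incoming message. */
typedef struct {
    char          *buf;
    size_t         bytes;
    int            complete;
    toy_request_t *request;
} toy_inbox_t;

/* recv_done: the data is in place, so the request memory can be freed. */
static void toy_recv_done(void *clientdata)
{
    toy_inbox_t *in = clientdata;
    free(in->request);
    in->request = NULL;
    in->complete = 1;
}

/* "Long" handler: only control information is available at this point. */
static void toy_on_long(void *clientdata, size_t peer, size_t sndlen,
                        toy_recv_plan_t *plan)
{
    toy_inbox_t *in = clientdata;
    (void)peer;
    in->buf = malloc(sndlen ? sndlen : 1);
    in->request = malloc(sizeof *in->request);
    assert(in->buf != NULL && in->request != NULL);
    in->bytes = sndlen;
    in->complete = 0;

    plan->request = in->request;
    plan->dst = in->buf;
    plan->dst_bytes = sndlen;
    plan->recv_done.function = toy_recv_done;
    plan->recv_done.clientdata = in;
}

/* Stand-in for the runtime: control packet first, payload afterwards. */
static void toy_deliver_long(toy_recv_fn handler, void *clientdata,
                             size_t peer, const char *payload, size_t bytes)
{
    toy_recv_plan_t plan;
    handler(clientdata, peer, bytes, &plan);  /* no payload delivered yet */

    size_t n = bytes < plan.dst_bytes ? bytes : plan.dst_bytes;
    memcpy(plan.dst, payload, n);             /* the rest of the data arrives */
    plan.recv_done.function(plan.recv_done.clientdata);
}
```

In this model the application still owns the destination buffer after 
recv_done fires and frees it whenever it is finished with the data. Only 
the request memory is released inside the callback.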

DCMF_Put and memory regions (i.e., registration, pinning)

The DCMF_Put in the library is just stubbed in as we didn't have a need 
for it in release 1 of the BG/P software.  However, we are actively 
working on adding the DCMF_Put() into the API which will also require a 
memory region API. The existing DCMF_Get API will be updated to use these 
new memory region objects.  Perhaps we should go into more detail on the 
memory region API in a separate email.

I hope this helps!

Michael Blocksome
Blue Gene Messaging Team Lead
Advanced Systems SW Development
blocksom at us.ibm.com

"Rajesh Nishtala" <rajeshn at eecs.berkeley.edu> 
Sent by: dcmf-bounces at lists.anl-external.org
02/07/2008 12:01 PM

To
dcmf at lists.anl-external.org
cc
upc-devel at lbl.gov
Subject
[dcmf] Question about DCMF Library from a new user

Hi,
 I am porting GASNet, our portable runtime layer for the Berkeley UPC
compiler, to the BlueGene/P and I'm using DCMF as the lower level
messaging layer. I have some high level questions regarding the
library that will influence the design of our BlueGene/P port. The
main difference between our library and MPI is that we focus on
one-sided communication so my main questions are regarding these
issues.

+ When does the callback that gets registered with DCMF_Send() get
called? Does it get called after the data has been committed to the
memory on the remote node or does it simply imply that the data buffer
is safe to reuse on the local node?

+ I notice that when I do an nm on the dcmf libraries there is a
DCMF_Put() function; however, when I waded through the code a little
bit more I noticed that the function simply called an abort, which to
me implies that it is not implemented. Is this why it doesn't show up
in the dcmf.h header files?

+ I have heard that the BlueGene/P supports RDMA operations. Are there
any special considerations for managing the memory registration (i.e.
memory pinning) to enable these operations or is this done
automatically under the covers?

Thanks in advance for any help!

Sincerely,
Rajesh Nishtala