[Llvm-bgq-discuss] clang code gen error -- Kernel_RanksToCoords

Michael Blocksome blocksom at us.ibm.com
Fri Jul 26 13:25:28 CDT 2013


Tom and I have discovered what appears to be a code gen problem when 
compiling the Kernel_RanksToCoords() CNK syscall. See Tom's analysis below.

GPR3 carries the first parameter on entry and the return value on exit, 
GPR4 receives the second parameter, and GPR5 receives the third parameter.


BTW, I'm using the latest "bgclang" wrapper script, but I've renamed it to 
powerpc64-bgq-linux-clang so it plays nice with autoconf.

Michael Blocksome
Blue Gene Messaging
blocksom at us.ibm.com

----- Forwarded by Michael Blocksome/Rochester/IBM on 07/26/2013 01:20 PM 

From:   Thomas Gooding/Rochester/IBM
To:     Michael Blocksome/Rochester/IBM at IBMUS, 
Date:   07/26/2013 11:20 AM
Subject:        Re: clang

Looks like the addresses are getting truncated to 32 bits.

after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14

{4}.16.1: TB=0000000773c39a5c FL_SYSCALLAT:0  Syscall 1055 at IP=0x0000000001001314    LR=0x00000000010012bc    SP=0x0000001dbfffb9a0 
{4}.16.1: TB=0000000773c39acc FL_SYSCALLEN:0  Syscall Entry GPR3=0x0000000000000008  GPR4=0x00000000bfffba18  GPR5=0x00000000bfffba10 
{4}.16.1: TB=0000000773c3a1a4 FL_SYSCALLRT:0  Syscall Return GPR3=

 10012c0:       38 60 04 1f     li      r3,1055
 10012c4:       38 9f 00 a8     addi    r4,r31,168
 10012c8:       38 df 00 b0     addi    r6,r31,176
 10012cc:       38 ff 00 b8     addi    r7,r31,184
 10012d0:       39 1f 00 c0     addi    r8,r31,192
 10012d4:       e8 bf 00 80     ld      r5,128(r31)     <---- quite a few load/stores...  seems very wasteful.
 10012d8:       fb df 00 d0     std     r30,208(r31)
 10012dc:       fb bf 00 c8     std     r29,200(r31)
 10012e0:       e9 3f 00 d0     ld      r9,208(r31)
 10012e4:       e9 5f 00 c8     ld      r10,200(r31)
 10012e8:       f8 bf 00 d8     std     r5,216(r31)
 10012ec:       e8 bf 00 d8     ld      r5,216(r31)
 10012f0:       f9 3f 00 b0     std     r9,176(r31)
 10012f4:       f9 5f 00 a8     std     r10,168(r31)
 10012f8:       f8 bf 00 b8     std     r5,184(r31)
 10012fc:       f8 7f 00 c0     std     r3,192(r31)
 1001300:       80 a4 00 04     lwz     r5,4(r4)       <---- these are 32-bit loads, should be 64-bit. 
 1001304:       80 86 00 04     lwz     r4,4(r6)
 1001308:       80 67 00 04     lwz     r3,4(r7)
 100130c:       80 08 00 04     lwz     r0,4(r8)
 1001310:       44 00 00 02     sc

The inline assembly piece is:
#define CNK_SPI_SYSCALL_3(name, arg0, arg1, arg2) \
({ \
  register uint64_t r0 __asm__ ("r0") = (__NR_ ## name); \
  register uint64_t r3 __asm__ ("r3") = ((uint64_t) (arg0)); \
  register uint64_t r4 __asm__ ("r4") = ((uint64_t) (arg1)); \
  register uint64_t r5 __asm__ ("r5") = ((uint64_t) (arg2)); \
  __asm__ __volatile__ \
  ("sc" \
   : "=&r"(r0),"=&r"(r3),"=&r"(r4),"=&r"(r5) \
   :   "0"(r0),  "1"(r3),  "2"(r4),  "3"(r5) \
   : "r6","r7","r8","r9","r10","r11","r12","cr0","memory"); \
  r3; \
})

I don't see where they would be interpreted as 32 bits.  In the 
input/outputs section, "r" is a register for gcc assembly, which should be 
64-bit for a 64-bit compile. I presume LLVM is following gcc assembly 

The other comment was that Hal seems to be directing people to his 
"bgclang" wrapper script.  I'm not sure it will make a difference, but 
could be something to try. 


Tom Gooding
Senior Engineer / Blue Gene Kernels
507-253-0747  (internal:  553-0747)

From:   Michael Blocksome/Rochester/IBM
To:     Thomas Gooding/Rochester/IBM at IBMUS
Date:   07/26/2013 08:56 AM


I wrote a very simple test and compiled with the latest version of 
llvm/clang from Hal. 

$ cat kernel_rankstocoords.c 

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

#include "kernel/location.h"

int main (int argc, char * argv[])
{
  uint32_t rc = 1;
  size_t mapsize = 2*sizeof(BG_CoordinateMapping_t);
  BG_CoordinateMapping_t map[2];
  uint64_t n = 0;

  fprintf (stdout, "before .. Kernel_RanksToCoords (%zu, %p, %p {%lu}) = %d\n",
           mapsize, map, &n, n, rc);
  rc = Kernel_RanksToCoords (mapsize, map, &n);
  fprintf (stdout, "after ... Kernel_RanksToCoords (%zu, %p, %p {%lu}) = %d\n",
           mapsize, map, &n, n, rc);

  return 0;
}

$ /bghome/blocksom/development/c++11/install/powerpc64-bgq-linux-clang \
    kernel_rankstocoords.c -o kernel_rankstocoords.clang \
    -I/bgsys/drivers/ppcfloor/spi/include/kernel/cnk -I/bgsys/drivers/ppcfloor

When I run this I get the same error as when I ran pami compiled with clang:

$ runjob --block R00-M1-N10 --np 2 : kernel_rankstocoords.clang 
before .. Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 1
before .. Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 1
after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14

... but it runs fine when compiled with gcc:

$ /bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gcc \
    kernel_rankstocoords.c -o kernel_rankstocoords.gcc \
    -I/bgsys/drivers/ppcfloor/spi/include/kernel/cnk -I/bgsys/drivers/ppcfloor

$ runjob --block R00-M1-N10 --np 2 : kernel_rankstocoords.gcc 
before .. Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {0}) = 1
before .. Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {0}) = 1
after ... Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {2}) = 0
after ... Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {2}) = 0

Any ideas?  Is it something I'm doing wrong, or should I post this to the 
llvm-bgq mailing list?

Michael Blocksome
Blue Gene Messaging
blocksom at us.ibm.com
