[Llvm-bgq-discuss] clang code gen error -- Kernel_RanksToCoords

Michael Blocksome blocksom at us.ibm.com
Fri Jul 26 13:25:28 CDT 2013


Hal,

Tom and I have discovered what appears to be a code gen problem when 
compiling the Kernel_RanksToCoords() CNK syscall. See Tom's analysis 
below.

For reference when reading the trace: GPR4 receives the second parameter, 
GPR5 receives the third parameter, and GPR3 holds the return value.

---

BTW, I'm using the latest "bgclang" wrapper script, but I've renamed it to 
powerpc64-bgq-linux-clang so it plays nice with autoconf.


Michael Blocksome
Blue Gene Messaging
blocksom at us.ibm.com

----- Forwarded by Michael Blocksome/Rochester/IBM on 07/26/2013 01:20 PM -----

From:   Thomas Gooding/Rochester/IBM
To:     Michael Blocksome/Rochester/IBM at IBMUS, 
Date:   07/26/2013 11:20 AM
Subject:        Re: clang


Looks like the addresses are getting truncated to 32 bits.

after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14

{4}.16.1: TB=0000000773c39a5c FL_SYSCALLAT:0  Syscall 1055 at IP=0x0000000001001314    LR=0x00000000010012bc    SP=0x0000001dbfffb9a0 (RANKS2COORDS)
{4}.16.1: TB=0000000773c39acc FL_SYSCALLEN:0  Syscall Entry GPR3=0x0000000000000008  GPR4=0x00000000bfffba18  GPR5=0x00000000bfffba10 GPR6=0x0000001dbfffba50
{4}.16.1: TB=0000000773c3a1a4 FL_SYSCALLRT:0  Syscall Return GPR3=0x000000000000000e


 10012c0:       38 60 04 1f     li      r3,1055
 10012c4:       38 9f 00 a8     addi    r4,r31,168
 10012c8:       38 df 00 b0     addi    r6,r31,176
 10012cc:       38 ff 00 b8     addi    r7,r31,184
 10012d0:       39 1f 00 c0     addi    r8,r31,192
 10012d4:       e8 bf 00 80     ld      r5,128(r31)     <---- quite a few load/stores...  seems very wasteful.
 10012d8:       fb df 00 d0     std     r30,208(r31)
 10012dc:       fb bf 00 c8     std     r29,200(r31)
 10012e0:       e9 3f 00 d0     ld      r9,208(r31)
 10012e4:       e9 5f 00 c8     ld      r10,200(r31)
 10012e8:       f8 bf 00 d8     std     r5,216(r31)
 10012ec:       e8 bf 00 d8     ld      r5,216(r31)
 10012f0:       f9 3f 00 b0     std     r9,176(r31)
 10012f4:       f9 5f 00 a8     std     r10,168(r31)
 10012f8:       f8 bf 00 b8     std     r5,184(r31)
 10012fc:       f8 7f 00 c0     std     r3,192(r31)
 1001300:       80 a4 00 04     lwz     r5,4(r4)       <---- these are 32-bit loads, should be 64-bit.
 1001304:       80 86 00 04     lwz     r4,4(r6)
 1001308:       80 67 00 04     lwz     r3,4(r7)
 100130c:       80 08 00 04     lwz     r0,4(r8)
 1001310:       44 00 00 02     sc


The inline assembly piece is:
#define CNK_SPI_SYSCALL_3(name, arg0, arg1, arg2) \
({ \
  register uint64_t r0 __asm__ ("r0") = (__NR_ ## name); \
  register uint64_t r3 __asm__ ("r3") = ((uint64_t) (arg0)); \
  register uint64_t r4 __asm__ ("r4") = ((uint64_t) (arg1)); \
  register uint64_t r5 __asm__ ("r5") = ((uint64_t) (arg2)); \
  __asm__ __volatile__ \
  ("sc" \
   : "=&r"(r0),"=&r"(r3),"=&r"(r4),"=&r"(r5) \
   :   "0"(r0),  "1"(r3),  "2"(r4),  "3"(r5) \
   : "r6","r7","r8","r9","r10","r11","r12","cr0","memory"); \
  r3; \
})

I don't see where they would be interpreted as 32 bits.  In the 
input/output sections, "r" is a general-purpose register constraint in GCC 
inline assembly, which should be 64-bit for a 64-bit compile. I presume 
LLVM is following GCC inline assembly semantics. 

The other comment was that Hal seems to be directing people to his 
"bgclang" wrapper script.  I'm not sure it will make a difference, but 
could be something to try. 

Tom

Tom Gooding
Senior Engineer / Blue Gene Kernels
507-253-0747  (internal:  553-0747)




From:
Michael Blocksome/Rochester/IBM
To:
Thomas Gooding/Rochester/IBM at IBMUS, 
Date:
07/26/2013 08:56 AM
Subject:
clang


Tom,

I wrote a very simple test and compiled with the latest version of 
llvm/clang from Hal. 

$ cat kernel_rankstocoords.c 

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

#include "kernel/location.h"


int main (int argc, char * argv[])
{
  uint32_t rc = 1;
  size_t mapsize = 2*sizeof(BG_CoordinateMapping_t);
  BG_CoordinateMapping_t map[2];
  uint64_t n = 0;

  fprintf (stdout, "before .. Kernel_RanksToCoords (%zu, %p, %p {%lu}) = %d\n", mapsize, map, &n, n, rc);
  rc = Kernel_RanksToCoords (mapsize, map, &n);
  fprintf (stdout, "after ... Kernel_RanksToCoords (%zu, %p, %p {%lu}) = %d\n", mapsize, map, &n, n, rc);

  return 0;
}

$ /bghome/blocksom/development/c++11/install/powerpc64-bgq-linux-clang \
    kernel_rankstocoords.c -o kernel_rankstocoords.clang \
    -I/bgsys/drivers/ppcfloor/spi/include \
    -I/bgsys/drivers/ppcfloor/spi/include/kernel/cnk -I/bgsys/drivers/ppcfloor


When I run this I get the same error as when I ran pami compiled with 
clang:

$ runjob --block R00-M1-N10 --np 2 : kernel_rankstocoords.clang 
before .. Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 1
before .. Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 1
after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14

.. but it runs fine when compiled with gcc..

$ /bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gcc \
    kernel_rankstocoords.c -o kernel_rankstocoords.gcc \
    -I/bgsys/drivers/ppcfloor/spi/include \
    -I/bgsys/drivers/ppcfloor/spi/include/kernel/cnk -I/bgsys/drivers/ppcfloor

$ runjob --block R00-M1-N10 --np 2 : kernel_rankstocoords.gcc 
before .. Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {0}) = 1
before .. Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {0}) = 1
after ... Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {2}) = 0
after ... Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {2}) = 0


Any ideas?  Is it something I'm doing wrong, or should I post this to the 
llvm-bgq mailing list?


Michael Blocksome
Blue Gene Messaging
blocksom at us.ibm.com

