[Llvm-bgq-discuss] clang code gen error -- Kernel_RanksToCoords

Hal Finkel hfinkel at anl.gov
Sat Aug 3 07:44:17 CDT 2013


----- Original Message -----
> 
> Hal,
> 
> Tom and I have discovered what appears to be a code gen problem when
> compiling the Kernel_RanksToCoords() CNK syscall. See Tom's analysis
> below.

Thanks (and please thank Tom)! Because of the way LLVM names its registers internally, this case needed some additional special handling in the PPC backend. I've fixed it upstream in r187693, so the fix will be included in the next rebase; the change should also apply trivially to the current version of the patchset.

> 
> GPR4 receives the second parameter, GPR5 receives the third
> parameter, and GPR3 is the return value.
> 
> ---
> 
> BTW, I'm using the latest "bgclang" wrapper script, but I've renamed
> it to powerpc64-bgq-linux-clang so it plays nice with autoconf.

This is a good idea, thanks! I've added these symlinks here as well.

 -Hal

> 
> 
> Michael Blocksome
> Blue Gene Messaging
> blocksom at us.ibm.com
> 
> ----- Forwarded by Michael Blocksome/Rochester/IBM on 07/26/2013 01:20 PM -----
> 
> From: Thomas Gooding/Rochester/IBM
> To: Michael Blocksome/Rochester/IBM at IBMUS,
> Date: 07/26/2013 11:20 AM
> Subject: Re: clang
> 
> 
> 
> Looks like the addresses are getting truncated to 32 bits.
> 
> after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
> 
> {4}.16.1: TB=0000000773c39a5c FL_SYSCALLAT:0 Syscall 1055 at IP=0x0000000001001314 LR=0x00000000010012bc SP=0x0000001dbfffb9a0 (RANKS2COORDS)
> {4}.16.1: TB=0000000773c39acc FL_SYSCALLEN:0 Syscall Entry GPR3=0x0000000000000008 GPR4=0x00000000bfffba18 GPR5=0x00000000bfffba10 GPR6=0x0000001dbfffba50
> {4}.16.1: TB=0000000773c3a1a4 FL_SYSCALLRT:0 Syscall Return GPR3=0x000000000000000e
> 
> 
> 10012c0: 38 60 04 1f li r3,1055
> 10012c4: 38 9f 00 a8 addi r4,r31,168
> 10012c8: 38 df 00 b0 addi r6,r31,176
> 10012cc: 38 ff 00 b8 addi r7,r31,184
> 10012d0: 39 1f 00 c0 addi r8,r31,192
> 10012d4: e8 bf 00 80 ld r5,128(r31) <---- quite a few load/stores... seems very wasteful.
> 10012d8: fb df 00 d0 std r30,208(r31)
> 10012dc: fb bf 00 c8 std r29,200(r31)
> 10012e0: e9 3f 00 d0 ld r9,208(r31)
> 10012e4: e9 5f 00 c8 ld r10,200(r31)
> 10012e8: f8 bf 00 d8 std r5,216(r31)
> 10012ec: e8 bf 00 d8 ld r5,216(r31)
> 10012f0: f9 3f 00 b0 std r9,176(r31)
> 10012f4: f9 5f 00 a8 std r10,168(r31)
> 10012f8: f8 bf 00 b8 std r5,184(r31)
> 10012fc: f8 7f 00 c0 std r3,192(r31)
> 1001300: 80 a4 00 04 lwz r5,4(r4) <---- these are 32-bit loads, should be 64-bit.
> 1001304: 80 86 00 04 lwz r4,4(r6)
> 1001308: 80 67 00 04 lwz r3,4(r7)
> 100130c: 80 08 00 04 lwz r0,4(r8)
> 1001310: 44 00 00 02 sc
> 
> 
> The inline assembly piece is:
> #define CNK_SPI_SYSCALL_3(name, arg0, arg1, arg2) \
> ({ \
> register uint64_t r0 __asm__ ("r0") = (__NR_ ## name); \
> register uint64_t r3 __asm__ ("r3") = ((uint64_t) (arg0)); \
> register uint64_t r4 __asm__ ("r4") = ((uint64_t) (arg1)); \
> register uint64_t r5 __asm__ ("r5") = ((uint64_t) (arg2)); \
> __asm__ __volatile__ \
> ("sc" \
> : "=&r"(r0),"=&r"(r3),"=&r"(r4),"=&r"(r5) \
> : "0"(r0), "1"(r3), "2"(r4), "3"(r5) \
> : "r6","r7","r8","r9","r10","r11","r12","cr0","memory"); \
> r3; \
> })
> 
> I don't see where they would be interpreted as 32 bits. In the
> input/output constraints, "r" denotes a general-purpose register in
> GCC inline assembly, which should be 64 bits wide in a 64-bit
> compile. I presume LLVM is following GCC's inline assembly semantics.
> 
> The other comment was that Hal seems to be directing people to his
> "bgclang" wrapper script. I'm not sure it will make a difference,
> but could be something to try.
> 
> Tom
> 
> Tom Gooding
> Senior Engineer / Blue Gene Kernels
> 507-253-0747 (internal: 553-0747)
> 
> 
> 
> 	From: 	Michael Blocksome/Rochester/IBM
> 	To: 	Thomas Gooding/Rochester/IBM at IBMUS,
> 	Date: 	07/26/2013 08:56 AM
> 	Subject: 	clang
> 
> 
> 
> Tom,
> 
> I wrote a very simple test and compiled with the latest version of
> llvm/clang from Hal.
> 
> $ cat kernel_rankstocoords.c
> 
> #include <stdlib.h>
> #include <stdio.h>
> #include <stdint.h>
> 
> #include "kernel/location.h"
> 
> 
> int main (int argc, char * argv[])
> {
> uint32_t rc = 1;
> size_t mapsize = 2*sizeof(BG_CoordinateMapping_t);
> BG_CoordinateMapping_t map[2];
> uint64_t n = 0;
> 
> fprintf (stdout, "before .. Kernel_RanksToCoords (%zu, %p, %p {%lu}) = %d\n", mapsize, map, &n, n, rc);
> rc = Kernel_RanksToCoords (mapsize, map, &n);
> fprintf (stdout, "after ... Kernel_RanksToCoords (%zu, %p, %p {%lu}) = %d\n", mapsize, map, &n, n, rc);
> 
> return 0;
> }
> 
> $ /bghome/blocksom/development/c++11/install/powerpc64-bgq-linux-clang \
>     kernel_rankstocoords.c -o kernel_rankstocoords.clang \
>     -I/bgsys/drivers/ppcfloor/spi/include \
>     -I/bgsys/drivers/ppcfloor/spi/include/kernel/cnk \
>     -I/bgsys/drivers/ppcfloor
> 
> 
> When I run this, I get the same error as when I ran PAMI compiled
> with clang:
> 
> $ runjob --block R00-M1-N10 --np 2 : kernel_rankstocoords.clang
> before .. Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 1
> before .. Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 1
> after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
> after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
> 
> ... but it runs fine when compiled with gcc:
> 
> $ /bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gcc \
>     kernel_rankstocoords.c -o kernel_rankstocoords.gcc \
>     -I/bgsys/drivers/ppcfloor/spi/include \
>     -I/bgsys/drivers/ppcfloor/spi/include/kernel/cnk \
>     -I/bgsys/drivers/ppcfloor
> 
> $ runjob --block R00-M1-N10 --np 2 : kernel_rankstocoords.gcc
> before .. Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {0}) = 1
> before .. Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {0}) = 1
> after ... Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {2}) = 0
> after ... Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {2}) = 0
> 
> 
> Any ideas? Is it something I'm doing wrong, or should I post this to
> the llvm-bgq mailing list?
> 
> 
> Michael Blocksome
> Blue Gene Messaging
> blocksom at us.ibm.com
> 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

