[Llvm-bgq-discuss] clang code gen error -- Kernel_RanksToCoords
Michael Blocksome
blocksom at us.ibm.com
Fri Jul 26 13:25:28 CDT 2013
Hal,
Tom and I have discovered what appears to be a code gen problem when
compiling the Kernel_RanksToCoords() CNK syscall. See Tom's analysis
below.
GPR4 receives the second parameter, GPR5 receives the third parameter, and
GPR3 is the return value.
---
BTW, I'm using the latest "bgclang" wrapper script, but I've renamed it to
powerpc64-bgq-linux-clang so it plays nice with autoconf.
Michael Blocksome
Blue Gene Messaging
blocksom at us.ibm.com
----- Forwarded by Michael Blocksome/Rochester/IBM on 07/26/2013 01:20 PM -----
From: Thomas Gooding/Rochester/IBM
To: Michael Blocksome/Rochester/IBM at IBMUS,
Date: 07/26/2013 11:20 AM
Subject: Re: clang
Looks like the addresses are getting truncated to 32 bits.
after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
{4}.16.1: TB=0000000773c39a5c FL_SYSCALLAT:0 Syscall 1055 at IP=0x0000000001001314 LR=0x00000000010012bc SP=0x0000001dbfffb9a0 (RANKS2COORDS)
{4}.16.1: TB=0000000773c39acc FL_SYSCALLEN:0 Syscall Entry GPR3=0x0000000000000008 GPR4=0x00000000bfffba18 GPR5=0x00000000bfffba10 GPR6=0x0000001dbfffba50
{4}.16.1: TB=0000000773c3a1a4 FL_SYSCALLRT:0 Syscall Return GPR3=0x000000000000000e
10012c0: 38 60 04 1f li r3,1055
10012c4: 38 9f 00 a8 addi r4,r31,168
10012c8: 38 df 00 b0 addi r6,r31,176
10012cc: 38 ff 00 b8 addi r7,r31,184
10012d0: 39 1f 00 c0 addi r8,r31,192
10012d4: e8 bf 00 80 ld r5,128(r31) <---- quite a few load/stores... seems very wasteful.
10012d8: fb df 00 d0 std r30,208(r31)
10012dc: fb bf 00 c8 std r29,200(r31)
10012e0: e9 3f 00 d0 ld r9,208(r31)
10012e4: e9 5f 00 c8 ld r10,200(r31)
10012e8: f8 bf 00 d8 std r5,216(r31)
10012ec: e8 bf 00 d8 ld r5,216(r31)
10012f0: f9 3f 00 b0 std r9,176(r31)
10012f4: f9 5f 00 a8 std r10,168(r31)
10012f8: f8 bf 00 b8 std r5,184(r31)
10012fc: f8 7f 00 c0 std r3,192(r31)
1001300: 80 a4 00 04 lwz r5,4(r4) <---- these are 32-bit loads, should be 64-bit.
1001304: 80 86 00 04 lwz r4,4(r6)
1001308: 80 67 00 04 lwz r3,4(r7)
100130c: 80 08 00 04 lwz r0,4(r8)
1001310: 44 00 00 02 sc
The inline assembly piece is:
#define CNK_SPI_SYSCALL_3(name, arg0, arg1, arg2) \
({ \
register uint64_t r0 __asm__ ("r0") = (__NR_ ## name); \
register uint64_t r3 __asm__ ("r3") = ((uint64_t) (arg0)); \
register uint64_t r4 __asm__ ("r4") = ((uint64_t) (arg1)); \
register uint64_t r5 __asm__ ("r5") = ((uint64_t) (arg2)); \
__asm__ __volatile__ \
("sc" \
: "=&r"(r0),"=&r"(r3),"=&r"(r4),"=&r"(r5) \
: "0"(r0), "1"(r3), "2"(r4), "3"(r5) \
: "r6","r7","r8","r9","r10","r11","r12","cr0","memory"); \
r3; \
})
I don't see where they would be interpreted as 32 bits. In the
inputs/outputs section, "r" is a register constraint in gcc inline
assembly, which should be 64-bit for a 64-bit compile. I presume LLVM is
following gcc inline assembly semantics.
The other comment was that Hal seems to be directing people to his
"bgclang" wrapper script. I'm not sure it will make a difference, but
could be something to try.
Tom
Tom Gooding
Senior Engineer / Blue Gene Kernels
507-253-0747 (internal: 553-0747)
From: Michael Blocksome/Rochester/IBM
To: Thomas Gooding/Rochester/IBM at IBMUS,
Date: 07/26/2013 08:56 AM
Subject: clang
Tom,
I wrote a very simple test and compiled with the latest version of
llvm/clang from Hal.
$ cat kernel_rankstocoords.c
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include "kernel/location.h"
int main (int argc, char * argv[])
{
uint32_t rc = 1;
size_t mapsize = 2*sizeof(BG_CoordinateMapping_t);
BG_CoordinateMapping_t map[2];
uint64_t n = 0;
fprintf (stdout, "before .. Kernel_RanksToCoords (%zu, %p, %p {%lu}) = %d\n", mapsize, map, &n, n, rc);
rc = Kernel_RanksToCoords (mapsize, map, &n);
fprintf (stdout, "after ... Kernel_RanksToCoords (%zu, %p, %p {%lu}) = %d\n", mapsize, map, &n, n, rc);
return 0;
}
$ /bghome/blocksom/development/c++11/install/powerpc64-bgq-linux-clang \
    kernel_rankstocoords.c -o kernel_rankstocoords.clang \
    -I/bgsys/drivers/ppcfloor/spi/include \
    -I/bgsys/drivers/ppcfloor/spi/include/kernel/cnk -I/bgsys/drivers/ppcfloor
When I run this I get the same error as when I ran pami compiled with
clang:
$ runjob --block R00-M1-N10 --np 2 : kernel_rankstocoords.clang
before .. Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 1
before .. Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 1
after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
... but it runs fine when compiled with gcc:
$ /bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gcc \
    kernel_rankstocoords.c -o kernel_rankstocoords.gcc \
    -I/bgsys/drivers/ppcfloor/spi/include \
    -I/bgsys/drivers/ppcfloor/spi/include/kernel/cnk -I/bgsys/drivers/ppcfloor
$ runjob --block R00-M1-N10 --np 2 : kernel_rankstocoords.gcc
before .. Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {0}) = 1
before .. Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {0}) = 1
after ... Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {2}) = 0
after ... Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {2}) = 0
Any ideas? Is it something I'm doing wrong, or should I post this to the
llvm-bgq mailing list?
Michael Blocksome
Blue Gene Messaging
blocksom at us.ibm.com