[Llvm-bgq-discuss] clang code gen error -- Kernel_RanksToCoords

Hal Finkel hfinkel at anl.gov
Thu Aug 15 13:49:11 CDT 2013


Michael,

You're right... I had the wrong patches in the archive. I've now replaced the archive on the web site with one with the right files.

Thanks again,
Hal

----- Original Message -----
> Hal,
> 
> I noticed that your latest code drop (r188410-20130814) contains older
> patches for clang/llvm/compiler-rt. Is this intentional?
> 
> $ tar tzf r188410-20130814-files.tar.gz | grep patch
> r188410-20130814-files/clang-bgq-r186563-20130718.patch
> r188410-20130814-files/llvm-bgq-r186563-20130718.patch
> r188410-20130814-files/compiler-rt-bgq-r186563-20130718.patch
> 
> I did a rebuild and found that the CNK syscall fix (described below)
> wasn't there. Is this due to the "old" patches?
> 
> 
> Michael Blocksome
> Blue Gene Messaging
> blocksom at us.ibm.com
> 
> 
> 
> 
> From: Hal Finkel <hfinkel at anl.gov>
> To: Michael Blocksome/Rochester/IBM at IBMUS,
> Cc: llvm-bgq-discuss at lists.alcf.anl.gov
> Date: 08/03/2013 07:44 AM
> Subject: Re: [Llvm-bgq-discuss] clang code gen error -- Kernel_RanksToCoords
> 
> 
> 
> 
> ----- Original Message -----
> > 
> > Hal,
> > 
> > Tom and I have discovered what appears to be a code gen problem when
> > compiling the Kernel_RanksToCoords() CNK syscall. See Tom's analysis
> > below.
> 
> Thanks! (And thank Tom.) Because of the way that LLVM internally
> names its registers, this needed some additional special handling in
> the PPC backend. I've fixed this upstream in r187693, so the fix will
> be included in the next rebase; the change should also apply
> trivially to the current patchset version.
> 
> > 
> > GPR4 receives the second parameter, GPR5 receives the third
> > parameter, and GPR3 is the return value.
> > 
> > ---
> > 
> > BTW, I'm using the latest "bgclang" wrapper script, but I've renamed
> > it to powerpc64-bgq-linux-clang so it plays nice with autoconf.
> 
> This is a good idea, thanks! I've added these symlinks here as well.
> 
> -Hal
> 
> > 
> > 
> > Michael Blocksome
> > Blue Gene Messaging
> > blocksom at us.ibm.com
> > 
> > ----- Forwarded by Michael Blocksome/Rochester/IBM on 07/26/2013 01:20 PM -----
> > 
> > From: Thomas Gooding/Rochester/IBM
> > To: Michael Blocksome/Rochester/IBM at IBMUS,
> > Date: 07/26/2013 11:20 AM
> > Subject: Re: clang
> > 
> > 
> > 
> > Looks like the addresses are getting truncated to 32 bits.
> > 
> > after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
> > 
> > {4}.16.1: TB=0000000773c39a5c FL_SYSCALLAT:0 Syscall 1055 at IP=0x0000000001001314 LR=0x00000000010012bc SP=0x0000001dbfffb9a0 (RANKS2COORDS)
> > {4}.16.1: TB=0000000773c39acc FL_SYSCALLEN:0 Syscall Entry GPR3=0x0000000000000008 GPR4=0x00000000bfffba18 GPR5=0x00000000bfffba10 GPR6=0x0000001dbfffba50
> > {4}.16.1: TB=0000000773c3a1a4 FL_SYSCALLRT:0 Syscall Return GPR3=0x000000000000000e
> > 
> > 
> > 10012c0: 38 60 04 1f li r3,1055
> > 10012c4: 38 9f 00 a8 addi r4,r31,168
> > 10012c8: 38 df 00 b0 addi r6,r31,176
> > 10012cc: 38 ff 00 b8 addi r7,r31,184
> > 10012d0: 39 1f 00 c0 addi r8,r31,192
> > 10012d4: e8 bf 00 80 ld r5,128(r31) <---- quite a few load/stores... seems very wasteful.
> > 10012d8: fb df 00 d0 std r30,208(r31)
> > 10012dc: fb bf 00 c8 std r29,200(r31)
> > 10012e0: e9 3f 00 d0 ld r9,208(r31)
> > 10012e4: e9 5f 00 c8 ld r10,200(r31)
> > 10012e8: f8 bf 00 d8 std r5,216(r31)
> > 10012ec: e8 bf 00 d8 ld r5,216(r31)
> > 10012f0: f9 3f 00 b0 std r9,176(r31)
> > 10012f4: f9 5f 00 a8 std r10,168(r31)
> > 10012f8: f8 bf 00 b8 std r5,184(r31)
> > 10012fc: f8 7f 00 c0 std r3,192(r31)
> > 1001300: 80 a4 00 04 lwz r5,4(r4) <---- these are 32-bit loads, should be 64-bit.
> > 1001304: 80 86 00 04 lwz r4,4(r6)
> > 1001308: 80 67 00 04 lwz r3,4(r7)
> > 100130c: 80 08 00 04 lwz r0,4(r8)
> > 1001310: 44 00 00 02 sc
> > 
> > 
> > The inline assembly piece is:
> > #define CNK_SPI_SYSCALL_3(name, arg0, arg1, arg2) \
> > ({ \
> >     register uint64_t r0 __asm__ ("r0") = (__NR_ ## name); \
> >     register uint64_t r3 __asm__ ("r3") = ((uint64_t) (arg0)); \
> >     register uint64_t r4 __asm__ ("r4") = ((uint64_t) (arg1)); \
> >     register uint64_t r5 __asm__ ("r5") = ((uint64_t) (arg2)); \
> >     __asm__ __volatile__ \
> >         ("sc" \
> >          : "=&r"(r0),"=&r"(r3),"=&r"(r4),"=&r"(r5) \
> >          : "0"(r0), "1"(r3), "2"(r4), "3"(r5) \
> >          : "r6","r7","r8","r9","r10","r11","r12","cr0","memory"); \
> >     r3; \
> > })
> > 
> > I don't see where they would be interpreted as 32 bits. In the
> > input/output section, "r" is a register constraint for gcc inline
> > assembly, which should be 64-bit for a 64-bit compile. I presume LLVM
> > is following gcc assembly semantics.
> > 
> > The other comment was that Hal seems to be directing people to his
> > "bgclang" wrapper script. I'm not sure it will make a difference,
> > but could be something to try.
> > 
> > Tom
> > 
> > Tom Gooding
> > Senior Engineer / Blue Gene Kernels
> > 507-253-0747 (internal: 553-0747)
> > 
> > 
> > 
> > From: Michael Blocksome/Rochester/IBM
> > To: Thomas Gooding/Rochester/IBM at IBMUS,
> > Date: 07/26/2013 08:56 AM
> > Subject: clang
> > 
> > 
> > 
> > Tom,
> > 
> > I wrote a very simple test and compiled with the latest version of
> > llvm/clang from Hal.
> > 
> > $ cat kernel_rankstocoords.c
> > 
> > #include <stdlib.h>
> > #include <stdio.h>
> > #include <stdint.h>
> > 
> > #include "kernel/location.h"
> > 
> > 
> > int main (int argc, char * argv[])
> > {
> >     uint32_t rc = 1;
> >     size_t mapsize = 2*sizeof(BG_CoordinateMapping_t);
> >     BG_CoordinateMapping_t map[2];
> >     uint64_t n = 0;
> > 
> >     fprintf (stdout, "before .. Kernel_RanksToCoords (%zu, %p, %p {%lu}) = %d\n",
> >              mapsize, map, &n, n, rc);
> >     rc = Kernel_RanksToCoords (mapsize, map, &n);
> >     fprintf (stdout, "after ... Kernel_RanksToCoords (%zu, %p, %p {%lu}) = %d\n",
> >              mapsize, map, &n, n, rc);
> > 
> >     return 0;
> > }
> > 
> > $ /bghome/blocksom/development/c++11/install/powerpc64-bgq-linux-clang \
> >     kernel_rankstocoords.c -o kernel_rankstocoords.clang \
> >     -I/bgsys/drivers/ppcfloor/spi/include \
> >     -I/bgsys/drivers/ppcfloor/spi/include/kernel/cnk \
> >     -I/bgsys/drivers/ppcfloor
> > 
> > 
> > When I run this I get the same error as when I ran PAMI compiled
> > with clang:
> > 
> > $ runjob --block R00-M1-N10 --np 2 : kernel_rankstocoords.clang
> > before .. Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 1
> > before .. Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 1
> > after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
> > after ... Kernel_RanksToCoords (8, 0x1dbfffba18, 0x1dbfffba10 {0}) = 14
> > 
> > ...but it runs fine when compiled with gcc:
> > 
> > $ /bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gcc \
> >     kernel_rankstocoords.c -o kernel_rankstocoords.gcc \
> >     -I/bgsys/drivers/ppcfloor/spi/include \
> >     -I/bgsys/drivers/ppcfloor/spi/include/kernel/cnk \
> >     -I/bgsys/drivers/ppcfloor
> > 
> > $ runjob --block R00-M1-N10 --np 2 : kernel_rankstocoords.gcc
> > before .. Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {0}) = 1
> > before .. Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {0}) = 1
> > after ... Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {2}) = 0
> > after ... Kernel_RanksToCoords (8, 0x1dbfffba54, 0x1dbfffba60 {2}) = 0
> > 
> > 
> > Any ideas? Is it something I'm doing wrong, or should I post this to
> > the llvm-bgq mailing list?
> > 
> > 
> > Michael Blocksome
> > Blue Gene Messaging
> > blocksom at us.ibm.com
> > 
> > 
> > 
> > _______________________________________________
> > llvm-bgq-discuss mailing list
> > llvm-bgq-discuss at lists.alcf.anl.gov
> > https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

