[Llvm-bgq-discuss] bgclang r185769-20130706 on vesta/mira

Hal Finkel hfinkel at anl.gov
Sun Jul 14 23:05:05 CDT 2013


----- Original Message -----
> On 2013-07-14, at 0:02 , Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > Erik,
> > 
> > I've updated the installs on vesta/mira to fix this issue as well.
> > First, the underlying problem was that structures that contain a
> > vector4double as a member need to have at least 32-byte alignment on
> > the stack. LLVM had been getting this right for most things, but
> > not when passing the structure *by value* to a callee (which
> > requires a separate copy be made for use by the callee).
> > 
> > Also, I'd like to make the following point (for everyone): Because
> > of the way that the PowerPC ABI is defined, passing an aggregate,
> > no matter how small or simple internally, is never exactly the
> > same as passing raw data types. At best, it costs you some extra
> > stack space. But...
> > 
> > struct wrapper {
> >  vector4double x;
> > };
> > 
> > void foo(struct wrapper x) { ... }
> > void bar(vector4double y) { ... }
> > 
> > When you call bar(y) then y should be passed directly in one of the
> > QPX vector registers. When you call foo(x), then x is passed in a
> > collection of 64-bit general-purpose registers (as are all
> > aggregates), and/or on the stack (depending on how many other
> > function parameters there are and in what order they appear). In
> > C++, it is almost always better to pass by const reference (unless
> > you really do need a copy to modify):
> > 
> > void foo(const wrapper &x) { ... }
> > 
> > which should be just as easy for the compiler to inline, but is
> > much more efficient if you end up with non-inlined calls.
> 
> 
> Hal
> 
> Thanks for the pointer regarding the ABI. I assumed this already
> after looking at the disassembled machine code, but didn't make the
> connection that passing arguments by reference is then more
> efficient on PowerPC systems. Note that the x86_64 ABI explicitly
> specifies that such structs are passed in registers, which means
> that passing arguments by value is more efficient in this case. But
> I should switch the PowerPC template specializations to using
> references.

To be clear, in most cases small objects are passed in registers (although stack space is still allocated for them). In practice, these objects are often also written into that stack space (and LLVM may be missing some optimizations that would avoid this), but that is actually irrelevant when passing floating-point values under the current ABI. This is because structures are always passed in general-purpose registers (GPRs), and never in floating-point registers (FPRs). There are no direct GPR-to-FPR move instructions on PowerPC, and so the only way to transfer data between the GPRs and the FPRs is via the stack.
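
For anyone who wants to see the source-level difference, here is a minimal sketch (not bgclang-specific). I'm assuming a hypothetical stand-in type, vec4d, in place of vector4double so that it builds with any C++11 compiler, and all of the function names are made up for illustration. It shows only that the by-value callee gets its own 32-byte-aligned copy while the by-reference callee sees the caller's object. To see which registers are actually used, you still need to look at the disassembly on the target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for QPX's vector4double: four doubles with 32-byte
// alignment. On BG/Q you would use the real vector4double type instead.
struct alignas(32) vec4d {
    double v[4];
};

struct wrapper {
    vec4d x;
};

// The wrapper inherits the alignment requirement of its member.
static_assert(alignof(wrapper) == 32, "wrapper must be 32-byte aligned");

// By value: the callee operates on a separate copy made for the call.
double sum_by_value(wrapper w) {
    return w.x.v[0] + w.x.v[1] + w.x.v[2] + w.x.v[3];
}

// By const reference: only an address is passed; no copy of the aggregate.
double sum_by_ref(const wrapper &w) {
    return w.x.v[0] + w.x.v[1] + w.x.v[2] + w.x.v[3];
}

// Helpers that report where the callee's view of the argument lives.
std::uintptr_t addr_by_value(wrapper w) {
    return reinterpret_cast<std::uintptr_t>(&w);
}

std::uintptr_t addr_by_ref(const wrapper &w) {
    return reinterpret_cast<std::uintptr_t>(&w);
}

// Small self-check tying the pieces together.
void check_example() {
    wrapper w = {{{1.0, 2.0, 3.0, 4.0}}};
    const std::uintptr_t caller_addr = reinterpret_cast<std::uintptr_t>(&w);

    assert(sum_by_value(w) == sum_by_ref(w));
    assert(addr_by_ref(w) == caller_addr);    // same object as the caller's
    assert(addr_by_value(w) != caller_addr);  // a distinct copy
    assert(addr_by_value(w) % alignof(wrapper) == 0);  // copy is aligned too
}
```

The last assertion is the property that the earlier alignment fix was about: the copy made for a by-value call has to honor the 32-byte alignment just like the original object does.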

As I recall, they've added such GPR <-> FPR move instructions to the ISA for the POWER8 (and I think that they had them on the Cell), but of course none of that helps us here :(

Thanks again,
Hal

> 
> -erik
> 
> --
> Erik Schnetter <schnetter at cct.lsu.edu>
> http://www.perimeterinstitute.ca/personal/eschnetter/
> 
> My email is as private as my paper mail. I therefore support
> encrypting and signing email messages. Get my PGP key from
> http://keys.gnupg.net.
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

