[Llvm-bgq-discuss] rint() with -ffast-math

Wed Jun 19 12:24:58 CDT 2013

----- Original Message -----
> 
> On Wed, Jun 19, 2013 at 1:15 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> 
> ----- Original Message -----
> > 
> > Actually, the Intel AVX instructions have a similar issue: rint() has
> > a fast instruction, but round() does not. On this architecture,
> > round() still does the "right thing", even with -ffast-math, both
> > with gcc and clang.
> > 
> > 
> > Actually, as I now see, gcc generates a short sequence of
> > instructions (five or so) to implement round() properly, whereas
> > clang calls _round.
> 
> LLVM seems to have special handling only for:
> FCEIL, FTRUNC, FRINT, FNEARBYINT, FFLOOR
> (there is no FROUND, so round() will always give you the library
> call).
> 
> 
> > 
> > 
> > Given this, the behaviour on BGQ is indeed special. I would expect
> > clang to behave consistently -- to either apply this optimization
> > across the board, or nowhere. Do you want to raise the issue on the
> > llvm mailing list?
> 
> Unfortunately, this optimization (like many low-level fast-math
> optimizations) is target-specific. As a result, I'm not sure that
> you'll even really get the cross-platform consistency that you'd
> like. That having been said, if this change is too strong, then we
> should back it out.
> 
> 
> 
> Yes, it's target specific. Nevertheless, whether rint's tie-breaking
> can be influenced by __FAST_MATH__ should be a consensus decision.
> Either BGQ is over-zealous, or Intel is missing a possible
> optimisation, or llvm makes different speed/accuracy trade-offs on
> different architectures. And the latter would be bad for users.

Agreed. Would you like to write to the list or should I?

 -Hal

> 
> 
> 
> Separately, Clang/LLVM should support backend specialization of
> round() -- and as you imply, the frin instruction does seem to
> implement the semantics of round() more than it implements that of
> rint() (or nearbyint()) [is that right?].
> 
> 
> 
> Yes, frin seems to implement round.
> 
> 
> 
> I know that I discussed this at length with the IBM LLVM
> contributors, and I don't recall if I discussed this on the LLVM
> list, but we could certainly ask for opinions from a wider audience.
> 
> Thanks again,
> Hal
> 
> 
> 
> > 
> > 
> > 
> > -erik
> > 
> > 
> > 
> > 
> > 
> > On Wed, Jun 19, 2013 at 11:52 AM, Jeff Hammond <
> > jhammond at alcf.anl.gov > wrote:
> > 
> > 
> > From
> > http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Optimize-Options.html :
> > ===========================================================================
> > -ffast-math
> > 
> > Sets -fno-math-errno, -funsafe-math-optimizations,
> > -fno-trapping-math, -ffinite-math-only, -fno-rounding-math,
> > -fno-signaling-nans and -fcx-limited-range.
> > 
> > This option causes the preprocessor macro __FAST_MATH__ to be defined.
> > 
> > This option should never be turned on by any -O option since it can
> > result in incorrect output for programs which depend on an exact
> > implementation of IEEE or ISO rules/specifications for math functions.
> > ===========================================================================
> > 
> > Based upon this, I would not count on accurate results when
> > fast-math is enabled unless explicitly verified.
> > 
> > Assuming LLVM behaves like GCC and defines __FAST_MATH__, you could
> > also do this:
> > 
> > #ifdef __FAST_MATH__
> > slower_rounding_function_that_is_always_correct(stuff);
> > #else
> > rint(stuff);
> > #endif
> > 
> > Jeff
> > 
> > 
> > 
> > On Wed, Jun 19, 2013 at 10:48 AM, Erik Schnetter <
> > schnetter at cct.lsu.edu > wrote:
> > > On Wed, Jun 19, 2013 at 11:12 AM, Hal Finkel < hfinkel at anl.gov >
> > > wrote:
> > >> 
> > >> ----- Original Message -----
> > >> > 
> > >> > 
> > >> > 
> > >> > The function rint() is supposed to round to the nearest integer,
> > >> > breaking ties to even. With -ffast-math, it breaks ties away from
> > >> > zero. That is, in corner cases the result is incorrectly rounded.
> > >> > 
> > >> > 
> > >> > Is this intended? This (BGQ with Clang) is the first system that
> > >> > does so. (I understand why one would do this given the machine
> > >> > instructions available.)
> > >> 
> > >> Yes, this is the intended behavior (and LLVM will currently do
> > >> this on all PPC systems). It is a function of the (odd) way in
> > >> which the PPC frin instruction is defined. The upside is that it
> > >> is much faster than the libc function call. That having been said,
> > >> I put this optimization in, and I can take it out again ;) [or
> > >> make it require some other flag]. Is the behavior too different
> > >> for you?
> > > 
> > > 
> > > I can live with this optimization; I just want to know where
> > > -ffast-math has its boundaries...
> > > 
> > > -erik
> > > 
> > > --
> > > Erik Schnetter < schnetter at cct.lsu.edu >
> > > http://www.perimeterinstitute.ca/personal/eschnetter/
> > > 
> > 
> > 
> > 
> > 
> > 
> > --
> > Jeff Hammond
> > Argonne Leadership Computing Facility
> > University of Chicago Computation Institute
> > jhammond at alcf.anl.gov / (630) 252-5381
> > http://www.linkedin.com/in/jeffhammond
> > https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
> > ALCF docs: http://www.alcf.anl.gov/user-guides

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory