[Llvm-bgq-discuss] rint() with -ffast-math

Wed Jun 19 12:22:38 CDT 2013

On Wed, Jun 19, 2013 at 1:15 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
> >
> > Actually, the Intel AVX instructions have a similar issue: rint() has
> > a fast instruction, but round() does not. On this architecture,
> > round() still does the "right thing", even with -ffast-math, both
> > with gcc and clang.
> >
> >
> > Actually, as I now see, gcc generates a short sequence of
> > instructions (five or so) to implement round() properly, whereas
> > clang calls _round.
>
> LLVM only seems to have special handling of:
> FCEIL, FTRUNC, FRINT, FNEARBYINT, FFLOOR
> (there is no FROUND, so round() will always give you the library
> call).
>
> >
> >
> > Given this, the behaviour on BGQ is indeed special. I would expect
> > clang to behave consistently -- to either apply this optimization
> > across the board, or nowhere. Do you want to raise the issue on the
> > llvm mailing list?
>
> Unfortunately, this optimization (like many low-level fast-math
> optimizations) is target-specific. As a result, I'm not sure that you'll
> even really get the cross-platform consistency that you'd like. That having
> been said, if this change is too strong, then we should back it out.
>

Yes, it's target-specific. Nevertheless, whether rint's tie-breaking can be
influenced by __FAST_MATH__ should be a consensus decision. Either BGQ is
over-zealous, or Intel is missing a possible optimisation, or LLVM makes
different speed/accuracy trade-offs on different architectures. The last
of these would be bad for users.

> Separately, Clang/LLVM should support backend specialization of round() --
> and as you imply, the frin instruction does seem to implement the semantics
> of round() more closely than those of rint() (or nearbyint()) [is that
> right?].
>

Yes, frin seems to implement round.

> I know that I discussed this at length with the IBM LLVM contributors, and
> I don't recall if I discussed this on the LLVM list, but we could certainly
> ask for opinions from a wider audience.
>
> Thanks again,
> Hal
>
> >
> >
> >
> > -erik
> >
> >
> >
> >
> >
> > On Wed, Jun 19, 2013 at 11:52 AM, Jeff Hammond <
> > jhammond at alcf.anl.gov > wrote:
> >
> >
> > From
> > http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Optimize-Options.html :
> >
> > ======================================================================
> > -ffast-math
> >
> > Sets -fno-math-errno, -funsafe-math-optimizations,
> > -fno-trapping-math,
> > -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans and
> > -fcx-limited-range.
> >
> > This option causes the preprocessor macro __FAST_MATH__ to be
> > defined.
> >
> > This option should never be turned on by any -O option since it can
> > result in incorrect output for programs which depend on an exact
> > implementation of IEEE or ISO rules/specifications for math
> > functions.
> >
> > ======================================================================
> >
> > Based upon this, I would not count on accurate results when fast-math
> > is defined unless explicitly verified.
> >
> > Assuming LLVM behaves like GCC and defines __FAST_MATH__, you could
> > also do this:
> >
> > #ifdef __FAST_MATH__
> > slower_rounding_function_that_is_always_correct(stuff);
> > #else
> > rint(stuff);
> > #endif
> >
> > Jeff
> >
> >
> >
> > On Wed, Jun 19, 2013 at 10:48 AM, Erik Schnetter <
> > schnetter at cct.lsu.edu > wrote:
> > > On Wed, Jun 19, 2013 at 11:12 AM, Hal Finkel < hfinkel at anl.gov >
> > > wrote:
> > >>
> > >> ----- Original Message -----
> > >> >
> > >> >
> > >> >
> > >> > The function rint() is supposed to round to the nearest integer,
> > >> > breaking ties to even. With -ffast-math, it breaks ties away from
> > >> > zero. That is, in corner cases the result is incorrectly rounded.
> > >> >
> > >> >
> > >> > Is this intended? This (BGQ with Clang) is the first system that
> > >> > does so. (I understand why one would do this given the machine
> > >> > instructions available.)
> > >>
> > >> Yes, this is the intended behavior (and LLVM will currently do this
> > >> on all PPC systems). It is a function of the (odd) way in which the
> > >> PPC frin instruction is defined. The upside is that it is much
> > >> faster than the libc function call. That having been said, I put
> > >> this optimization in, and I can take it out again ;) [or make it
> > >> require some other flag]. Is the behavior too different for you?
> > >
> > >
> > > > I can live with this optimization, I just want to know where
> > > > -ffast-math has its boundaries...
> > >
> > > -erik
> > >
> > > --
> > > Erik Schnetter < schnetter at cct.lsu.edu >
> > > http://www.perimeterinstitute.ca/personal/eschnetter/
> > >
> >
> >
> > > _______________________________________________
> > > llvm-bgq-discuss mailing list
> > > llvm-bgq-discuss at lists.alcf.anl.gov
> > > https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
> > >
> >
> >
> >
> > --
> > Jeff Hammond
> > Argonne Leadership Computing Facility
> > University of Chicago Computation Institute
> > jhammond at alcf.anl.gov / (630) 252-5381
> > http://www.linkedin.com/in/jeffhammond
> > https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
> > ALCF docs: http://www.alcf.anl.gov/user-guides
> >
> >
> >
> >
> > --
> > Erik Schnetter < schnetter at cct.lsu.edu >
> > http://www.perimeterinstitute.ca/personal/eschnetter/
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>

-- 
Erik Schnetter <schnetter at cct.lsu.edu>
http://www.perimeterinstitute.ca/personal/eschnetter/