[Llvm-bgq-discuss] QPX SLEEF (a SIMD math library)

Hal Finkel hfinkel at anl.gov
Tue Aug 27 19:07:10 CDT 2013


FYI...

In response to a local request, I've created a Fortran interface include file for the QPX SLEEF library (for use with bgxlf). For example, you can compile the following with bgxlf95:

include "qpxmath.include"
vector(real(8)) x, y, z
z = xatan2(x, y)
end

(For those with their own installs, this new file is in the sleef patch in the r189357-20130827 (v2) archive I've put on the trac page: https://trac.alcf.anl.gov/projects/llvm-bgq)

 -Hal

----- Original Message -----
> Hello everyone,
> 
> I've completed an initial port of Naoki Shibata's SLEEF (SIMD Library
> for Evaluating Elementary Functions) to the BG/Q. For our purposes,
> the library is an open-source (public domain) alternative to
> libmass_simd. Most functions are slower than their libmass_simd
> counterparts, but they tend to be more accurate and to handle corner
> cases (infinities, NaNs, etc.) better, and they are still much faster
> than calling the scalar function four times.
> 
> You don't need any special compiler flags to use the library with
> bgclang. To use with bgxlc, add: -I/home/projects/llvm/sleef/include
> -L/home/projects/llvm/sleef/lib -lsleef
> 
> With either compiler, you'll need to: #include <qpxmath.h>
> 
> The following functions are provided:
> vector4double xldexp(vector4double x, const int *q);
> void xilogb(vector4double d, int *l);
> 
> vector4double xsin(vector4double d);
> vector4double xcos(vector4double d);
> void xsincos(vector4double d, vector4double *ds, vector4double *dc);
> vector4double xtan(vector4double d);
> vector4double xasin(vector4double s);
> vector4double xacos(vector4double s);
> vector4double xatan(vector4double s);
> vector4double xatan2(vector4double y, vector4double x);
> vector4double xlog(vector4double d);
> vector4double xexp(vector4double d);
> vector4double xpow(vector4double x, vector4double y);
> 
> vector4double xsinh(vector4double d);
> vector4double xcosh(vector4double d);
> vector4double xtanh(vector4double d);
> vector4double xasinh(vector4double s);
> vector4double xacosh(vector4double s);
> vector4double xatanh(vector4double s);
> 
> vector4double xcbrt(vector4double d);
> 
> vector4double xexp2(vector4double a);
> vector4double xexp10(vector4double a);
> vector4double xexpm1(vector4double a);
> vector4double xlog10(vector4double a);
> vector4double xlog1p(vector4double a);
> 
> vector4double xsin_u1(vector4double d);
> vector4double xcos_u1(vector4double d);
> void xsincos_u1(vector4double d, vector4double *ds, vector4double *dc);
> vector4double xtan_u1(vector4double d);
> vector4double xasin_u1(vector4double s);
> vector4double xacos_u1(vector4double s);
> vector4double xatan_u1(vector4double s);
> vector4double xatan2_u1(vector4double y, vector4double x);
> vector4double xlog_u1(vector4double d);
> vector4double xcbrt_u1(vector4double d);
> 
> plus single-precision versions (named like the double-precision
> variants, but with an 'f' appended to the base name, before any _u1):
> ...
> vector4double xsinf(vector4double d);
> vector4double xcosf(vector4double d);
> ...
> vector4double xsinf_u1(vector4double d);
> vector4double xcosf_u1(vector4double d);
> ...
> 
> I've attached a file (sleef-vs-mass-simd.txt) showing cycle counts
> for all of the SLEEF functions, the corresponding libmass_simd
> functions, and 4 x the cost of the reference libm function. I've
> also attached a file (sleef-max-error.txt) showing the maximum error
> (in ULPs) for all of the SLEEF functions. The error can be compared
> to those for the corresponding libmass_simd functions documented
> here: http://www-01.ibm.com/support/docview.wss?uid=swg27006978
> 
> For convenience, if you define QPXMATH_MASS_SIMD_FUNCTIONS before
> including the qpxmath.h header, aliases will also be defined for
> libmass_simd function names (sind4, etc.). Note, however, that
> libmass_simd provides some functions not (yet) provided here. Also,
> SLEEF provides vectorized ldexp and ilogb functions (which
> libmass_simd does not provide).
> 
> The patches to SLEEF are available in the llvm-bgq archives
> (https://trac.alcf.anl.gov/projects/llvm-bgq/), and the source code
> will compile with bgxlc as well (it is C + bgxlc-style intrinsics).
> The performance when compiling with bgclang is *much* better,
> however, so I recommend that you use bgclang (and, at ALCF, you can
> use my build as detailed above). Regardless, the same libraries can
> be used from either compiler. SLEEF itself is from:
> http://shibatch.sourceforge.net/
> 
>  -Hal
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 


