These functions provide convenient access to the floating-point instructions with special rounding modes.

Enumerations
enum	sp::RoundingMode { sp::NEAREST_EVEN , sp::ZERO , sp::NEG_INF , sp::POS_INF }
	Identifies a rounding mode.

Functions
template<RoundingMode RM, bool FTZ = true>
__device__ float	sp::add (float x, float y)
	FP add with configurable rounding mode and denormal handling.

template<RoundingMode RM, bool FTZ = true>
__device__ double	sp::add (double x, double y)
	FP add with configurable rounding mode and denormal handling.

template<RoundingMode RM, bool FTZ = true, typename T >
__device__ T	sp::sub (T x, T y)
	FP subtract with configurable rounding mode and denormal handling.

template<RoundingMode RM, bool FTZ = true>
__device__ float	sp::mul (float x, float y)
	FP multiply with configurable rounding mode and denorm handling.

template<bool FTZ = true>
__device__ float	sp::divApprox (float x, float y)
	Approximate FP divide, computed as `d = x * (1/y)`.

template<RoundingMode RM, bool FTZ = true>
__device__ float	sp::fma (float x, float y, float z)
	FP multiply-add with configurable rounding mode and denorm handling.

template<RoundingMode RM, bool FTZ = true>
__device__ double	sp::fma (double x, double y, double z)
	FP multiply-add with configurable rounding mode and denorm handling.

template<RoundingMode RM, bool FTZ = true>
__device__ double	sp::mul (double x, double y)
	FP multiply with configurable rounding mode and denorm handling.

template<RoundingMode RM, bool FTZ = true>
__device__ float	sp::approxRcp (float x)
	FP approximate reciprocal with configurable rounding mode and denorm handling.

template<bool FTZ = true>
__device__ float	sp::approxSqrt (float x)

template<bool FTZ = true>
__device__ float	sp::approxRSqrt (float x)

template<bool FTZ = true>
__device__ float	sp::approxExp2 (float x)
	Compute a fast approximation to exp2().

template<bool FTZ = true>
__device__ float	sp::approxLog2 (float x)
	Compute a fast approximation to log2().

template<bool FTZ = true>
__device__ float	sp::approxSin (float x)
	Compute a fast approximation to sin().

template<bool FTZ = true>
__device__ float	sp::approxCos (float x)
	Compute a fast approximation to cos().

Variables
constexpr bool	sp::FTZ = true
	Allows you to write e.g. `sp::approxSqrt<FTZ>(x);`

constexpr bool	sp::NoFTZ = false

Detailed Description

These functions provide convenient access to the floating-point instructions with special rounding modes.

While these operations are somewhat clunkier to use than the regular FP operators, they can occasionally be helpful when writing particularly accuracy-sensitive code.

Typically, there is no performance penalty for using these functions over their ordinary counterparts. You just get a differently rounded result.

Enumeration Type Documentation

◆ RoundingMode

enum sp::RoundingMode

Identifies a rounding mode.

Enumerator
NEAREST_EVEN	Round to nearest, with ties to even (default).
ZERO	Round towards zero.
NEG_INF	Round towards negative infinity.
POS_INF	Round towards positive infinity.

Function Documentation

◆ add() [1/2]

template<RoundingMode RM, bool FTZ = true>

__device__ double sp::add	(	double	x,
		double	y
	)

FP add with configurable rounding mode and denormal handling.

Template Parameters

RM	Rounding mode to use.
FTZ	If true, denormal inputs may be flushed to zero.

◆ add() [2/2]

template<RoundingMode RM, bool FTZ = true>

__device__ float sp::add	(	float	x,
		float	y
	)

FP add with configurable rounding mode and denormal handling.

TODO: Saturation is supported by this instruction.

Template Parameters

RM	Rounding mode to use.
FTZ	If true, denormal inputs may be flushed to zero.

◆ approxCos()

template<bool FTZ = true>

__device__ float sp::approxCos ( float x )

Compute a fast approximation to cos().

Template Parameters

FTZ	If true, denormal inputs may be flushed to sign-preserving zero.
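As an illustration of what "flushed to sign-preserving zero" means for an input, the following host-side sketch emulates the effect in plain C++. `flushDenormalInput` is a hypothetical helper, not part of the library, and it only models the input side; whether and when the device actually flushes is up to the instruction.

```cpp
#include <cmath>

// Hypothetical helper: emulate what flushing a denormal *input* does.
// Subnormal values become zero with the original sign; everything else
// (normal numbers, zeros, infinities, NaNs) passes through unchanged.
inline float flushDenormalInput(float x) {
    if (std::fpclassify(x) == FP_SUBNORMAL) {
        return std::copysign(0.0f, x);
    }
    return x;
}
```

Under this model a negative denormal becomes `-0.0f` rather than `+0.0f`, which matters for operations that are sensitive to the sign of zero.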

◆ approxExp2()

template<bool FTZ = true>

__device__ float sp::approxExp2 ( float x )

Compute a fast approximation to exp2().

Template Parameters

FTZ	If true, denormal inputs may be flushed to sign-preserving zero.

◆ approxLog2()

template<bool FTZ = true>

__device__ float sp::approxLog2 ( float x )

Compute a fast approximation to log2().

Template Parameters

FTZ	If true, denormal inputs may be flushed to sign-preserving zero.

◆ approxRcp()

template<RoundingMode RM, bool FTZ = true>

__device__ float sp::approxRcp ( float x )

FP approximate reciprocal with configurable rounding mode and denorm handling.

Template Parameters

RM	Rounding mode to use.
FTZ	If true, denormal inputs may be flushed to zero.

◆ approxSin()

template<bool FTZ = true>

__device__ float sp::approxSin ( float x )

Compute a fast approximation to sin().

Template Parameters

FTZ	If true, denormal inputs may be flushed to sign-preserving zero.

◆ divApprox()

template<bool FTZ = true>

__device__ float sp::divApprox	(	float	x,
		float	y
	)

Approximate FP divide, computed as d = x * (1/y).

For divisors with magnitude between $$2^{-126}$$ and $$2^{126}$$, the maximum ULP error is 2.

◆ fma() [1/2]

template<RoundingMode RM, bool FTZ = true>

__device__ double sp::fma	(	double	x,
		double	y,
		double	z
	)

FP multiply-add with configurable rounding mode and denorm handling.

TODO: Saturation is supported by this instruction.

Computes (x * y) + z with no intermediate rounding.

Template Parameters

RM	Rounding mode to use for the final result.
FTZ	If true, denormal inputs may be flushed to zero.
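One consequence of "no intermediate rounding" is that a fused multiply-add can recover the rounding error of an ordinary product. The sketch below shows this on the host with the standard `std::fma`, which has the same single-rounding semantics; `productRoundingError` is a hypothetical name, and nothing here calls the `sp::` functions.

```cpp
#include <cmath>

// Rounding error of a double product, recovered with a fused multiply-add.
// Returns x*y - round(x*y), which is exact barring overflow or underflow.
inline double productRoundingError(double x, double y) {
    // Ordinary product, rounded to double. The volatile keeps the compiler
    // from fusing this multiply into the operation below.
    volatile double rounded = x * y;
    // Exact x*y minus the rounded product, with a single final rounding.
    return std::fma(x, y, -rounded);
}
```

For example, with x = 1 + 2^-27 the exact square is 1 + 2^-26 + 2^-54. The ordinary product drops the last term, and the fused operation returns it as the residual.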

◆ fma() [2/2]

template<RoundingMode RM, bool FTZ = true>

__device__ float sp::fma	(	float	x,
		float	y,
		float	z
	)

FP multiply-add with configurable rounding mode and denorm handling.

TODO: Saturation is supported by this instruction.

Computes (x * y) + z with no intermediate rounding.

Template Parameters

RM	Rounding mode to use for the final result.
FTZ	If true, denormal inputs may be flushed to zero.

◆ mul() [1/2]

template<RoundingMode RM, bool FTZ = true>

__device__ double sp::mul	(	double	x,
		double	y
	)

FP multiply with configurable rounding mode and denorm handling.

Template Parameters

RM	Rounding mode to use.
FTZ	If true, denormal inputs may be flushed to zero.

◆ mul() [2/2]

template<RoundingMode RM, bool FTZ = true>

__device__ float sp::mul	(	float	x,
		float	y
	)

FP multiply with configurable rounding mode and denorm handling.

TODO: Saturation is supported by this instruction.

Template Parameters

RM	Rounding mode to use.
FTZ	If true, denormal inputs may be flushed to zero.

◆ sub()

template<RoundingMode RM, bool FTZ = true, typename T >

__device__ T sp::sub	(	T	x,
		T	y
	)

FP subtract with configurable rounding mode and denormal handling.

TODO: Saturation is supported by this instruction.

Template Parameters

RM	Rounding mode to use.
FTZ	If true, denormal inputs may be flushed to zero.

Variable Documentation

◆ FTZ

constexpr bool sp::FTZ = true

Allows you to write e.g. sp::approxSqrt<FTZ>(x);
