The base class for the LUT Spec template parameter.
#include <LUT.hpp>
Classes
struct | ThreeValue | Three-value logic integer.
Static Public Member Functions
template<int Dummy = 0> static constexpr uint32_t | getMaxInput () | Get the value of the largest input in the lookup table.
template<int Dummy = 0> static constexpr ThreeValue | getOutput (uint32_t input) | Get the output that corresponds to a given input.
static constexpr bool | assumeInputInRange () | Assume the input is in range.
static constexpr bool | allowBitwiseLUT () | Allow using a large number of bitwise operations to build a LUT.
static constexpr int | maxMemoryLUTBytes () | Allow generating a lookup table in memory in flat or constant address space.
static constexpr int | memoryLUTPackWidth () | The width, in bits, to use for packed memory lookup tables.
static constexpr int | getVectorizationWidth () | Get the width in bits of the integer-vector type to use for vectorized bitwise operations.
This class and its library- and user-defined subclasses serve two functions: 1) They define the lookup table from input to output (
Subclasses of this class should define the functions that they wish to override. Note that the final int Dummy = 0 template parameter exists only to make the static_assert conditional and is not required in overrides. All methods are intended to be overridden unless otherwise specified.
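As a rough illustration of the override pattern described above, the sketch below uses hypothetical stand-ins (BaseSpecSketch, ThreeValueSketch, SquareSpec) rather than the real types from LUT.hpp, whose names and layouts are not shown in this section. It assumes that the Dummy parameter makes the static_assert dependent, so that the assertion only fires if a spec fails to provide its own definition and the base version is actually called.

```cpp
#include <cstdint>

// Hypothetical stand-in for ThreeValue; the real layout is not shown here.
struct ThreeValueSketch {
    uint32_t value;  // assumed: the bit pattern of the output
    uint32_t known;  // assumed: mask of bits that carry a definite value
};

// Hypothetical stand-in for the base class documented on this page.
struct BaseSpecSketch {
    // Must be overridden. The static_assert depends on Dummy, so it is only
    // evaluated if this default is instantiated by an actual call.
    template<int Dummy = 0>
    static constexpr uint32_t getMaxInput() {
        static_assert(Dummy != 0, "Spec must define getMaxInput()");
        return 0;
    }

    // Placeholder default; the library's real default is not stated here.
    static constexpr bool assumeInputInRange() { return false; }
};

// A user-defined spec for a 4-entry table that maps input i to i * i.
// The overrides are plain functions: no Dummy parameter is needed.
struct SquareSpec : BaseSpecSketch {
    static constexpr uint32_t getMaxInput() { return 3; }

    static constexpr ThreeValueSketch getOutput(uint32_t input) {
        return ThreeValueSketch{input * input, 0xFu};
    }
};
```

The definitions in SquareSpec hide the base versions, so SquareSpec::getMaxInput() never instantiates the asserting template, while assumeInputInRange() falls through to the base default.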
static constexpr bool allowBitwiseLUT ()
Allow using a large number of bitwise operations to build a LUT.
static constexpr bool assumeInputInRange ()
Assume the input is in range.
Returning true produces fewer instructions (and potentially lower memory usage), but an out-of-range input then results in completely undefined behaviour. Returning false yields an undefined result for an out-of-range lookup, but behaviour otherwise remains defined.
template<int Dummy = 0> static constexpr uint32_t getMaxInput ()
Get the value of the largest input in the lookup table.
Lookups with a value larger than this (inputs are unsigned, so no negative numbers are possible) result in undefined behaviour if assumeInputInRange() is true, and an undefined return value otherwise.
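The difference between the two out-of-range policies can be pictured with a small sketch. It is only one possible way an implementation might behave, not the library's actual code: kRangeData, kRangeMaxInput and rangeLookup are hypothetical names, and clamping the index is an assumed strategy for keeping the access defined.

```cpp
#include <cstdint>

// Hypothetical 4-entry table; inputs 0..3 are in range.
constexpr uint8_t kRangeData[4] = {10, 20, 30, 40};
constexpr uint32_t kRangeMaxInput = 3;

// AssumeInRange plays the role of assumeInputInRange().
// true:  the index is used directly, so an out-of-range input reads past the
//        table, which is undefined behaviour.
// false: the index is clamped (an assumed strategy), so an out-of-range input
//        returns some table entry, but never touches memory outside the table.
template<bool AssumeInRange>
constexpr uint8_t rangeLookup(uint32_t input) {
    return kRangeData[(AssumeInRange || input <= kRangeMaxInput)
                          ? input
                          : kRangeMaxInput];
}
```

With AssumeInRange set to false, the extra comparison and select are the instructions that returning true would save.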
template<int Dummy = 0> static constexpr ThreeValue getOutput (uint32_t input)
Get the output that corresponds to a given input.
This defines the lookup table that is to be implemented. The returned value is an integer represented with three-value-logic bits.
static constexpr int getVectorizationWidth ()
Get the width in bits of the integer-vector type to use for vectorized bitwise operations.
Some LUT implementations, such as that enabled by allowBitwiseLUT(), work by converting the lookup table to an equivalent function using bitwise operations. This function specifies how wide those bitwise operations can be.
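To make the idea of a bitwise LUT concrete, here is a minimal sketch for a table with 2-bit inputs and 1-bit outputs. The names kTruthTable and bitwiseLookup are hypothetical, and the sum-of-minterms construction is just one simple way to derive an equivalent function, not necessarily what the library generates. Each 32-bit word carries one input bit for 32 independent lookups, so the word width corresponds to the vectorization width.

```cpp
#include <cstdint>

// Bit i of kTruthTable is the output for input i: {0, 1, 1, 0} for inputs 0..3.
constexpr uint32_t kTruthTable = 0x6u;

// b0 holds bit 0 of 32 independent inputs (one per bit position, or "lane"),
// b1 holds bit 1 of the same inputs. The result holds the 32 one-bit outputs.
// Each term selects the lanes whose input equals one particular table index.
constexpr uint32_t bitwiseLookup(uint32_t b0, uint32_t b1) {
    return ((kTruthTable & 1u) ? (~b1 & ~b0) : 0u)   // lanes with input 0
         | ((kTruthTable & 2u) ? (~b1 &  b0) : 0u)   // lanes with input 1
         | ((kTruthTable & 4u) ? ( b1 & ~b0) : 0u)   // lanes with input 2
         | ((kTruthTable & 8u) ? ( b1 &  b0) : 0u);  // lanes with input 3
}
```

For example, passing b0 = 0xA and b1 = 0xC places inputs 0, 1, 2 and 3 in lanes 0 to 3. A wider integer-vector type would process proportionally more lookups per operation.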
static constexpr int maxMemoryLUTBytes ()
Allow generating a lookup table in memory in flat or constant address space.
This is useful on CPUs, where memory caching is very good. On GPUs, the table is placed in constant memory.
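A memory lookup table of this kind could be generated at compile time roughly as follows. This is a sketch under several assumptions: DemoSpec, MemoryLUTSketch and memoryLookup are hypothetical names, getOutput returns a plain uint8_t instead of ThreeValue to keep the example short, and the return value of maxMemoryLUTBytes() is taken to be an upper bound on the table size in bytes. It requires C++14 or later for the loop in the constexpr constructor.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical spec with 8 entries and a 64-byte memory budget.
struct DemoSpec {
    static constexpr uint32_t getMaxInput() { return 7; }
    static constexpr uint8_t getOutput(uint32_t input) {
        return static_cast<uint8_t>((input * 3u) % 8u);
    }
    static constexpr int maxMemoryLUTBytes() { return 64; }
};

// Builds one byte per input by evaluating Spec::getOutput at compile time.
template<typename Spec>
struct MemoryLUTSketch {
    static constexpr std::size_t kEntries = Spec::getMaxInput() + 1;

    // Assumed meaning of the budget: reject tables larger than the limit.
    static_assert(kEntries * sizeof(uint8_t) <=
                      static_cast<std::size_t>(Spec::maxMemoryLUTBytes()),
                  "table exceeds the memory LUT budget");

    uint8_t data[kEntries];

    constexpr MemoryLUTSketch() : data() {
        for (std::size_t i = 0; i < kEntries; ++i) {
            data[i] = Spec::getOutput(static_cast<uint32_t>(i));
        }
    }
};

// The table itself lives in read-only memory.
constexpr MemoryLUTSketch<DemoSpec> kMemTable{};

// A lookup is a single indexed load.
constexpr uint8_t memoryLookup(uint32_t input) {
    return kMemTable.data[input];
}
```

A real implementation would presumably fall back to another strategy rather than fail to compile when the budget is exceeded; the static_assert here only marks where that decision would be made.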
static constexpr int memoryLUTPackWidth ()
The width, in bits, to use for packed memory lookup tables.
When a memory lookup table is packed, multiple elements can be stored in the same value loaded from memory. The return value of this method specifies how wide the value loaded from memory should be. If this is zero, then packing is not used. Packing is also not used if it is not possible to fit more than one output in an integer of the returned width.
Memory LUT packing is useful on GPUs to reduce the probability that multiple addresses in the constant cache have to be accessed. It is also useful in general because it makes the lookup table smaller. The downside is that extra bitwise operations (some of which are dependencies of the load address) have to be performed in order to extract a single value. On CPUs, this is probably only worthwhile if the LUT is putting pressure on the L1 data cache.
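The trade-off can be seen in a small sketch of a packed lookup. The constants and the function name packedLookup are hypothetical, and the layout (entry i stored in the low-to-high nibbles of word i / 8) is an assumption made for illustration: sixteen 4-bit outputs are stored in two 32-bit words instead of sixteen separate elements.

```cpp
#include <cstdint>

constexpr int kPackWidth = 32;                      // stand-in for memoryLUTPackWidth()
constexpr int kOutputBits = 4;                      // width of one table output
constexpr int kPerWord = kPackWidth / kOutputBits;  // 8 outputs per loaded word

// 16 entries, where entry i holds the value 15 - i.
// Entry 0 sits in the lowest nibble of word 0, entry 7 in its highest nibble.
constexpr uint32_t kPacked[2] = {0x89ABCDEFu, 0x01234567u};

constexpr uint32_t packedLookup(uint32_t input) {
    // The division feeds the load address; the shift and mask then
    // extract the requested output from the loaded word.
    return (kPacked[input / kPerWord] >> ((input % kPerWord) * kOutputBits))
           & ((1u << kOutputBits) - 1u);
}
```

Compared with an unpacked table, each lookup adds a divide (or shift) before the load and a shift and mask after it, in exchange for touching fewer distinct memory words.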