Speclib  0.1.2
The library for writing better CUDA libraries
sp::OutputExpr< OutType, ExprType > Struct Template Reference

An expression, and a place to write it to.

#include <OutputExpr.hpp>

Public Types

using Expr = ExprType
 
using Opts = typename sp::MergeTensorOpts< typename OutType::Opts, typename ExprType::Opts >::type
 
using ValueType = typename OutType::ValueType
 ValueType in ExprType is the type of the result of the expression. For int8_t and int16_t sources, the sources are promoted to int32_t (i.e. int), and ValueType is set to int. This can cause type conversion issues when writing to int8_t or int16_t vectors.
 

Public Member Functions

 OutputExpr (const OutType &output, const ExprType &expr)
 
template<int VectorSize>
void fillIntermediate (const sp::Vec< int, InRank > &index)
 Compute a vector of values of the expression and cache them in intermediate.
 
template<int VectorSize>
void flushIntermediate (const sp::Vec< int, InRank > &index)
 Flush a previously-stored intermediate value to the output.
 

Public Attributes

OutType output
 
Expr expr
 

Static Public Attributes

constexpr static int MaxVectorSize = sp::getMaxGMemVectorSize<ValueType>()
 
constexpr static int InRank = ExprType::Rank
 
constexpr static int OutRank = OutType::Rank
 

Detailed Description

template<typename OutType, typename ExprType>
struct sp::OutputExpr< OutType, ExprType >

An expression, and a place to write it to.

This is a small decorator that holds a TensorExpr and an output object (a TensorLike). All it does is provide a small cache for intermediate results. To compute several expressions in parallel (which may depend on each other), call fillIntermediate() on all of them, then call flushIntermediate() on all of them. That way every expression reads its inputs before any output is overwritten.

The simplest example of how this is useful is swap(), which is implemented as two concurrently-evaluated identity expressions with different output objects.
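
The fill-then-flush pattern can be illustrated with a minimal host-side sketch. It is not the Speclib implementation: ToyOutputExpr and toySwap are hypothetical names, raw pointers stand in for TensorExpr and TensorLike objects, and a plain size_t index replaces sp::Vec< int, InRank >. It only shows why filling every cache before flushing any of them makes a swap safe.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for sp::OutputExpr (NOT the Speclib API): pairs an
// identity "expression" (a source buffer) with an output buffer and a cache.
template <typename T, std::size_t MaxVectorSize>
struct ToyOutputExpr {
    T* output;                                    // where results are written
    const T* expr;                                // identity expression: reads this buffer
    std::array<T, MaxVectorSize> intermediate{};  // cached results

    ToyOutputExpr(T* out, const T* in) : output(out), expr(in) {}

    // Phase 1: evaluate VectorSize values starting at index and cache them.
    template <std::size_t VectorSize>
    void fillIntermediate(std::size_t index) {
        static_assert(VectorSize <= MaxVectorSize, "vector wider than cache");
        for (std::size_t i = 0; i < VectorSize; ++i) {
            intermediate[i] = expr[index + i];
        }
    }

    // Phase 2: write the previously cached values to the output.
    template <std::size_t VectorSize>
    void flushIntermediate(std::size_t index) {
        static_assert(VectorSize <= MaxVectorSize, "vector wider than cache");
        for (std::size_t i = 0; i < VectorSize; ++i) {
            output[index + i] = intermediate[i];
        }
    }
};

// Swap expressed as two identity expressions with crossed outputs.
// Both caches are filled before either output is written, so neither
// expression observes the other's writes.
template <std::size_t VectorSize, typename T>
void toySwap(T* a, T* b, std::size_t n) {
    assert(n % VectorSize == 0);  // sketch only handles whole vectors
    ToyOutputExpr<T, VectorSize> aToB(b, a);
    ToyOutputExpr<T, VectorSize> bToA(a, b);
    for (std::size_t i = 0; i < n; i += VectorSize) {
        aToB.template fillIntermediate<VectorSize>(i);
        bToA.template fillIntermediate<VectorSize>(i);
        aToB.template flushIntermediate<VectorSize>(i);
        bToA.template flushIntermediate<VectorSize>(i);
    }
}
```

Calling toySwap<2>(a, b, 4) on two four-element int arrays exchanges their contents. If each expression were flushed immediately after being filled, the second expression would read values already overwritten by the first.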

Template Parameters
ExprType    The expression type.
OutType     The type of output object.

Member Typedef Documentation

◆ ValueType

template<typename OutType , typename ExprType >
using sp::OutputExpr< OutType, ExprType >::ValueType = typename OutType::ValueType

ValueType in ExprType is the type of the result of the expression. For int8_t and int16_t sources, the sources are promoted to int32_t (i.e. int), and ValueType is set to int. This can cause type conversion issues when writing to int8_t or int16_t vectors.
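
The promotion described above is standard C++ integral promotion, and it can be checked with plain C++ independently of Speclib. In the sketch below, addAsInt and narrowToInt8 are hypothetical helpers introduced only for illustration; they show that arithmetic on int8_t operands yields int, and that writing the result back to an 8-bit destination requires an explicit narrowing step.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Arithmetic on narrow integer operands promotes both sides to int,
// so the result type of the expression is int, not the source type.
static_assert(std::is_same<decltype(std::int8_t{1} + std::int8_t{1}), int>::value,
              "int8_t + int8_t yields int");
static_assert(std::is_same<decltype(std::int16_t{1} + std::int16_t{1}), int>::value,
              "int16_t + int16_t yields int");

// Hypothetical helper: the sum is computed in int, so it can exceed
// the range of int8_t without overflowing.
inline int addAsInt(std::int8_t a, std::int8_t b) {
    return a + b;
}

// Hypothetical helper: explicit, range-checked narrowing back to int8_t,
// as would be needed before storing into an 8-bit output.
inline std::int8_t narrowToInt8(int value) {
    assert(value >= std::numeric_limits<std::int8_t>::min());
    assert(value <= std::numeric_limits<std::int8_t>::max());
    return static_cast<std::int8_t>(value);
}
```

For example, addAsInt(100, 100) produces 200, which fits in int but not in int8_t, which is the kind of mismatch that arises when an int-valued expression is written to an int8_t output.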

Member Function Documentation

◆ fillIntermediate()

template<typename OutType , typename ExprType >
template<int VectorSize>
void sp::OutputExpr< OutType, ExprType >::fillIntermediate ( const sp::Vec< int, InRank > &  index)

Compute a vector of values of the expression and cache them in intermediate.

◆ flushIntermediate()

template<typename OutType , typename ExprType >
template<int VectorSize>
void sp::OutputExpr< OutType, ExprType >::flushIntermediate ( const sp::Vec< int, InRank > &  index)

Flush a previously-stored intermediate value to the output.