Speclib  0.1.2
The library for writing better CUDA libraries
sp::TensorLike< Subclass, TensorRank > Class Template Reference

The interface exposed by anything that is "tensor like".

#include <TensorLike.hpp>


Public Member Functions

auto dims () const
 
bool boundsCheck (const Vec< int, Rank > &pos) const
 
template<int L, CacheMode Mode = CacheMode::DEFAULT>
auto vectorRead (const Vec< int, Rank > &pos) const
 
template<int L, CacheMode Mode = CacheMode::DEFAULT>
auto vectorOffsetRead (const Vec< int, Rank > &base, const Vec< int, Rank > &offset) const
 
template<int L, CacheMode Mode = CacheMode::DEFAULT>
auto maskedVectorRead (const Vec< int, Rank > &pos) const
 
template<int L, CacheMode Mode = CacheMode::DEFAULT, typename T >
void vectorWrite (const Vec< int, Rank > &pos, const Vec< T, L > &values)
 
template<int L, CacheMode Mode = CacheMode::DEFAULT, typename T >
void maskedVectorWrite (const Vec< int, Rank > &pos, const Vec< T, L > &values)
 
int sizeQuantisation () const
 
int totalSize () const
 Get the total memory occupied by the Tensor, in terms of elements.
 
auto getView (const sp::Vec< int, Rank > &start, const sp::Vec< int, Rank > &size)
 Get an object that represents (and aliases) a portion of this object.
 
int dim (int d) const
 Get the d'th dimension. Behaviour common to all TensorLikes.
 
template<CacheMode Mode = CacheMode::DEFAULT>
auto read (const Vec< int, Rank > &pos) const
 Get the element at a given position.
 
template<CacheMode Mode = CacheMode::DEFAULT, typename T >
void write (const Vec< int, Rank > &pos, const T &value)
 Set a single element.
 
template<int Dummy = 0>
int size () const
 
template<int L>
void boundsCheckAccess (Vec< int, Rank > pos) const
 Bounds-check an L-element vector read at pos.
 

Static Public Attributes

constexpr static int Rank = TensorRank
 

Protected Member Functions

template<typename Dummy = void*>
auto dimsImpl () const
 Get an sp::Vec<int, X> representing the dimensions of the object.
 
bool boundsCheckImpl (const sp::Vec< int, Rank > &) const
 Return true iff the given coordinates are inside the object.
 
int sizeQuantisationImpl () const
 The last dimension is rounded up to the next multiple of this value for the purposes of bounds checks.
 
template<int L, CacheMode Mode = CacheMode::DEFAULT>
auto vectorReadImpl (const Vec< int, Rank > &) const
 Read L elements, adjacent in the last dimension, starting at the given position.
 
template<int L, CacheMode Mode = CacheMode::DEFAULT>
auto vectorOffsetReadImpl (const Vec< int, Rank > &base, const Vec< int, Rank > &offset) const
 Read L elements, adjacent in the last dimension, from position base + offset.
 
template<int L, CacheMode Mode = CacheMode::DEFAULT>
auto maskedVectorReadImpl (const Vec< int, Rank > &pos) const
 Do a vectorRead that copes with the possibility of part of the vector being out of bounds.
 
template<int L, CacheMode Mode = CacheMode::DEFAULT, typename T >
void maskedVectorWriteImpl (const Vec< int, Rank > &pos, const Vec< T, L > &values)
 Do a vectorWrite that copes with the possibility of part of the vector being out of bounds.
 
template<int L, CacheMode Mode, typename T >
void vectorWriteImpl (const Vec< int, Rank > &pos, const Vec< T, L > &values)
 Write L elements, adjacent in the last dimension, starting at the given position.
 
auto getViewImpl (const sp::Vec< int, Rank > &start, const sp::Vec< int, Rank > &size)
 The default implementation just makes a TensorView. Not hugely fast, but always works.
 

Detailed Description

template<typename Subclass, int TensorRank>
class sp::TensorLike< Subclass, TensorRank >

The interface exposed by anything that is "tensor like".

This is a curiously recurring template pattern (CRTP) base class that provides an interface and some very basic behaviours for anything that is "tensor-ey".

Examples include Vector, TriangularMatrix, Tensor, or any TensorExpr.

Template Parameters
Subclass    Curiously-recurring template type parameter.
TensorRank  The rank of the tensor.
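
As an illustration of the CRTP arrangement described above, here is a minimal sketch of how a public method can forward to a subclass hook. The names TensorLikeSketch, Matrix4x8 and the tiny Vec type are hypothetical stand-ins, not Speclib's own definitions, and the hook is left public here purely to keep the sketch short (the real hooks are listed as protected).

```cpp
#include <cassert>

// Minimal stand-in for sp::Vec<T, N> (hypothetical; the real class is richer).
template <typename T, int N>
struct Vec {
    T v[N];
    T& operator[](int i) { return v[i]; }
    const T& operator[](int i) const { return v[i]; }
};

// Sketch of a CRTP base: public methods forward to the subclass's *Impl hooks.
template <typename Subclass, int TensorRank>
class TensorLikeSketch {
public:
    static constexpr int Rank = TensorRank;

    // Public entry point: statically dispatch to Subclass::dimsImpl().
    Vec<int, Rank> dims() const { return self().dimsImpl(); }

    // Behaviour shared by every subclass, built on top of dims().
    int dim(int d) const {
        assert(d >= 0 && d < Rank);
        const Vec<int, Rank> all = dims();
        return all[d];
    }

private:
    const Subclass& self() const { return static_cast<const Subclass&>(*this); }
};

// A hypothetical rank-2 subclass that only has to supply dimsImpl().
class Matrix4x8 : public TensorLikeSketch<Matrix4x8, 2> {
public:
    Vec<int, 2> dimsImpl() const { return {4, 8}; }
};
```

With this shape, calling dim(1) on a Matrix4x8 resolves at compile time to Matrix4x8::dimsImpl(), with no virtual dispatch involved.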

Member Function Documentation

◆ boundsCheckAccess()

template<typename Subclass , int TensorRank>
template<int L>
void sp::TensorLike< Subclass, TensorRank >::boundsCheckAccess ( Vec< int, Rank >  pos) const

Bounds-check an L-element vector read at pos.

Implemented using the single-position boundsCheck function, so subclasses need only make that work, and vectorised bounds checks will work correctly.

Assert-fails on failure. No-op in release builds. Sprinkle through your implementations to taste.
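
One way such a check could be built from a single-position boundsCheck is sketched below. It assumes that checking the first and last element of the vector is sufficient, since the L elements are adjacent in the last dimension; whether Speclib does exactly this is not stated here. Grid, Vec and both function names are hypothetical.

```cpp
#include <cassert>

// Minimal stand-in for sp::Vec<T, N> (hypothetical; the real class is richer).
template <typename T, int N>
struct Vec {
    T v[N];
    T& operator[](int i) { return v[i]; }
    const T& operator[](int i) const { return v[i]; }
};

// Hypothetical rank-2 tensor exposing only a single-position bounds check.
struct Grid {
    Vec<int, 2> extent;
    bool boundsCheck(const Vec<int, 2>& pos) const {
        for (int i = 0; i < 2; ++i)
            if (pos[i] < 0 || pos[i] >= extent[i]) return false;
        return true;
    }
};

// True iff an L-element access starting at pos is fully in bounds.
// Assumption: testing both ends of the vector is enough.
template <int L, typename Tensor, int Rank>
bool vectorAccessInBounds(const Tensor& t, Vec<int, Rank> pos) {
    static_assert(L >= 1, "vector length must be positive");
    if (!t.boundsCheck(pos)) return false;  // first element
    pos[Rank - 1] += L - 1;                 // move to the last element
    return t.boundsCheck(pos);
}

// Assert-style wrapper: the check is compiled out when NDEBUG is defined.
template <int L, typename Tensor, int Rank>
void boundsCheckAccessSketch(const Tensor& t, Vec<int, Rank> pos) {
    assert((vectorAccessInBounds<L>(t, pos)));
    (void)t;
    (void)pos;  // silence unused-parameter warnings in release builds
}
```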

◆ boundsCheckImpl()

template<typename Subclass , int Rank>
bool sp::TensorLike< Subclass, Rank >::boundsCheckImpl ( const sp::Vec< int, Rank > &  pos) const
protected

Return true iff the given coordinates are inside the object.

The default is a generic implementation of boundsCheck in terms of dims() and sizeQuantisation().
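
A rough sketch of what a generic check in terms of dims() and sizeQuantisation() might look like follows. It assumes that only the last dimension is padded (as the sizeQuantisationImpl() description suggests) and that coordinates are zero-based. The free-function form and all names are hypothetical.

```cpp
#include <cassert>

// Minimal stand-in for sp::Vec<T, N> (hypothetical; the real class is richer).
template <typename T, int N>
struct Vec {
    T v[N];
    T& operator[](int i) { return v[i]; }
    const T& operator[](int i) const { return v[i]; }
};

// Round n up to the next multiple of q.
inline int roundUp(int n, int q) {
    assert(q > 0);
    return ((n + q - 1) / q) * q;
}

// Generic bounds check: every coordinate must lie in [0, limit), where the
// limit of the last dimension is rounded up to the size quantisation.
template <int Rank>
bool boundsCheckSketch(const Vec<int, Rank>& dims, int sizeQuantisation,
                       const Vec<int, Rank>& pos) {
    for (int i = 0; i < Rank; ++i) {
        const int limit =
            (i == Rank - 1) ? roundUp(dims[i], sizeQuantisation) : dims[i];
        if (pos[i] < 0 || pos[i] >= limit) return false;
    }
    return true;
}
```

For dimensions {3, 10} and a quantisation of 4, the last dimension is treated as 12 elements wide, so column 11 passes the check while column 12 does not. With a quantisation of 1 no padding is assumed and column 10 is already out of bounds.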

◆ dim()

template<typename Subclass , int TensorRank>
int sp::TensorLike< Subclass, TensorRank >::dim ( int  d) const

Behaviour common to all TensorLikes.

Get the d'th dimension.

◆ dimsImpl()

template<typename Subclass , int TensorRank>
template<typename Dummy = void*>
auto sp::TensorLike< Subclass, TensorRank >::dimsImpl ( ) const
protected

Get an sp::Vec<int, X> representing the dimensions of the object.

◆ getView()

template<typename Subclass , int TensorRank>
auto sp::TensorLike< Subclass, TensorRank >::getView ( const sp::Vec< int, Rank > &  start,
const sp::Vec< int, Rank > &  size 
)

Get an object that represents (and aliases) a portion of this object.

◆ getViewImpl()

template<typename Subclass , int Rank>
auto sp::TensorLike< Subclass, Rank >::getViewImpl ( const sp::Vec< int, Rank > &  start,
const sp::Vec< int, Rank > &  size 
)
protected

The default implementation just makes a TensorView. Not hugely fast, but always works.
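
To illustrate the idea of a view that aliases its parent, here is a minimal sketch of a generic view that translates coordinates and forwards every access. It is not Speclib's TensorView; ViewSketch, Buffer2D and Vec are hypothetical, and the forwarding design is only a guess at why a generic view would be slower than a specialised one.

```cpp
#include <cassert>

// Minimal stand-in for sp::Vec<T, N> (hypothetical; the real class is richer).
template <typename T, int N>
struct Vec {
    T v[N];
    T& operator[](int i) { return v[i]; }
    const T& operator[](int i) const { return v[i]; }
};

// Hypothetical 4x4 parent object with scalar read/write.
struct Buffer2D {
    int data[4][4];
    int read(const Vec<int, 2>& p) const { return data[p[0]][p[1]]; }
    void write(const Vec<int, 2>& p, int value) { data[p[0]][p[1]] = value; }
};

// Generic view: stores no elements, only a pointer to the parent plus a window.
template <typename Parent, int Rank>
class ViewSketch {
public:
    ViewSketch(Parent& parent, Vec<int, Rank> start, Vec<int, Rank> size)
        : parent_(&parent), start_(start), size_(size) {}

    Vec<int, Rank> dims() const { return size_; }

    auto read(const Vec<int, Rank>& pos) const { return parent_->read(shift(pos)); }
    void write(const Vec<int, Rank>& pos, int value) { parent_->write(shift(pos), value); }

private:
    // Translate view coordinates into parent coordinates.
    Vec<int, Rank> shift(Vec<int, Rank> pos) const {
        for (int i = 0; i < Rank; ++i) {
            assert(pos[i] >= 0 && pos[i] < size_[i]);
            pos[i] += start_[i];
        }
        return pos;
    }

    Parent* parent_;
    Vec<int, Rank> start_;
    Vec<int, Rank> size_;
};
```

Because the view only holds a pointer and an offset, a write through the view at {0, 1} with start {1, 1} lands in the parent at {1, 2}.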

◆ maskedVectorReadImpl()

template<typename Subclass , int TensorRank>
template<int L, CacheMode Mode = CacheMode::DEFAULT>
auto sp::TensorLike< Subclass, TensorRank >::maskedVectorReadImpl ( const Vec< int, Rank > &  pos) const
protected

Do a vectorRead that copes with the possibility of part of the vector being out of bounds.

This hook exists so that things like texture-backed tensors can implement masked reads cheaply. The default implementation just scalarises the read and is usually quite inefficient.
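
A scalarised masked read could look roughly like the sketch below: one bounds check and one scalar read per lane. The value returned for out-of-bounds lanes is not specified here, so this sketch assumes zero. Line, Vec and maskedVectorReadSketch are hypothetical names.

```cpp
#include <cassert>

// Minimal stand-in for sp::Vec<T, N> (hypothetical; the real class is richer).
template <typename T, int N>
struct Vec {
    T v[N];
    T& operator[](int i) { return v[i]; }
    const T& operator[](int i) const { return v[i]; }
};

// Hypothetical rank-1 tensor over a plain array.
struct Line {
    const float* data;
    int n;
    bool boundsCheck(const Vec<int, 1>& p) const { return p[0] >= 0 && p[0] < n; }
    float read(const Vec<int, 1>& p) const {
        assert(boundsCheck(p));  // the unmasked read requires an in-bounds position
        return data[p[0]];
    }
};

// Scalarised masked read: lanes that fall outside the object are left at zero
// (an assumption of this sketch) instead of being read.
template <int L, typename Tensor, int Rank>
auto maskedVectorReadSketch(const Tensor& t, Vec<int, Rank> pos) {
    using Elem = decltype(t.read(pos));
    Vec<Elem, L> out{};  // value-initialised, so every lane starts at zero
    for (int lane = 0; lane < L; ++lane) {
        if (t.boundsCheck(pos)) out[lane] = t.read(pos);
        ++pos[Rank - 1];  // step along the last dimension
    }
    return out;
}
```

The per-lane branch is what makes this form slow. A backend with hardware-clamped accesses could replace the whole loop with a single instruction.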

◆ maskedVectorWriteImpl()

template<typename Subclass , int TensorRank>
template<int L, CacheMode Mode = CacheMode::DEFAULT, typename T >
void sp::TensorLike< Subclass, TensorRank >::maskedVectorWriteImpl ( const Vec< int, Rank > &  pos,
const Vec< T, L > &  values 
)
protected

Do a vectorWrite that copes with the possibility of part of the vector being out of bounds.

This hook exists so that things like texture-backed tensors can implement masked writes cheaply. The default implementation just scalarises the write and is usually quite inefficient.

◆ read()

template<typename Subclass , int TensorRank>
template<CacheMode Mode = CacheMode::DEFAULT>
auto sp::TensorLike< Subclass, TensorRank >::read ( const Vec< int, Rank > &  pos) const

Get the element at a given position.

◆ sizeQuantisationImpl()

template<typename Subclass , int Rank>
int sp::TensorLike< Subclass, Rank >::sizeQuantisationImpl ( ) const
protected

The last dimension is rounded up to the next multiple of this value for the purposes of bounds checks.

The default is a generic implementation of sizeQuantisation() that adds no padding at all.

This allows you to allocate padding (mostly to facilitate vector loads) while still having bounds checks. Since this is only used from the bounds-checking code, it is unlikely to be used outside of debug builds unless some subclass does so.

◆ totalSize()

template<typename Subclass , int TensorRank>
int sp::TensorLike< Subclass, TensorRank >::totalSize ( ) const

Get the total memory occupied by the Tensor, in terms of elements.

◆ vectorOffsetReadImpl()

template<typename Subclass , int TensorRank>
template<int L, CacheMode Mode = CacheMode::DEFAULT>
auto sp::TensorLike< Subclass, TensorRank >::vectorOffsetReadImpl ( const Vec< int, Rank > &  base,
const Vec< int, Rank > &  offset 
) const
protected

Read L elements, adjacent in the last dimension, from position base + offset.

This API exists for situations where compilers don't currently understand how to convert your code into offset loads. Calling it repeatedly with the same base and with offsets that can be converted to immediates typically leads to a nice speedup.

This is especially useful for leveraging texture offset instructions.

The default implementation just adds the coordinates together and calls vectorRead.
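
The default behaviour described above (add the coordinates, then call vectorRead) can be sketched as follows. Row, addCoords and vectorOffsetReadSketch are hypothetical names used for illustration only.

```cpp
#include <cassert>

// Minimal stand-in for sp::Vec<T, N> (hypothetical; the real class is richer).
template <typename T, int N>
struct Vec {
    T v[N];
    T& operator[](int i) { return v[i]; }
    const T& operator[](int i) const { return v[i]; }
};

// Element-wise sum of two coordinate vectors.
template <int N>
Vec<int, N> addCoords(const Vec<int, N>& a, const Vec<int, N>& b) {
    Vec<int, N> out{};
    for (int i = 0; i < N; ++i) out[i] = a[i] + b[i];
    return out;
}

// Hypothetical rank-1 tensor with a plain (non-offset) vector read.
struct Row {
    const int* data;
    template <int L>
    Vec<int, L> vectorRead(const Vec<int, 1>& pos) const {
        assert(pos[0] >= 0);
        Vec<int, L> out{};
        for (int i = 0; i < L; ++i) out[i] = data[pos[0] + i];
        return out;
    }
};

// Fallback offset read: fold the offset into the position and reuse vectorRead.
template <int L, typename Tensor, int Rank>
auto vectorOffsetReadSketch(const Tensor& t, const Vec<int, Rank>& base,
                            const Vec<int, Rank>& offset) {
    return t.template vectorRead<L>(addCoords(base, offset));
}
```

A caller would typically keep one base and issue several reads with compile-time-constant offsets, for example offsets {0} and {4} from base {4} to fetch elements 4..7 and 8..11.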

◆ vectorReadImpl()

template<typename Subclass , int TensorRank>
template<int L, CacheMode Mode = CacheMode::DEFAULT>
auto sp::TensorLike< Subclass, TensorRank >::vectorReadImpl ( const Vec< int, Rank > &  ) const
protected

Read L elements, adjacent in the last dimension, starting at the given position.

Implementors should use vector load instructions where possible.

◆ vectorWriteImpl()

template<typename Subclass , int TensorRank>
template<int L, CacheMode Mode, typename T >
void sp::TensorLike< Subclass, TensorRank >::vectorWriteImpl ( const Vec< int, Rank > &  pos,
const Vec< T, L > &  values 
)
protected

Write L elements, adjacent in the last dimension, starting at the given position.

Implementors should use vector store instructions where possible.

◆ write()

template<typename Subclass , int TensorRank>
template<CacheMode Mode = CacheMode::DEFAULT, typename T >
void sp::TensorLike< Subclass, TensorRank >::write ( const Vec< int, Rank > &  pos,
const T &  value 
)

Set a single element.