Speclib  0.1.2
The library for writing better CUDA libraries
Memory Manipulators

Functions for manipulating memory, such as less annoying versions of memset/malloc/etc.

Classes

class  sp::LazyGpuBuffer< T, Zero >
 A buffer that is lazily-allocated using double-checked locking.
 

Typedefs

template<typename T >
using sp::AlignedBytesFor = AlignedBytes< alignof(T), sizeof(T)>
 

Enumerations

enum class  sp::CacheMode { CacheMode::DEFAULT , CacheMode::STREAMING , CacheMode::SKIP_L1 , CacheMode::CONSTANT }
 

Functions

template<typename T >
constexpr int sp::getMaxGMemVectorSize ()
 Get the maximum vector instruction size provided for global memory operations of the given type.
 
template<typename T >
constexpr int sp::getMaxSMemVectorSize ()
 
template<typename T >
constexpr T sp::getGCDPowerOfTwo (T n)
 
template<typename T , int Size>
constexpr int sp::getGPUAlignment ()
 Get the alignment requirements of a sp::Vec<T, Size> on the GPU, in bytes.
 
template<typename T , typename PT >
bool sp::isAligned (PT ptr) noexcept
 Check if a pointer is aligned for a given datatype.
 
template<typename T , typename Q >
T sp::bitcast (const Q &value)
 Perform a type-unsafe reinterpret cast, avoiding undefined behaviour.
 
template<typename T >
void sp::mallocDeleter (T *p)
 Deleter for things allocated with malloc().
 
template<typename T >
void sp::arrayDeleter (T *p)
 A deleter for things allocated with new[].
 
template<typename T >
__host__ char * sp::arrayAllocator (int n)
 Allocate memory for n items of type T.
 
template<int L, CacheMode M = CacheMode::DEFAULT, typename T >
auto sp::vectorMemoryRead (const T *ptr)
 Read from a pointer using a vector load instruction, applying a caching mode hint if supported on the target.
 
template<typename ValueT , int L, CacheMode M = CacheMode::DEFAULT, typename PtrT >
void sp::vectorMemoryWrite (PtrT *ptr, sp::ThinVec< ValueT, L > values)
 Write to a pointer on the GPU using a vector store instruction.
 
template<typename T , int L>
__device__ void sp::vectorMemoryWrite (const __constant T *, sp::Vec< T, L >)
 

Detailed Description

Functions for manipulating memory, such as less annoying versions of memset/malloc/etc.

Enumeration Type Documentation

◆ CacheMode

enum class sp::CacheMode
strong
Enumerator
DEFAULT 

Default caching behaviour.

STREAMING 

Cache lines are flagged for first-eviction. Use for data that is read or written only once.

SKIP_L1 

Do not cache this load in L1, but do cache in L2 and elsewhere.

CONSTANT 

Use the non-coherent cache. Usually performs better if data is constant, explodes if not.
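
As a rough illustration of what these hints correspond to on NVIDIA hardware, the sketch below maps each enumerator to the PTX load form that conventionally expresses it. The enum is a local mirror of sp::CacheMode, and ptxLoadFor is a hypothetical helper that is not part of Speclib. How the library actually lowers each mode, and what it does on non-NVIDIA targets, is not stated here, so treat the mapping as an assumption.

```cpp
#include <cassert>
#include <string>

// Local mirror of sp::CacheMode so the sketch compiles without Speclib.
enum class CacheMode { DEFAULT, STREAMING, SKIP_L1, CONSTANT };

// Hypothetical helper (not part of Speclib): the PTX load form that
// conventionally expresses each hint on NVIDIA targets.
inline std::string ptxLoadFor(CacheMode mode) {
    switch (mode) {
        case CacheMode::STREAMING: return "ld.global.cs";  // evict-first policy
        case CacheMode::SKIP_L1:   return "ld.global.cg";  // cache in L2, bypass L1
        case CacheMode::CONSTANT:  return "ld.global.nc";  // non-coherent read-only path
        case CacheMode::DEFAULT:   break;
    }
    return "ld.global.ca";  // cache at all levels
}

// Usage: the streaming and skip-L1 hints select different cache operators.
inline void checkCacheModeSketch() {
    assert(ptxLoadFor(CacheMode::STREAMING) != ptxLoadFor(CacheMode::SKIP_L1));
}
```

In CUDA C++ the same four behaviours are exposed through the __ldca, __ldcs, __ldcg and __ldg intrinsics respectively.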

Function Documentation

◆ arrayAllocator()

template<typename T >
__host__ char * sp::arrayAllocator ( int  n)

Allocate memory for n items of type T.

◆ arrayDeleter()

template<typename T >
void sp::arrayDeleter ( T *  p)

A deleter for things allocated with new[].

◆ bitcast()

template<typename T , typename Q >
T sp::bitcast ( const Q &  value)

Perform a type-unsafe reinterpret cast, avoiding undefined behaviour.

Equivalent to C++20 std::bit_cast.

See the manual for details.
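
For readers without C++20, the usual way to get std::bit_cast semantics is a memcpy of the object representation. The sketch below shows that technique under the name bitcastSketch, a hypothetical stand-in. It is not Speclib's implementation, and unlike std::bit_cast it requires T to be default-constructible.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for sp::bitcast; illustrates the memcpy technique only.
template <typename T, typename Q>
T bitcastSketch(const Q& value) {
    static_assert(sizeof(T) == sizeof(Q), "source and destination must have the same size");
    static_assert(std::is_trivially_copyable<T>::value && std::is_trivially_copyable<Q>::value,
                  "both types must be trivially copyable");
    T out;
    // Copy the bytes instead of dereferencing a reinterpret_cast'd pointer,
    // which would violate the strict aliasing rules.
    std::memcpy(&out, &value, sizeof(T));
    return out;
}

// Usage: convert a float to its bit pattern and back.
inline void bitcastRoundTrip() {
    const float x = 1.0f;
    const std::uint32_t bits = bitcastSketch<std::uint32_t>(x);
    assert(bitcastSketch<float>(bits) == x);  // the round trip preserves the value
}
```

Compilers typically optimise the memcpy away, so the copy costs nothing at run time.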

◆ getGPUAlignment()

template<typename T , int Size>
constexpr int sp::getGPUAlignment ( )
constexpr

Get the alignment requirements of a sp::Vec<T, Size> on the GPU, in bytes.

◆ getMaxGMemVectorSize()

template<typename T >
constexpr int sp::getMaxGMemVectorSize ( )
constexpr

Get the maximum vector instruction size provided for global memory operations of the given type.

◆ isAligned()

template<typename T , typename PT >
bool sp::isAligned ( PT  ptr)
noexcept

Check if a pointer is aligned for a given datatype.
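
A common way to implement such a check is to test the numeric address against the type's alignment. The sketch below does this with alignof(T) under the hypothetical name isAlignedSketch. Speclib may use a different alignment (for example a GPU-specific one), so this is an assumption about the general technique, not a description of the library's code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for sp::isAligned<T>(ptr); PT is expected to be a pointer type.
template <typename T, typename PT>
bool isAlignedSketch(PT ptr) noexcept {
    // Treat the pointer as an integer address and test divisibility by alignof(T).
    return reinterpret_cast<std::uintptr_t>(ptr) % alignof(T) == 0;
}

// Usage: a buffer aligned to 8 bytes, probed at its start and one byte in.
inline void checkAlignmentSketch() {
    alignas(8) unsigned char storage[16] = {};
    assert(isAlignedSketch<std::uint64_t>(storage));
    assert(!isAlignedSketch<std::uint64_t>(storage + 1));
}
```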

◆ mallocDeleter()

template<typename T >
void sp::mallocDeleter ( T *  p)

Deleter for things allocated with malloc().
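
A deleter with this signature is typically passed to a smart pointer so that malloc()-allocated memory is released with free() rather than delete. The sketch below uses mallocDeleterSketch, a hypothetical stand-in with the same signature. The MallocPtr alias and makeInts helper are illustrative names, not Speclib API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in with the same signature as sp::mallocDeleter.
template <typename T>
void mallocDeleterSketch(T* p) {
    std::free(p);
}

// Owning pointer whose destructor calls the deleter above instead of delete.
template <typename T>
using MallocPtr = std::unique_ptr<T, void (*)(T*)>;

// Illustrative helper: allocate n ints with malloc and hand ownership to MallocPtr.
inline MallocPtr<int> makeInts(std::size_t n) {
    return MallocPtr<int>(static_cast<int*>(std::malloc(n * sizeof(int))),
                          &mallocDeleterSketch<int>);
}

// Usage: the buffer is freed automatically when `values` goes out of scope.
inline void useMallocPtr() {
    MallocPtr<int> values = makeInts(4);
    assert(values != nullptr);
    values.get()[0] = 7;
    assert(values.get()[0] == 7);
}
```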

◆ vectorMemoryRead()

template<int L, CacheMode M = CacheMode::DEFAULT, typename T >
auto sp::vectorMemoryRead ( const T *  ptr)

Read from a pointer using a vector load instruction, applying a caching mode hint if supported on the target.

Template Parameters
L  Length of memory block to read. Lengths unsupported by the device will be automatically managed.
Parameters
ptr  Pointer to the intended start location in device memory.
Returns
Pointer to an sp::Vec of type T and length L cast from ptr.
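
The documentation does not say how unsupported lengths are "automatically managed". One plausible reading is that a request is decomposed into several narrower vector operations. The host-side sketch below illustrates that idea only. The function splitIntoVectorOps and its power-of-two splitting policy are assumptions for illustration, not Speclib's documented behaviour.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: split a request for `length` elements into power-of-two
// chunks, none wider than `maxWidth` (assumed to be a power of two >= 1).
inline std::vector<int> splitIntoVectorOps(int length, int maxWidth) {
    std::vector<int> chunks;
    int width = maxWidth;
    while (length > 0) {
        // Shrink the chunk width until it fits in what is left.
        while (width > length) {
            width /= 2;
        }
        chunks.push_back(width);
        length -= width;
    }
    return chunks;
}

// Usage: 7 elements with a maximum width of 4 become a 4-wide, a 2-wide
// and a 1-wide operation.
inline void checkSplitSketch() {
    const std::vector<int> expected{4, 2, 1};
    assert(splitIntoVectorOps(7, 4) == expected);
}
```

A real implementation would also have to respect the pointer's alignment when choosing each chunk width.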

◆ vectorMemoryWrite()

template<typename ValueT , int L, CacheMode M = CacheMode::DEFAULT, typename PtrT >
void sp::vectorMemoryWrite ( PtrT *  ptr,
sp::ThinVec< ValueT, L >  values 
)

Write to a pointer on the GPU using a vector store instruction.

Template Parameters
L  Length of memory block to write. Lengths unsupported by the device will be automatically managed.