Speclib  0.1.2
The library for writing better CUDA libraries
API Support

Tools to help implement irritating C-style APIs. More...

Modules

 Enum Validation
 A general mechanism for checking if an enum value is in range.
 

Classes

class  sp::Context
 A class to hold library-global state. More...
 
struct  sp::PtrScalar< T, Location >
 Represents a scalar value that has been passed as a pointer. More...
 
struct  sp::is_ptr_scalar< typename >
 Type-trait for detecting if a given type is a PtrScalar of some sort. More...
 
class  sp::IndirectScalar< T >
 A self-dereferencing pointer. More...
 
struct  sp::is_indirect_scalar< typename >
 Type-trait for detecting if a given type is an IndirectScalar of some sort. More...
 
struct  sp::CudaVariantScalar< Location >
 Similar to a PtrScalar, but with the added quirk that the type of the scalar is not known at compile-time. More...
 
struct  sp::is_variant_scalar< typename >
 Trait to detect if something is a CudaVariantScalar. More...
 
struct  sp::VariantOutputPtr
 An output pointer represented as a void* and a cudaDataType_t. More...
 
struct  sp::CudaVec< T, Size >
 Provides a type mapping between CUDA vector types and sp::Vec. More...
 
struct  sp::TensorDescriptor< Type, Rank >
 A fully dynamic entity that can be converted into a proper Tensor at runtime. More...
 

Macros

#define CUDA_VEC_E(TYPE, SIZE, CUDATYPE)
 
#define CUDA_VEC(TYPE, SIZE)   CUDA_VEC_E(TYPE, SIZE, TYPE ## SIZE)
 

Typedefs

template<typename T , int Size>
using sp::CudaVec_t = typename CudaVec< std::remove_cv_t< T >, Size >::type
 The CUDA vector type representing a vector of type T and length Size, such as float4 or int2. More...
 
template<typename T , int Size>
using sp::ThinCudaVec_t = typename CudaVec< std::remove_cv_t< T >, Size >::thinType
 Like CudaVec_t, except that if Size is 1 this is equal to T. More...
 

Enumerations

enum class  sp::PtrLocation { HOST , DEVICE , UNKNOWN }
 Represents which device a pointer resides on, if known. More...
 

Functions

template<typename T >
void sp::outputToAddress (sp::Stream &s, __flat T *outputAddress, __device const T *stagingAddress)
 Conditionally copy a result back to the host. More...
 
template<typename T >
 sp::PtrScalar (__device const T *) -> PtrScalar< T, PtrLocation::DEVICE >
 
template<typename T >
cudaDataType_t sp::asCudaType ()
 Convert a C++ type to a CUDA type enumeration value. More...
 

Detailed Description

Tools to help implement irritating C-style APIs.

This is helpful when implementing C-style entry points: for example, APIs that pass scalar arguments by pointer (possibly residing on the device), describe element types with runtime enumerations such as cudaDataType_t, or accept output addresses that may live on either the host or a device.

Macro Definition Documentation

◆ CUDA_VEC_E

#define CUDA_VEC_E(TYPE, SIZE, CUDATYPE)
Value:
template<> \
struct CudaVec<TYPE, SIZE> { \
    using type = CUDATYPE; \
};
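
For illustration, a sketch of how these macros might be invoked to register mappings (assuming the expansion shown above; the specific registrations here are hypothetical examples, not a list of what the library actually provides):

// Expands to CUDA_VEC_E(float, 4, float4), specializing CudaVec<float, 4>
// so that CudaVec<float, 4>::type is float4.
CUDA_VEC(float, 4)

// When the CUDA type name is not simply TYPE ## SIZE, CUDA_VEC_E lets the
// mapping be spelled out explicitly:
CUDA_VEC_E(unsigned int, 2, uint2)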

Typedef Documentation

◆ CudaVec_t

template<typename T , int Size>
using sp::CudaVec_t = typename CudaVec<std::remove_cv_t<T>, Size>::type

The CUDA vector type representing a vector of type T and length Size, such as float4 or int2.
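
For example, a sketch assuming the corresponding mappings have been registered via the CUDA_VEC macros above (needs <type_traits>):

static_assert(std::is_same_v<sp::CudaVec_t<float, 4>, float4>);
static_assert(std::is_same_v<sp::CudaVec_t<const int, 2>, int2>);  // cv-qualifiers are removed first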

◆ ThinCudaVec_t

template<typename T , int Size>
using sp::ThinCudaVec_t = typename CudaVec<std::remove_cv_t<T>, Size>::thinType

Like CudaVec_t, except that if Size is 1 this is equal to T.
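
A sketch of the Size == 1 behaviour (again assuming the float mappings are registered):

static_assert(std::is_same_v<sp::ThinCudaVec_t<float, 4>, float4>);
static_assert(std::is_same_v<sp::ThinCudaVec_t<float, 1>, float>);  // Size == 1 collapses to plain T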

Enumeration Type Documentation

◆ PtrLocation

enum class sp::PtrLocation

Represents which device a pointer resides on, if known.
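
For instance, a hypothetical helper dispatching on the enumerators listed above (the helper itself is illustrative and not part of the library):

const char *describe(sp::PtrLocation loc) {
    switch (loc) {
        case sp::PtrLocation::HOST:    return "host pointer";
        case sp::PtrLocation::DEVICE:  return "device pointer";
        case sp::PtrLocation::UNKNOWN: return "location not known";
    }
    return "invalid";
}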

Function Documentation

◆ asCudaType()

template<typename T >
cudaDataType_t sp::asCudaType ( )

Convert a C++ type to a CUDA type enumeration value.
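
Illustrative usage (the exact set of supported specializations is not listed here; the enumerators below are the standard ones from CUDA's library_types.h, and the results are what one would presumably expect):

cudaDataType_t a = sp::asCudaType<float>();   // presumably CUDA_R_32F
cudaDataType_t b = sp::asCudaType<double>();  // presumably CUDA_R_64F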

◆ outputToAddress()

template<typename T >
void sp::outputToAddress (sp::Stream &s, __flat T *outputAddress, __device const T *stagingAddress)

Conditionally copy a result back to the host.

Some APIs allow the output address to be either a host or device pointer. Kernels always produce their results on the GPU, so the result sometimes needs to be copied back to the host (or to another device address) afterwards.

If they are the same address, nothing happens. Otherwise a copy (potentially between devices) occurs. This allows values to be returned to the host, or to other devices, after a library call.

Parameters
    s               The stream to enqueue any copies on.
    outputAddress   The address the caller wants the result to end up at.
    stagingAddress  The GPU address where the result is due to be written.
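
A sketch of the intended call pattern (the variable names are hypothetical: 'stream' is an sp::Stream, 'resultStaging' is a device pointer a kernel has written its result to, and 'userOutput' is the address supplied by the API caller, which may live on the host or on any device):

// After the kernel has written its result to resultStaging on the GPU,
// forward it to wherever the caller asked for it.
sp::outputToAddress(stream, userOutput, resultStaging);
// If userOutput == resultStaging nothing is enqueued; otherwise a copy is
// enqueued on 'stream' so the value arrives at userOutput once the stream syncs.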