libSCALE  0.2.0
A modern C++ CUDA API
sp::CudaKernel Class Reference

Object representing a kernel.

#include <CudaKernel.hpp>

Public Member Functions

 operator const void * () const
 Implicitly convert to const void* so the object can be passed directly to CUDA API functions.
 
cudaFuncAttributes getAttributes () const
 Query function attributes.
 
void setAttribute (cudaFuncAttribute attr, int value) const
 
void setCacheConfig (cudaFuncCache config)
 Set the cache mode.
 
void setSharedMemConfig (cudaSharedMemConfig config)
 Set the shared memory mode.
 
int getMaxActiveBlocksPerSM (int blockSize, size_t dynamicSMemSize)
 
int getMaxActiveBlocksPerSMIgnoreCaching (int blockSize, size_t dynamicSmemSize)
 

Static Public Member Functions

static CudaKernel & get (const void *func)
 Get the global object representing the given kernel.
 
template<typename... Args>
static CudaKernel & get (void func(Args...))
 

Detailed Description

Object representing a kernel.

A single, global CudaKernel object exists for every GPU kernel in the program. This object stores the per-kernel state that can be manipulated via the API, such as the cache mode and the shared memory bank size.

This object's constructors are private. Use CudaKernel::get() to obtain the object for a given kernel.
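The "one global object per kernel, private constructors, obtained through get()" design can be illustrated with a minimal sketch. The code below is not libSCALE's implementation: KernelHandle, setCacheMode, cacheMode, kernelA and kernelB are hypothetical names, the kernels are ordinary host functions, and the registry is just one plausible way to get this behaviour. It assumes a platform where a function pointer can be converted to const void* with reinterpret_cast, which mainstream compilers support.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for a per-kernel state object.
// This is NOT libSCALE's CudaKernel; all names are illustrative.
class KernelHandle {
public:
    // Return the single handle associated with `func`, creating it on first use.
    static KernelHandle& get(const void* func) {
        static std::mutex mutex;
        static std::unordered_map<const void*, std::unique_ptr<KernelHandle>> registry;

        std::lock_guard<std::mutex> lock(mutex);
        std::unique_ptr<KernelHandle>& slot = registry[func];
        if (!slot) {
            // The constructor is private, so construct here rather than via make_unique.
            slot.reset(new KernelHandle(func));
        }
        return *slot;
    }

    // Typed convenience overload: takes the function itself and forwards
    // its address to the untyped overload.
    template <typename... Args>
    static KernelHandle& get(void func(Args...)) {
        return get(reinterpret_cast<const void*>(func));
    }

    // Example of per-kernel state: a hypothetical integer cache-mode setting.
    void setCacheMode(int mode) { cacheMode_ = mode; }
    int cacheMode() const { return cacheMode_; }
    const void* function() const { return func_; }

    KernelHandle(const KernelHandle&) = delete;
    KernelHandle& operator=(const KernelHandle&) = delete;

private:
    explicit KernelHandle(const void* func) : func_(func) {}

    const void* func_;
    int cacheMode_ = 0;
};

// Two ordinary host functions standing in for GPU kernels.
void kernelA(int*, int) {}
void kernelB(float*) {}

// Usage: both overloads resolve to the same object for the same function.
void demoRegistry() {
    KernelHandle& viaTyped = KernelHandle::get(kernelA);
    KernelHandle& viaUntyped =
        KernelHandle::get(reinterpret_cast<const void*>(&kernelA));
    assert(&viaTyped == &viaUntyped);                  // one object per kernel
    assert(&viaTyped != &KernelHandle::get(kernelB));  // distinct kernels differ

    viaTyped.setCacheMode(2);             // state set through one reference...
    assert(viaUntyped.cacheMode() == 2);  // ...is visible through the other
}
```

Because every call to get() for the same function returns the same object, a setting applied in one part of a program is seen everywhere else that looks up that kernel, which is the property the description above relies on.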

When doing kernel launches there are several options to choose from.

See also
Stream::launchKernel

Member Function Documentation

◆ get()

static CudaKernel & sp::CudaKernel::get ( const void *  func)

Get the global object representing the given kernel.

◆ getAttributes()

cudaFuncAttributes sp::CudaKernel::getAttributes ( ) const

Query function attributes.

See also
cudaFuncGetAttributes()

◆ getMaxActiveBlocksPerSM()

int sp::CudaKernel::getMaxActiveBlocksPerSM ( int  blockSize,
size_t  dynamicSMemSize 
)

◆ getMaxActiveBlocksPerSMIgnoreCaching()

int sp::CudaKernel::getMaxActiveBlocksPerSMIgnoreCaching ( int  blockSize,
size_t  dynamicSmemSize 
)

◆ operator const void *()

sp::CudaKernel::operator const void * ( ) const

Implicitly convert to const void* so the object can be passed directly to CUDA API functions.
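As a rough illustration of why such a conversion operator is convenient, the sketch below shows a wrapper being passed to a C-style function that identifies a function by const void*, in the way CUDA runtime calls such as cudaFuncGetAttributes take a const void* function argument. KernelRef, apiSeesSameFunction and dummyKernel are hypothetical names introduced for this example; no CUDA calls are made, and this is not libSCALE's code.

```cpp
#include <cassert>

// Hypothetical wrapper holding a kernel's address (not libSCALE's CudaKernel).
class KernelRef {
public:
    explicit KernelRef(const void* func) : func_(func) {}

    // Implicit conversion: lets a KernelRef be passed wherever a
    // `const void*` function handle is expected.
    operator const void*() const { return func_; }

private:
    const void* func_;
};

// Stand-in for a C-style API that identifies a kernel by `const void*`.
bool apiSeesSameFunction(const void* func, const void* expected) {
    return func == expected;
}

// Ordinary host function standing in for a GPU kernel.
void dummyKernel(int*) {}

// Usage: the wrapper is passed directly, with no explicit cast at the call site.
void demoConversion() {
    const void* raw = reinterpret_cast<const void*>(&dummyKernel);
    KernelRef ref(raw);

    assert(apiSeesSameFunction(ref, raw));  // implicit conversion at the call

    const void* converted = ref;            // implicit conversion in an initialiser
    assert(converted == raw);
}
```

The conversion is deliberately non-explicit in this sketch so that call sites stay free of casts; the trade-off is that the wrapper also converts silently in contexts where a raw pointer may not have been intended.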

◆ setCacheConfig()

void sp::CudaKernel::setCacheConfig ( cudaFuncCache  config)

Set the cache mode.

See also
cudaFuncSetCacheConfig()

◆ setSharedMemConfig()

void sp::CudaKernel::setSharedMemConfig ( cudaSharedMemConfig  config)

Set the shared memory mode.

See also
cudaFuncSetSharedMemConfig()