Object representing a kernel.
#include <CudaKernel.hpp>
Public Member Functions

    operator const void * () const
        Implicitly convert to void* so this works in CUDA API functions.
    cudaFuncAttributes getAttributes () const
        Query function attributes.
    void setAttribute (cudaFuncAttribute attr, int value) const
    void setCacheConfig (cudaFuncCache config)
        Set the cache mode.
    void setSharedMemConfig (cudaSharedMemConfig config)
        Set the shared memory mode.
    int getMaxActiveBlocksPerSM (int blockSize, size_t dynamicSMemSize)
    int getMaxActiveBlocksPerSMIgnoreCaching (int blockSize, size_t dynamicSmemSize)

Static Public Member Functions

    static CudaKernel & get (const void *func)
        Get the global object representing the given kernel.
    template<typename... Args>
    static CudaKernel & get (void func(Args...))
Object representing a kernel.

A single, global CudaKernel object exists for every GPU kernel in the program. This object stores the per-kernel state that can be manipulated via the API, such as the cache mode, the shared memory bank size, and so on.

This object's constructors are private. Use CudaKernel::get() for initialisation.

When doing kernel launches there are several options to choose from.
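As a minimal sketch of the intended flow (the scale kernel below is hypothetical and only used for illustration; the CudaKernel members are the ones documented on this page), obtain the global object via CudaKernel::get() and adjust its per-kernel state before launching:

    #include <cuda_runtime.h>
    #include <CudaKernel.hpp>

    // Hypothetical kernel, used for illustration only.
    __global__ void scale(float* data, float factor, int n)
    {
        const int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;
    }

    void configureScale()
    {
        // One global CudaKernel object exists per GPU kernel.
        sp::CudaKernel& kernel = sp::CudaKernel::get(scale);

        // Per-kernel state such as the cache mode can then be adjusted.
        kernel.setCacheConfig(cudaFuncCachePreferL1);
    }

The same object can later be queried for function attributes or occupancy, as described in the member documentation below.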
static CudaKernel & sp::CudaKernel::get ( const void * func )

Get the global object representing the given kernel.
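Both overloads resolve to the same global object; a sketch, continuing the hypothetical scale kernel from the description above:

    // Typed overload: the parameter pack is deduced from the kernel's signature.
    sp::CudaKernel& byFunction = sp::CudaKernel::get(scale);

    // const void* overload: for code that only holds a type-erased handle.
    const void* handle = reinterpret_cast<const void*>(&scale);
    sp::CudaKernel& byHandle = sp::CudaKernel::get(handle);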
cudaFuncAttributes sp::CudaKernel::getAttributes ( ) const

Query function attributes.
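The returned struct is the standard CUDA runtime cudaFuncAttributes, so fields such as maxThreadsPerBlock, numRegs and sharedSizeBytes are available. A sketch, reusing the hypothetical scale kernel from above:

    #include <cstdio>

    void printScaleAttributes()
    {
        const cudaFuncAttributes attr = sp::CudaKernel::get(scale).getAttributes();
        std::printf("max threads/block: %d, registers: %d, static smem: %zu bytes\n",
                    attr.maxThreadsPerBlock, attr.numRegs, attr.sharedSizeBytes);
    }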
int sp::CudaKernel::getMaxActiveBlocksPerSM ( int blockSize, size_t dynamicSMemSize )
int sp::CudaKernel::getMaxActiveBlocksPerSMIgnoreCaching ( int blockSize, size_t dynamicSmemSize )
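No brief description accompanies these two queries. Judging by the names and signatures, they mirror the CUDA runtime occupancy query cudaOccupancyMaxActiveBlocksPerMultiprocessor, with the second variant presumably ignoring the cache configuration. A hedged sketch with illustrative values, reusing the hypothetical scale kernel from above:

    void queryOccupancy()
    {
        sp::CudaKernel& kernel = sp::CudaKernel::get(scale);

        const int    blockSize   = 256; // threads per block (illustrative)
        const size_t dynamicSmem = 0;   // dynamic shared memory per block, in bytes

        // Upper bound on resident blocks per SM for this launch configuration.
        const int blocksPerSM = kernel.getMaxActiveBlocksPerSM(blockSize, dynamicSmem);

        // Same query, but (presumably) without the cache configuration taken into account.
        const int blocksPerSMNoCache =
            kernel.getMaxActiveBlocksPerSMIgnoreCaching(blockSize, dynamicSmem);

        (void)blocksPerSM;
        (void)blocksPerSMNoCache;
    }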
sp::CudaKernel::operator const void * ( ) const

Implicitly convert to void* so this works in CUDA API functions.
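Thanks to this conversion, the object can be handed straight to runtime functions that take a const void* kernel handle, for example cudaFuncGetAttributes. A sketch, with the hypothetical scale kernel as above:

    void queryViaRuntime()
    {
        sp::CudaKernel& kernel = sp::CudaKernel::get(scale);

        // The implicit conversion supplies the const void* the runtime expects.
        cudaFuncAttributes attr;
        cudaFuncGetAttributes(&attr, kernel);
    }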
void sp::CudaKernel::setCacheConfig ( cudaFuncCache config )

Set the cache mode.
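The argument is the standard cudaFuncCache enum. For example, a kernel that relies heavily on shared memory might prefer the shared-memory carveout (sketch, hypothetical scale kernel as above):

    void preferSharedMemory()
    {
        // Other standard values: cudaFuncCachePreferNone, cudaFuncCachePreferL1,
        // cudaFuncCachePreferEqual.
        sp::CudaKernel::get(scale).setCacheConfig(cudaFuncCachePreferShared);
    }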
void sp::CudaKernel::setSharedMemConfig ( cudaSharedMemConfig config )

Set the shared memory mode.
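The argument is the standard cudaSharedMemConfig enum. For example, eight-byte banks can reduce bank conflicts for kernels that mostly access double-precision data (sketch, hypothetical scale kernel as above):

    void preferEightByteBanks()
    {
        // Other standard values: cudaSharedMemBankSizeDefault,
        // cudaSharedMemBankSizeFourByte.
        sp::CudaKernel::get(scale).setSharedMemConfig(cudaSharedMemBankSizeEightByte);
    }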