Speclib  0.1.2
The library for writing better CUDA libraries
sp::StaticKernel< Subclass, Grid > Class Template Reference

In the special case where your Grid definition is fully constexpr, correct behaviour requires you to pass the same values both to launch(s, dynamicSMem) and to the launch-bounds template parameter of launch, which is tedious and error-prone.

#include <Kernel.hpp>


Public Member Functions

template<typename LaunchBoundsT = sp::LaunchBounds<Grid::threadsPerBlock(), 1>>
__host__ void launch (sp::Stream &stream, int dynamicSMem=0)
 
template<typename LaunchBoundsT = sp::LaunchBounds<MAX_THREADS_PER_BLOCK, 1>>
__host__ void launch (sp::Vec< int, Grid::GridRank > numBlocks, sp::Vec< int, Grid::BlockRank > numThreads, cudaStream_t stream, int dynamicSMem=0)
 
Public Member Functions inherited from sp::Kernel< Subclass, Grid >
__device__ void operator() ()
 
__device__ bool isLastBlockHere (__device int *atomicCounter)
 A handy mechanism for determining if a block is the last one to reach a certain point.
 
__host__ void launch (sp::Vec< int, Grid::GridRank > numBlocks, sp::Vec< int, Grid::BlockRank > numThreads, sp::Stream &stream, int dynamicSMem=0)
 Do the kernel launch.
 
__host__ void launch (sp::Vec< int, Grid::GridRank > numBlocks, sp::Vec< int, Grid::BlockRank > numThreads, cudaStream_t stream, int dynamicSMem=0)
 Evil legacy wrapper that allows using bare streams.
 

Additional Inherited Members

Public Types inherited from sp::Kernel< Subclass, Grid >
using Grid = Grid
 

Detailed Description

template<typename Subclass, typename Grid>
class sp::StaticKernel< Subclass, Grid >

In the special case where your Grid definition is fully constexpr, correct behaviour requires you to pass the same values both to launch(s, dynamicSMem) and to the launch-bounds template parameter of launch, which is tedious and error-prone.

In such cases, derive your kernel from StaticKernel instead. You can then simply call launch(stream), and the launch bounds default to values taken from Grid.

Note that you may still want to pass LaunchBounds explicitly if you need a value for MinBlocksPerSM other than the default of 1.
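The idea can be illustrated with a minimal host-only sketch. It does not use the real Speclib headers: LaunchBounds, Kernel, StaticKernel, MyGrid, MyKernel and LaunchRecord below are hypothetical stand-ins written for this example, the stream argument is omitted, and numBlocks() on the grid is an assumed accessor. Instead of launching a CUDA kernel, launch() returns a record of the configuration it would have used, so the duplication in the base class and its removal in StaticKernel are both visible.

```cpp
#include <cassert>

// Hypothetical stand-in for sp::LaunchBounds (not the real Speclib type).
template <int MaxThreads, int MinBlocksPerSM>
struct LaunchBounds {
    static constexpr int maxThreads = MaxThreads;
    static constexpr int minBlocksPerSM = MinBlocksPerSM;
};

// A fully constexpr grid: every dimension is known at compile time.
// numBlocks() is an assumption made for this sketch.
struct MyGrid {
    static constexpr int numBlocks() { return 64; }
    static constexpr int threadsPerBlock() { return 256; }
};

// What a launch would have used; replaces the actual CUDA launch here.
struct LaunchRecord {
    int numBlocks;
    int numThreads;
    int maxThreads;      // from the launch-bounds template parameter
    int minBlocksPerSM;  // from the launch-bounds template parameter
    int dynamicSMem;
};

// Mock base class: the caller states the thread count twice, once as a
// runtime argument and once inside the launch-bounds template parameter.
template <typename Subclass, typename Grid>
struct Kernel {
    template <typename LaunchBoundsT>
    LaunchRecord launch(int numBlocks, int numThreads, int dynamicSMem = 0) {
        return {numBlocks, numThreads, LaunchBoundsT::maxThreads,
                LaunchBoundsT::minBlocksPerSM, dynamicSMem};
    }
};

// Mock StaticKernel: both the runtime arguments and the default launch
// bounds are derived from the constexpr Grid, so they cannot disagree.
template <typename Subclass, typename Grid>
struct StaticKernel : Kernel<Subclass, Grid> {
    template <typename LaunchBoundsT = LaunchBounds<Grid::threadsPerBlock(), 1>>
    LaunchRecord launch(int dynamicSMem = 0) {
        return Kernel<Subclass, Grid>::template launch<LaunchBoundsT>(
            Grid::numBlocks(), Grid::threadsPerBlock(), dynamicSMem);
    }
};

// CRTP-style user kernel, mirroring the Subclass parameter in the reference.
struct MyKernel : StaticKernel<MyKernel, MyGrid> {};
```

With these mock types, MyKernel{}.launch() picks up 256 threads per block for both the launch and the launch bounds, while MyKernel{}.launch<LaunchBounds<256, 2>>() overrides only MinBlocksPerSM.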