libSCALE  0.2.0
A modern C++ CUDA API
Device Intrinsics

These APIs provide direct (and usually non-portable) access to hardware features that are either not supported by the compiler (yet), or which manufacturers don't document.

Modules

 Texture Intrinsics
 Plugging some holes in the NVIDIA® texture fetch API, and making it less annoying to use in template code.
 

Detailed Description

These APIs provide direct (and usually non-portable) access to hardware features that are either not supported by the compiler (yet), or which manufacturers don't document.

It's usually not a great idea to use these intrinsics directly, but sometimes they are the only way to get the best performance on a particular platform, or to work around an artificial limitation of the API.

Some of the more exotic features of speclib are implemented using these intrinsics in ways that add some portability.

These APIs should not be considered stable. The functions may fail to work on GPUs that we don't have in the test cluster, and over time we intend to move the useful subset of them into the compiler (as optimisations, idiom-recognised intrinsics, or other methods that are more reliable).