clang-nvcc
The clang-nvcc tool aims to present an interface that is compatible with nvcc, in much the same way as clang-cl aims to present an interface that is compatible with cl.
This is useful for building projects that use the CUDA language but do not support using clang++ to build device code without modification.
constexpr debugging header
The <constexpr_debug> header contains tools for debugging constexpr code.
cedbg::abort()
This function can be called from a constexpr function to abort the compilation with a compiler error. This is useful for implementing compile-time assertions on the local variables of constexpr code (to which static_assert cannot be applied).
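As a rough sketch of how such a compile-time assertion might look, the following checks a local variable inside a constexpr function. The name checked_index is hypothetical, and the fallback definition of cedbg::abort() is only a stand-in (an assumption, not the real implementation) so that the snippet also compiles on toolchains that do not ship <constexpr_debug>.

```cpp
#if __has_include(<constexpr_debug>)
#include <constexpr_debug>
#else
// Stand-in (assumption): used only when <constexpr_debug> is unavailable.
// It is deliberately NOT constexpr, so reaching it during constant
// evaluation is an error, loosely mimicking a compile-time abort.
namespace cedbg { inline void abort() {} }
#endif

// Hypothetical helper: validates a value computed inside the function.
constexpr int checked_index(int i, int size)
{
    int wrapped = i % size;   // local variable: static_assert cannot inspect it
    if (wrapped < 0) {
        cedbg::abort();       // only reached when the remainder is negative
    }
    return wrapped;
}

static_assert(checked_index(6, 4) == 2, "6 wraps to 2");
// constexpr int bad = checked_index(-1, 4);  // would fail to compile
```

Only the failing path reaches cedbg::abort(), so valid calls remain usable in constant expressions.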
cedbg::Out
The cedbg::Out class is a std::ostream-like class for printing to the compiler’s standard output from a constexpr function. It supports C-style strings and integers.
For example:
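#include <constexpr_debug>

constexpr int meow()
{
    // Printed by the compiler while it evaluates meow().
    cedbg::Out() << "Meow :D\n";
    return 0;
}

// Forces meow() to be evaluated at compile time.
static_assert(meow() == 0);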
Unlike std::cout, there are no global instances of this class. All instances of this class write to the same underlying stream.
The cedbg::putc and cedbg::puts functions exist to provide constexpr equivalents of their C counterparts.
constexpr profiler
The compilation time needed to execute C++11 constexpr functions can be profiled using -fconstexpr-profile=profile.cg. The resulting profile is a callgrind format profile that can be analyzed with existing callgrind profile viewers such as kcachegrind.
The profile shows execution time (in ms) and constexpr steps (as limited by clang++’s -fconstexpr-steps= argument).
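To try the profiler, a deliberately expensive constexpr function is handy. The sketch below is an illustrative workload only (fib is not part of the toolchain). Compiling it with -fconstexpr-profile=profile.cg as described above should yield a profile in which fib accounts for most of the constexpr steps.

```cpp
// Naive recursive Fibonacci, written in C++11 style (single return statement).
// The exponential call tree makes it costly to evaluate at compile time,
// which gives the profiler something visible to report.
constexpr unsigned long long fib(unsigned n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// Forces the evaluation to happen during compilation.
static_assert(fib(20) == 6765, "fib(20) evaluated at compile time");
```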
-fmax-data-local-size and -fmax-data-global-size
These flags produce an error if any local variable or global variable is too large. This can detect large variables that might cause stack overflow or that might be a consequence of unintended constexpr structures being instantiated in the generated code.
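The kind of code these flags are meant to catch might look like the following sketch. All names are hypothetical, and the text above does not state how the limits are specified, so no particular threshold is assumed here.

```cpp
// Hypothetical lookup table: 64 Ki ints (256 KiB where int is 4 bytes).
struct SampleTable {
    int samples[64 * 1024];
};

// Large global variable: a candidate for -fmax-data-global-size.
SampleTable g_table = {};

// Large local variable: the by-value copy places the whole table on the
// stack, making it a candidate for -fmax-data-local-size.
int first_sample()
{
    SampleTable copy = g_table;   // easy to write by accident
    return copy.samples[0];
}

static_assert(sizeof(SampleTable) == 64 * 1024 * sizeof(int),
              "table occupies one int per sample");
```

Taking the table by const reference instead of copying it would avoid the large local.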
Compiled GPU symbols can be converted to a hash to provide for a more compact binary. This also works around a bug in some GPU tools where very long symbol names (such as can arise from heavily templated code) cause the tool to crash.
Symbol hashing is enabled by default in SCALE mode (which is itself the default), but can be disabled with -fcuda-disable-symbol-hashing. In non-SCALE mode, it can be enabled with -fno-cuda-disable-symbol-hashing.