Speclib  0.1.2
The library for writing better CUDA libraries
sp::NomadicTensor< TensorType, Args > Class Template Reference

Represents a TensorLike that can be synchronised between the CPU and various GPUs.

#include <NomadicTensor.hpp>

Public Types

using ValueType = typename TensorType::ValueType
 The type of each element in the underlying TensorLike.
 
using DeviceTensor = typename TensorType::template InValueType< __device const ValueType >
 Type describing a device-resident immutable view of the underlying TensorLike.
 
using MutableDeviceTensor = typename TensorType::template InValueType< __device ValueType >
 Type describing a device-resident mutable view of the underlying TensorLike.
 
using HostTensor = typename TensorType::template InValueType< const ValueType >
 Type describing a host-resident immutable view of the underlying TensorLike.
 
using MutableHostTensor = typename TensorType::template InValueType< ValueType >
 Type describing a host-resident mutable view of the underlying TensorLike.
 

Public Member Functions

 NomadicTensor ()=default
 Create an uninitialised NomadicTensor.
 
 NomadicTensor (const TensorType, Args ... args)
 Construct a NomadicTensor.
 
void setMemoryType (HostMemoryType hostFlags)
 Reinitialise the buffer with new flags.
 
void reconfigure (Args... newArgs)
 Change the logical size of the underlying buffer.
 
void reset ()
 Forget the contents of the buffer so it may be reused for something else.
 
DeviceTensor deviceTensor (sp::Stream &s, bool move=false)
 Get an immutable copy of the TensorLike on the device associated with stream s.
 
MutableDeviceTensor mutableDeviceTensor (sp::Stream &s, bool move=false)
 Get a mutable copy of the TensorLike on the device associated with stream s.
 
HostTensor hostTensor (sp::Stream &s, bool move=false)
 Get an immutable copy of the TensorLike on the host.
 
MutableHostTensor mutableHostTensor (sp::Stream &s, bool move=false)
 Get a mutable copy of the TensorLike on the host.
 
MutableHostTensor mutableHostTensor ()
 A special version of mutableHostTensor() for doing host-side initialisation of the buffer.
 

Static Public Attributes

constexpr static int Rank = TensorType::Rank
 

Protected Attributes

std::shared_ptr< sp::NomadicBuffer< ValueType > > data
 
std::tuple< Args... > args
 

Detailed Description

template<typename TensorType, typename... Args>
class sp::NomadicTensor< TensorType, Args >

Represents a TensorLike that can be synchronised between the CPU and various GPUs.

This is conceptually very similar to a NomadicBuffer, except instead of a bare buffer you've got a specific TensorLike.

Example:

int len = 1024;
auto myData = sp::NomadicTensor{sp::Vector<int>{}, len};
auto stream = sp::Device::getActive(); // Get a stream to enqueue actions
auto hostSide = myData.mutableHostTensor(stream); // Get a host-modifiable view of myData
// Do some modifications
for(int i = 0; i < len; i++) {
    hostSide[i] = foo(i);
}
auto deviceSide = myData.mutableDeviceTensor(stream); // Get a device-side view, transferring the data
myDeviceFunction(deviceSide); // Do some computation
auto result = myData.hostTensor(stream); // Copy the results back
See also
NomadicBuffer

Member Typedef Documentation

◆ DeviceTensor

template<typename TensorType , typename... Args>
using sp::NomadicTensor< TensorType, Args >::DeviceTensor = typename TensorType::template InValueType<__device const ValueType>

Type describing a device-resident immutable view of the underlying TensorLike

◆ HostTensor

template<typename TensorType , typename... Args>
using sp::NomadicTensor< TensorType, Args >::HostTensor = typename TensorType::template InValueType<const ValueType>

Type describing a host-resident immutable view of the underlying TensorLike

◆ MutableDeviceTensor

template<typename TensorType , typename... Args>
using sp::NomadicTensor< TensorType, Args >::MutableDeviceTensor = typename TensorType::template InValueType<__device ValueType>

Type describing a device-resident mutable view of the underlying TensorLike

◆ MutableHostTensor

template<typename TensorType , typename... Args>
using sp::NomadicTensor< TensorType, Args >::MutableHostTensor = typename TensorType::template InValueType<ValueType>

Type describing a host-resident mutable view of the underlying TensorLike

◆ ValueType

template<typename TensorType , typename... Args>
using sp::NomadicTensor< TensorType, Args >::ValueType = typename TensorType::ValueType

The type of each element in the underlying TensorLike

Constructor & Destructor Documentation

◆ NomadicTensor() [1/2]

template<typename TensorType , typename... Args>
sp::NomadicTensor< TensorType, Args >::NomadicTensor ( )
default

Create an uninitialised NomadicTensor.

◆ NomadicTensor() [2/2]

template<typename TensorType , typename... Args>
sp::NomadicTensor< TensorType, Args >::NomadicTensor ( const  TensorType,
Args ...  args 
)

Construct a NomadicTensor

This operation does not allocate any buffers for the underlying TensorLike. This merely initialises the buffer state machine and computes the size. Actual allocation occurs the first time you ask for a concrete view on some device (or the host).

The slightly odd signature of this constructor is due to the nonexistence of partial class template argument deduction in C++17. Typical usage will pass a default-constructed TensorLike of the desired type as the first argument, followed by the constructor arguments that should be used when constructing actual instances.

Example

// A NomadicTensor<sp::Vector> (tparam deduced) that will construct `sp::Vector<int>{42}` when necessary.
auto tensor = sp::NomadicTensor{ // `tensor` is a placeholder name
    sp::Vector<int>{},
    42
};
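
To illustrate why a type-only first argument helps, here is a minimal, self-contained sketch of the same pattern. `ToyVector` and `ToyNomadic` are hypothetical stand-ins written for this illustration, not Speclib types, and the real constructor may differ in detail. In C++17, supplying any template argument explicitly disables deduction of the rest, so passing a default-constructed object lets the compiler deduce both the tensor type and the argument pack.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a TensorLike; not sp::Vector.
template<typename T>
struct ToyVector {
    using ValueType = T;
    int length = 0;
    ToyVector() = default;
    explicit ToyVector(int n) : length(n) {}
};

// Hypothetical stand-in for the wrapper; not sp::NomadicTensor.
template<typename TensorType, typename... Args>
class ToyNomadic {
public:
    // The first parameter is used only for its type, so that class template
    // argument deduction can infer TensorType along with Args.
    ToyNomadic(const TensorType, Args... a) : args(std::move(a)...) {}

    // Build a concrete TensorLike from the stored constructor arguments.
    TensorType make() const {
        return std::apply([](const auto&... a) { return TensorType(a...); }, args);
    }

private:
    std::tuple<Args...> args;
};

// Deduces ToyNomadic<ToyVector<int>, int> from the arguments alone.
inline int ctadDemo() {
    ToyNomadic t{ToyVector<int>{}, 42};
    static_assert(std::is_same_v<decltype(t), ToyNomadic<ToyVector<int>, int>>);
    return t.make().length;
}
```

Calling `ctadDemo()` builds a `ToyVector<int>` from the stored argument on demand, which mirrors the deferred construction described above.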

Member Function Documentation

◆ deviceTensor()

template<typename TensorType , typename... Args>
DeviceTensor sp::NomadicTensor< TensorType, Args >::deviceTensor ( sp::Stream s,
bool  move = false 
)

Get an immutable copy of the TensorLike on the device associated with stream s.

If the target device does not already have an up-to-date copy of the buffer, a copy is enqueued on stream s to create one. This function always returns immediately, yielding a TensorLike backed by the buffer on the target device. It will only be safe to read from the returned object after the enqueued copy (if any) has completed.

Typically, you'll be passing the returned object to a kernel which you can simply queue up on the same stream, guaranteeing that the copy completes before any read.

Parameters
s     The stream to use for any copy, and which identifies the device to synchronise to.
move  If true, deallocate all copies of this buffer on devices except the target (including the host).
See also
NomadicBuffer::deviceTensor()

◆ hostTensor()

template<typename TensorType , typename... Args>
HostTensor sp::NomadicTensor< TensorType, Args >::hostTensor ( sp::Stream s,
bool  move = false 
)

Get an immutable copy of the TensorLike on the host.

If the host does not already have an up-to-date copy of the buffer, a copy is enqueued on stream s to create one. This function always returns immediately, yielding a TensorLike backed by the host buffer. It will only be safe to read from the returned object after the enqueued copy (if any) has completed.

sp::Device::launchHostFunc() provides a convenient and efficient way to enqueue host-side work using the buffer that waits until after the copy. You can queue up further device-side work after the host function to avoid stalling the stream after the host function completes while you launch something else.

Parameters
s     The stream to use for any copy.
move  If true, deallocate all other copies of the object.
See also
sp::NomadicBuffer::deviceTensor()
sp::Device::launchHostFunc()
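
The ordering guarantee described above can be pictured with a toy in-order queue. This is only a conceptual sketch: `ToyStream` is a hypothetical model written for this illustration, it is not `sp::Stream`, and it does not use or imitate the signature of `sp::Device::launchHostFunc()`. It shows that host-side work queued after the copy sees the copied data, while reading immediately after the call returns does not.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical model of an in-order stream; not sp::Stream.
class ToyStream {
public:
    void enqueue(std::function<void()> work) { pending.push(std::move(work)); }

    // Run everything queued so far, in submission order.
    void drain() {
        while (!pending.empty()) {
            pending.front()();
            pending.pop();
        }
    }

private:
    std::queue<std::function<void()>> pending;
};

inline int hostFuncDemo() {
    std::vector<int> device{1, 2, 3};  // Pretend device-side contents.
    std::vector<int> host(3, 0);       // Host buffer, currently stale.
    int sum = -1;

    ToyStream s;
    s.enqueue([&] { host = device; });                      // The enqueued copy.
    s.enqueue([&] { sum = host[0] + host[1] + host[2]; });  // Host work queued after it.

    // Nothing has run yet, so the host buffer is still stale at this point.
    int early = host[0];
    s.drain();
    return early == 0 ? sum : -1;
}
```

In this model the second task plays the role of the host function: because it sits behind the copy in the same queue, it never observes the stale buffer.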

◆ mutableDeviceTensor()

template<typename TensorType , typename... Args>
MutableDeviceTensor sp::NomadicTensor< TensorType, Args >::mutableDeviceTensor ( sp::Stream s,
bool  move = false 
)

Get a mutable copy of the TensorLike on the device associated with stream s.

As a result of this call, the target device will be considered to have modified the object. Any future requests for access to the object on devices other than this one (or the host) will cause a copy from this device.

If the target device does not already have an up-to-date copy of the buffer, a copy is enqueued on stream s to create one. This function always returns immediately, yielding a TensorLike backed by the buffer on the target device. It will only be safe to read from the returned object after the enqueued copy (if any) has completed.

Typically, you'll be passing the returned object to a kernel which you can simply queue up on the same stream, guaranteeing that the copy completes before any read.

Parameters
s     The stream to use for any copy, and which identifies the device to synchronise to.
move  If true, deallocate all copies of this buffer on devices except the target (including the host).
See also
NomadicBuffer::deviceTensor()
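
The bookkeeping described for mutable and immutable views can be sketched as a small state model. `ToyCoherence` and `kHost` are hypothetical names invented for this illustration; the sketch assumes a simple "set of up-to-date locations" scheme and is not a description of how Speclib actually tracks buffer state.

```cpp
#include <cassert>
#include <set>

// Hypothetical model of the bookkeeping described above; not Speclib code.
// Locations are plain ints: kHost for the CPU, 0..N for GPUs.
constexpr int kHost = -1;

class ToyCoherence {
public:
    // Immutable access: copy in only if this location is stale.
    // Returns true if a copy had to be enqueued.
    bool acquireRead(int where) {
        bool needsCopy = !upToDate.empty() && upToDate.count(where) == 0;
        upToDate.insert(where);
        if (needsCopy) ++copies;
        return needsCopy;
    }

    // Mutable access: as above, then treat `where` as the only current copy.
    bool acquireWrite(int where) {
        bool copied = acquireRead(where);
        upToDate = {where};
        return copied;
    }

    int copiesEnqueued() const { return copies; }

private:
    std::set<int> upToDate;  // Locations holding the latest contents.
    int copies = 0;
};

inline int coherenceDemo() {
    ToyCoherence c;
    c.acquireWrite(kHost);  // Host initialises: nothing to copy yet.
    c.acquireWrite(0);      // GPU 0 mutates: host to GPU 0 copy.
    c.acquireRead(0);       // GPU 0 is already current: no copy.
    c.acquireRead(kHost);   // Host is stale: GPU 0 to host copy.
    return c.copiesEnqueued();
}
```

The sequence in `coherenceDemo()` follows the same host, device, host round trip as the class-level example, and the model enqueues a copy only when the requested location is out of date.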

◆ mutableHostTensor() [1/2]

template<typename TensorType , typename... Args>
MutableHostTensor sp::NomadicTensor< TensorType, Args >::mutableHostTensor ( )

A special version of mutableHostTensor() for doing host-side initialisation of the buffer.

This allows you to access the host side of the buffer without using an sp::Stream. This is primarily useful for populating a NomadicTensor from the host before doing anything with the GPU.

See also
NomadicBuffer::mutableHostTensor()

◆ mutableHostTensor() [2/2]

template<typename TensorType , typename... Args>
MutableHostTensor sp::NomadicTensor< TensorType, Args >::mutableHostTensor ( sp::Stream s,
bool  move = false 
)

Get a mutable copy of the TensorLike on the host.

As a result of this call, the host device will be considered to have modified the object. Any future requests for access to the object outside the host will cause a host->device copy.

If the host does not already have an up-to-date copy of the buffer, a copy is enqueued on stream s to create one. This function always returns immediately, yielding a TensorLike backed by the host buffer. It will only be safe to read from the returned object after the enqueued copy (if any) has completed.

sp::Device::launchHostFunc() provides a convenient and efficient way to enqueue host-side work using the buffer that waits until after the copy. You can queue up further device-side work after the host function to avoid stalling the stream after the host function completes while you launch something else.

Parameters
s     The stream to use for any copy.
move  If true, deallocate all other copies of the object.
See also
sp::NomadicBuffer::deviceTensor()
sp::Device::launchHostFunc()

◆ reconfigure()

template<typename TensorType , typename... Args>
void sp::NomadicTensor< TensorType, Args >::reconfigure ( Args...  newArgs)

Change the logical size of the underlying buffer.

The allocations are not changed, but the TensorLike returned by future view calls will reflect construction with the given new set of arguments. Making the result not go out of bounds is left as an exercise for the reader.
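
Since bounds are the caller's responsibility, one possible guard is sketched below. `ToyReconfigurable` and `ToyView` are hypothetical types written for this illustration and assume a one-dimensional tensor whose single argument is its length; the capacity check is an addition of this sketch, not something the text says `reconfigure()` performs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; not Speclib types.
struct ToyView {
    int* data;
    std::size_t length;
};

class ToyReconfigurable {
public:
    explicit ToyReconfigurable(std::size_t n) : storage(n), length(n) {}

    // Change only the logical length. Returns false (and changes nothing)
    // if the request would exceed the existing allocation.
    bool reconfigure(std::size_t newLength) {
        if (newLength > storage.size()) return false;
        length = newLength;
        return true;
    }

    ToyView view() { return ToyView{storage.data(), length}; }
    std::size_t capacity() const { return storage.size(); }

private:
    std::vector<int> storage;  // Allocated once, never resized.
    std::size_t length;        // What future views report.
};

inline bool reconfigureDemo() {
    ToyReconfigurable t(1024);
    int* before = t.view().data;
    bool shrunk = t.reconfigure(256);   // Fits in the allocation.
    bool grown = t.reconfigure(4096);   // Would overrun it, so it is rejected.
    ToyView v = t.view();
    return shrunk && !grown && v.length == 256 && v.data == before && t.capacity() == 1024;
}
```

The view after reconfiguration points at the same storage and only reports a different length, which is the behaviour the paragraph above describes.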

◆ reset()

template<typename TensorType , typename... Args>
void sp::NomadicTensor< TensorType, Args >::reset ( )

Forget the contents of the buffer so it may be reused for something else.

See also
NomadicBuffer::reset().

◆ setMemoryType()

template<typename TensorType , typename... Args>
void sp::NomadicTensor< TensorType, Args >::setMemoryType ( HostMemoryType  hostFlags)

Reinitialise the buffer with new flags.

This will destroy the contents of the buffer.

It's anticipated that this will only be used during initialisation, where speed is less of a concern.