tf::cudaScopedPerThreadStream class

class that provides an RAII-style guard for stream acquisition: the stream is acquired on construction and released on destruction

Sample usage:

{
  tf::cudaScopedPerThreadStream stream(1);  // acquires a stream on device 1

  // use stream as a normal cuda stream (cudaStream_t)
  cudaStreamWaitEvent(stream, ...);

}  // leaving the scope releases the stream back to the pool on device 1

The scoped per-thread stream is primarily used by tf::Executor to execute CUDA tasks (e.g., tf::cudaFlow, tf::cudaFlowCapturer).

cudaScopedPerThreadStream is non-copyable. It can be move-constructed, but both copy assignment and move assignment are deleted.

Constructors, destructors, conversion operators

cudaScopedPerThreadStream(int device) explicit
constructs a scoped stream under the given device
cudaScopedPerThreadStream()
constructs a scoped stream under the current device
~cudaScopedPerThreadStream()
destructs the scoped stream guard
operator cudaStream_t() const
implicit conversion to the native CUDA stream (cudaStream_t)
cudaScopedPerThreadStream(const cudaScopedPerThreadStream&) deleted
disabled copy constructor
cudaScopedPerThreadStream(cudaScopedPerThreadStream&&) defaulted
default move constructor

Public functions

auto operator=(const cudaScopedPerThreadStream&) -> cudaScopedPerThreadStream& deleted
disabled copy assignment
auto operator=(cudaScopedPerThreadStream&&) -> cudaScopedPerThreadStream& deleted
disabled move assignment
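
The practical effect of the special member functions listed above (copy operations deleted, move constructor defaulted, move assignment deleted) can be checked with type traits. The sketch below uses a hypothetical stand-in type, Guard, that mirrors only those declarations; it is not the Taskflow class and needs neither CUDA nor Taskflow to compile.

```cpp
#include <type_traits>

// Hypothetical stand-in: declares the same special members as the
// listing above, with no stream or pool behind it.
struct Guard {
  Guard() = default;
  Guard(const Guard&) = delete;             // disabled copy constructor
  Guard(Guard&&) = default;                 // default move constructor
  Guard& operator=(const Guard&) = delete;  // disabled copy assignment
  Guard& operator=(Guard&&) = delete;       // deleted move assignment
};

// A guard declared this way can be handed out of a factory function,
// because returning by value only needs the move constructor.
inline Guard make_guard() { return Guard{}; }

static_assert(!std::is_copy_constructible<Guard>::value, "copy construction is disabled");
static_assert(std::is_move_constructible<Guard>::value, "move construction is allowed");
static_assert(!std::is_copy_assignable<Guard>::value, "copy assignment is disabled");
static_assert(!std::is_move_assignable<Guard>::value, "move assignment is disabled");
```

A type declared this way can be returned from a function or moved into a new object, but an existing guard can never be rebound to a different stream.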

Function documentation

tf::cudaScopedPerThreadStream::cudaScopedPerThreadStream(int device) explicit

constructs a scoped stream under the given device

Parameters
device device context of the requested stream

The constructor acquires a stream from a per-thread stream pool.

tf::cudaScopedPerThreadStream::cudaScopedPerThreadStream()

constructs a scoped stream under the current device

The constructor acquires a stream from a per-thread stream pool.

tf::cudaScopedPerThreadStream::~cudaScopedPerThreadStream()

destructs the scoped stream guard

The destructor releases the stream to the per-thread stream pool.
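
The acquire-on-construction, release-on-destruction cycle described above can be modelled in plain C++. This is a minimal sketch under stated assumptions, not Taskflow's implementation: FakeStream, StreamPool, per_thread_pool, and ScopedStream are hypothetical names, an integer stands in for cudaStream_t, and device selection is omitted so the example builds without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cudaStream_t.
using FakeStream = int;

// Hypothetical pool of reusable streams.
class StreamPool {
 public:
  // Hand out an idle stream if one exists; otherwise "create" a new one.
  FakeStream acquire() {
    if (!idle_.empty()) {
      FakeStream s = idle_.back();
      idle_.pop_back();
      return s;
    }
    return next_id_++;
  }

  // Put a stream back on the idle list for later reuse.
  void release(FakeStream s) { idle_.push_back(s); }

  std::size_t idle_count() const { return idle_.size(); }

 private:
  std::vector<FakeStream> idle_;
  FakeStream next_id_ = 1;
};

// One pool per thread, so acquire/release need no locking.
inline StreamPool& per_thread_pool() {
  thread_local StreamPool pool;
  return pool;
}

// RAII guard: acquires in the constructor, releases in the destructor.
class ScopedStream {
 public:
  ScopedStream() : stream_(per_thread_pool().acquire()) {}
  ~ScopedStream() { per_thread_pool().release(stream_); }

  ScopedStream(const ScopedStream&) = delete;
  ScopedStream& operator=(const ScopedStream&) = delete;

  // Implicit conversion to the underlying handle.
  operator FakeStream() const { return stream_; }

 private:
  FakeStream stream_;
};

// Returns true if a stream released at scope exit is handed out again.
inline bool released_stream_is_reused() {
  FakeStream first = 0;
  {
    ScopedStream s;    // acquires from this thread's pool
    first = s;         // implicit conversion to the handle
  }                    // destructor returns the stream to the pool
  ScopedStream again;  // picks up the stream that was just released
  return static_cast<FakeStream>(again) == first;
}
```

Keeping the pool thread-local is what lets this model skip synchronization; each thread only ever touches its own idle list.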