cudaRoundRobinCapturing class
class to capture the described graph into a native cudaGraph using a greedy round-robin algorithm on a fixed number of streams
A round-robin capturing algorithm levelizes the user-described graph and assigns streams to nodes in round-robin order, level by level.
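The levelize-then-assign idea described above can be sketched in plain C++ without any CUDA calls. This is only an illustration under stated assumptions, not the class's actual implementation: the `Graph` alias and the `round_robin_assign` function are hypothetical names, a node's level is taken to be its longest distance from any source node, and the stream index is assumed to restart at 0 on every level.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical representation: succ[u] lists the successors of node u.
// The graph is assumed to be a non-empty-or-empty DAG (no cycles).
using Graph = std::vector<std::vector<std::size_t>>;

// Hypothetical helper: returns, for every node, the index of the stream
// it would be assigned to. Not part of the documented class.
std::vector<std::size_t> round_robin_assign(const Graph& succ,
                                            std::size_t num_streams) {
  assert(num_streams > 0);

  const std::size_t n = succ.size();
  std::vector<std::size_t> indegree(n, 0), level(n, 0);
  for (const auto& out : succ) {
    for (std::size_t v : out) ++indegree[v];
  }

  // Levelize with a topological traversal: the level of a node is the
  // length of the longest path from any source node to it.
  std::queue<std::size_t> ready;
  for (std::size_t u = 0; u < n; ++u) {
    if (indegree[u] == 0) ready.push(u);
  }
  std::size_t num_levels = 0;
  while (!ready.empty()) {
    const std::size_t u = ready.front();
    ready.pop();
    num_levels = std::max(num_levels, level[u] + 1);
    for (std::size_t v : succ[u]) {
      level[v] = std::max(level[v], level[u] + 1);
      if (--indegree[v] == 0) ready.push(v);
    }
  }

  // Group nodes by level, keeping node-index order within a level.
  std::vector<std::vector<std::size_t>> levels(num_levels);
  for (std::size_t u = 0; u < n; ++u) levels[level[u]].push_back(u);

  // Assign streams round-robin inside each level (assumption: the
  // counter restarts at stream 0 for every level).
  std::vector<std::size_t> stream(n, 0);
  for (const auto& nodes : levels) {
    for (std::size_t i = 0; i < nodes.size(); ++i) {
      stream[nodes[i]] = i % num_streams;
    }
  }
  return stream;
}
```

For a graph where node 0 fans out to nodes 1, 2, and 3, which all feed node 4, calling `round_robin_assign(g, 2)` places the three middle nodes on streams 0, 1, 0. Nodes on the same level have no dependencies on each other, so spreading them across streams is what exposes concurrency.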
Constructors, destructors, conversion operators
- cudaRoundRobinCapturing()
 - constructs a round-robin optimizer with 4 streams by default
- cudaRoundRobinCapturing(size_t num_streams)
 - constructs a round-robin optimizer with the given number of streams
 
Public functions
- auto num_streams() const -> size_t
 - queries the number of streams used by the optimizer
- void num_streams(size_t n)
 - sets the number of streams used by the optimizer