Queue

You need a queue to schedule the execution of any computation or data copying on a device.

The queue construction requires specifying:

  • Device

  • Context targeting the device

  • Additional properties, such as:

      • Whether profiling information should be collected

      • Whether submitted tasks are executed in the order in which they are submitted

The dpctl.SyclQueue class represents a queue and abstracts the sycl::queue SYCL runtime class.

Types of Queues

SYCL has a task-based execution model. The order in which a SYCL runtime executes a task on a target device is determined by the set of events that must complete before the task is allowed to execute.

Submission of a task returns an event that you can use to further grow the graph of computational tasks. A SYCL queue stores the needed data to manage the scheduling operations.

There are two types of queues:

  • Out-of-order. This is the default unless specified otherwise during the construction of a queue. A SYCL runtime executes tasks whose dependencies are met in an unspecified order, and some of the tasks may be executed concurrently.

  • In-order. You can create a SYCL queue that requires the runtime to execute tasks in the order in which they are submitted. Tasks submitted to such a queue are never executed concurrently.
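The difference between the two queue types can be illustrated with a toy model that does not use SYCL at all. The sketch below uses only the Python standard library; the task names and the helper functions out_of_order_waves and in_order_waves are hypothetical and only mimic the scheduling rules described above. Tasks that appear in the same wave have all their dependencies met and could run concurrently.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks whose
# events must complete before it is allowed to start.
deps = {
    "copy_a": set(),
    "copy_b": set(),
    "kernel": {"copy_a", "copy_b"},
    "copy_back": {"kernel"},
}
submission_order = ["copy_a", "copy_b", "kernel", "copy_back"]


def out_of_order_waves(deps):
    """Group tasks into waves. Tasks in one wave have all dependencies
    met, so an out-of-order queue may run them in any order or at once."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


def in_order_waves(submission_order):
    """An in-order queue runs one task at a time, in submission order."""
    return [[name] for name in submission_order]


ooo = out_of_order_waves(deps)
ino = in_order_waves(submission_order)
print("out-of-order:", ooo)
print("in-order:    ", ino)
```

In this model the two independent copies share a wave under out-of-order scheduling, while the in-order schedule serializes every task even though copy_a and copy_b do not depend on each other.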

Creating a New Queue

dpctl.SyclQueue(ctx, dev, property=None) creates a new queue instance for the given compatible context and device.

To create an in-order queue, set the property keyword parameter to "in_order".

To dynamically collect task execution statistics in the returned event once the associated task completes, set the property keyword parameter to "enable_profiling".
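As a minimal sketch of the property keyword, the snippet below creates a default queue and a queue that is both in-order and profiling-enabled, then reads back the is_in_order and has_enable_profiling attributes. The helper name describe_queue is hypothetical, and the import guard and None fallback are assumptions added only so that the snippet degrades gracefully on a machine without dpctl or without a usable SYCL device.

```python
try:
    import dpctl
except ImportError:  # dpctl is not installed on this machine
    dpctl = None


def describe_queue(prop=None):
    """Hypothetical helper: return (is_in_order, has_enable_profiling)
    for a queue on the default device, or None if no queue can be made."""
    if dpctl is None:
        return None
    try:
        q = dpctl.SyclQueue(property=prop)
    except dpctl.SyclQueueCreationError:
        return None  # no usable SYCL device was found
    return (q.is_in_order, q.has_enable_profiling)


default_flags = describe_queue()
ordered_flags = describe_queue(("in_order", "enable_profiling"))
print("default queue:", default_flags)
print("in-order, profiling queue:", ordered_flags)
```

Passing a tuple of property names requests several properties at once; a single string requests just one.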

Constructing SyclQueue from context and device
import dpctl


def create_queue_from_subdevice_multidevice_context():
    """
    Create a queue from a sub-device.
    """
    cpu_d = dpctl.SyclDevice("opencl:cpu:0")
    try:
        # Partition the CPU device into sub-devices of equal size
        sub_devs = cpu_d.create_sub_devices(partition=2)
    except dpctl.SyclSubDeviceCreationError:
        print("Could not create sub device.")
        print(f"{cpu_d} has {cpu_d.max_compute_units} compute units")
        return
    # The context spans all sub-devices; the queue targets the first one
    ctx = dpctl.SyclContext(sub_devs)
    q = dpctl.SyclQueue(ctx, sub_devs[0], property="enable_profiling")
    print(
        "Number of devices in SyclContext associated with the queue: ",
        q.sycl_context.device_count,
    )

A possible output for the Constructing SyclQueue from context and device example:

INFO: Executing example create_queue_from_subdevice_multidevice_context
Could not create sub device.
<dpctl.SyclDevice [backend_type.opencl, device_type.cpu,  Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz] at 0x7f110e1bdbf0> has 2 compute units
INFO: ===========================

When a context is not specified, the sycl::queue constructor from a device instance is called. Instead of an instance of dpctl.SyclDevice, the argument dev can be a valid filter selector string. In this case, the sycl::queue constructor with the corresponding sycl::ext::oneapi::filter_selector is called.

Constructing SyclQueue from filter selector
import dpctl


def create_queue_from_filter_selector():
    """Create queue for a GPU device or,
    if it is not available, for a CPU device.

    Create in-order queue with profiling enabled.
    """
    q = dpctl.SyclQueue("gpu,cpu", property=("in_order", "enable_profiling"))
    print("Queue {} is in order: {}".format(q, q.is_in_order))
    # display the device used
    print("Device targeted by the queue:")
    q.sycl_device.print_device_info()

A possible output for the Constructing SyclQueue from filter selector example:

INFO: Executing example create_queue_from_filter_selector
Queue <dpctl.SyclQueue at 0x7f4006692700, property=['in_order', 'enable_profiling']> is in order: True
Device targeted by the queue:
    Name            Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
    Driver version  2023.16.7.0.21_160000
    Vendor          Intel(R) Corporation
    Filter string   opencl:cpu:0

INFO: ===========================

Profiling a Task Submitted to a Queue

The result of scheduling the execution of a task on a queue is an event. You can use an event for several purposes:

  • Query for the status of the task execution

  • Order execution of future tasks after it is completed

  • Wait for execution to complete

  • Carry information to profile the task execution

The profiling information is only populated if the queue used is created with the enable_profiling property and only becomes available after the task execution is complete.
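The event workflow above can be sketched with an empty barrier task, which is enough to obtain an event without writing a kernel. This is an illustrative sketch rather than a recommended measurement method: the helper name barrier_event_times is hypothetical, and the import guard and None fallback are assumptions that let the snippet run on a machine without dpctl or without a usable SYCL device.

```python
try:
    import dpctl
except ImportError:  # dpctl is not installed on this machine
    dpctl = None


def barrier_event_times():
    """Hypothetical helper: submit an empty barrier task and return its
    (start, end) profiling timestamps, or None if that is not possible."""
    if dpctl is None:
        return None
    try:
        q = dpctl.SyclQueue(property="enable_profiling")
    except dpctl.SyclQueueCreationError:
        return None  # no usable SYCL device was found
    ev = q.submit_barrier()  # submitting a task returns a dpctl.SyclEvent
    ev.wait()  # profiling data is available only after completion
    return (ev.profiling_info_start, ev.profiling_info_end)


times = barrier_event_times()
print("barrier start/end timestamps:", times)
```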

The dpctl.SyclTimer class implements a Python context manager. You can use this context manager to collect cumulative profiling information for all the tasks submitted to the queue of interest by functions executed within the context:

Example of timing execution
import dpctl
import dpctl.tensor as dpt

q = dpctl.SyclQueue(property="enable_profiling")
timer_ctx = dpctl.SyclTimer()
with timer_ctx(q):
    X = dpt.arange(10**6, dtype=float, sycl_queue=q)

host_dt, device_dt = timer_ctx.dt

The timer leverages the oneAPI enqueue_barrier SYCL extension: it submits a barrier at context entrance and another at context exit, and records the associated events. The elapsed device time is computed as e_exit.profiling_info_start - e_enter.profiling_info_end.
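The elapsed-time formula can be checked with a small worked example. The sketch below does not touch a device: FakeEvent is a hypothetical stand-in for an event object, the timestamps are made up, and they are assumed to be in nanoseconds, the unit in which SYCL reports profiling timestamps.

```python
from dataclasses import dataclass


@dataclass
class FakeEvent:
    """Hypothetical stand-in for an event carrying profiling timestamps."""

    profiling_info_start: int  # nanoseconds (assumed unit)
    profiling_info_end: int  # nanoseconds (assumed unit)


def elapsed_device_ns(e_enter, e_exit):
    """Time between the end of the entry barrier and the start of the
    exit barrier, following the formula in the text."""
    return e_exit.profiling_info_start - e_enter.profiling_info_end


# Made-up timestamps for the two barrier events
e_enter = FakeEvent(profiling_info_start=1_000_000, profiling_info_end=1_002_000)
e_exit = FakeEvent(profiling_info_start=4_502_000, profiling_info_end=4_503_000)

device_ns = elapsed_device_ns(e_enter, e_exit)
print(device_ns)  # 3500000
```

Measuring from the end of the entry barrier to the start of the exit barrier keeps the barriers' own execution time out of the reported interval.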