numba_dpex.dpctl_iface.kernel_launch_ops module

class numba_dpex.dpctl_iface.kernel_launch_ops.KernelLaunchOps(lowerer, cres, num_inputs)

Bases: object

Defines a set of functions to launch a SYCL kernel on the “current queue” as defined in the dpctl queue manager.

allocate_kernel_arg_array(num_kernel_args)

Allocates an array to store the LLVM Value for every kernel argument.

Args:

num_kernel_args : The number of kernel arguments for the kernel.

enqueue_kernel_and_copy_back(dim_bounds, sycl_queue_val)

Submits the kernel to the specified queue, waits for it to complete, and then copies any results back to the host.

Args:

dim_bounds : An array of three-element tuples giving the start offset, end offset, and stride (step) for each dimension of the input arrays. Every array in a parfor has the same dimensionality and shape, so the bounds are the same for all of them.

sycl_queue_val : The SYCL queue on which the kernel is submitted.
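To make the shape of dim_bounds concrete, here is a minimal pure-Python sketch. The helper name iteration_counts is hypothetical and not part of numba_dpex, and the sketch assumes the end offset is exclusive, as in Python's range; the source does not say whether it is inclusive.

```python
def iteration_counts(dim_bounds):
    """Return the number of iterations per dimension.

    Each entry of dim_bounds is a (start, end, step) tuple.
    Assumption: `end` is exclusive, mirroring Python's range().
    """
    return [len(range(start, end, step)) for start, end, step in dim_bounds]


# Two dimensions: 0..8 with stride 1, and 0..6 with stride 2.
dim_bounds = [(0, 8, 1), (0, 6, 2)]
counts = iteration_counts(dim_bounds)
```

Because every array in a parfor shares the same shape, a single dim_bounds value like this is enough to describe the iteration space for all of them.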

free_queue(sycl_queue_val)

Frees the DPCTLSyclQueueRef pointer that was used to launch the kernels.

Args:

sycl_queue_val : The SYCL queue pointer to be freed.

get_current_queue()

Allocates memory on the stack to store the current queue from dpctl.

A SYCL queue is needed to allocate USM memory and to submit a kernel. This function gets the queue returned by the DPCTLQueueMgr_GetCurrentQueue function and stores it on the stack. The queue should be freed with free_queue once the kernel has returned.

Return: An LLVM Value storing the pointer to the SYCL queue returned by DPCTLQueueMgr_GetCurrentQueue.
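The get, submit, free ordering described above can be sketched with a toy stand-in. RecordingLaunchOps and launch are hypothetical names used only for illustration; the class merely records which methods were called and does not generate LLVM IR or touch dpctl.

```python
class RecordingLaunchOps:
    """Toy stand-in (not the real KernelLaunchOps) that records call order."""

    def __init__(self):
        self.calls = []

    def get_current_queue(self):
        self.calls.append("get_current_queue")
        return "queue-handle"  # placeholder for the LLVM Value

    def enqueue_kernel_and_copy_back(self, dim_bounds, sycl_queue_val):
        self.calls.append("enqueue_kernel_and_copy_back")

    def free_queue(self, sycl_queue_val):
        self.calls.append("free_queue")


def launch(ops, dim_bounds):
    """Acquire the queue, submit the kernel, and always free the queue."""
    queue = ops.get_current_queue()
    try:
        ops.enqueue_kernel_and_copy_back(dim_bounds, queue)
    finally:
        # Free the queue even if submission raised.
        ops.free_queue(queue)
    return ops.calls


call_order = launch(RecordingLaunchOps(), [(0, 8, 1)])
```

The try/finally is a design choice of this sketch, not something the source prescribes; it only illustrates that every queue obtained from get_current_queue should be matched by a free_queue call.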

process_kernel_arg(var, llvm_arg, arg_type, index, modified_arrays, sycl_queue_val)

Creates an LLVM Value for each kernel argument.

Args:

var : A kernel argument represented as a Numba type.

llvm_arg : Only used for array arguments; points to the LLVM value previously allocated to store the array argument.

arg_type : The Numba type for the argument.

index : The position of the argument in the list of arguments.

modified_arrays : The list of array arguments that are written to inside the kernel. The list is used to check whether the argument is read-only.

Raises:

NotImplementedError: If an unsupported type of kernel argument is encountered.
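As a rough illustration of how modified_arrays can separate read-only from written arrays, and of when NotImplementedError applies, consider the sketch below. The function describe_kernel_arg, the SUPPORTED_KINDS set, and the string labels are all hypothetical; the real method works on Numba types and LLVM values, which this sketch does not model.

```python
# Hypothetical argument categories, for illustration only.
SUPPORTED_KINDS = {"array", "scalar"}


def describe_kernel_arg(name, kind, modified_arrays):
    """Classify one kernel argument by kind and access mode."""
    if kind not in SUPPORTED_KINDS:
        raise NotImplementedError(f"unsupported kernel argument kind: {kind!r}")
    if kind == "array":
        # An array is read-only unless the kernel writes to it.
        access = "read-write" if name in modified_arrays else "read-only"
        return (name, kind, access)
    # Scalars are passed by value, so no access mode is needed.
    return (name, kind, "by-value")


modified_arrays = ["out"]
described = [
    describe_kernel_arg("a", "array", modified_arrays),
    describe_kernel_arg("out", "array", modified_arrays),
    describe_kernel_arg("n", "scalar", modified_arrays),
]
```

Under this reading, only arrays listed in modified_arrays would need their contents copied back to the host after the kernel finishes; that inference is an assumption, not something the source states.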