numba_dpex.dpctl_iface.kernel_launch_ops module
- class numba_dpex.dpctl_iface.kernel_launch_ops.KernelLaunchOps(lowerer, cres, num_inputs)
Bases: object
Defines a set of functions to launch a SYCL kernel on the “current queue” as defined in the dpctl queue manager.
- allocate_kernel_arg_array(num_kernel_args)
Allocates an array to store the LLVM Value for every kernel argument.
- Args:
num_kernel_args : The number of kernel arguments for the kernel.
- enqueue_kernel_and_copy_back(dim_bounds, sycl_queue_val)
Submits the kernel to the specified queue, waits for it to complete, and then copies any results back to the host.
- Args:
dim_bounds : An array of three-tuples representing the start offset, end offset, and stride (step) for each dimension of the input arrays. Every array in a parfor has the same dimensionality and shape, so the bounds are the same for all of them.
sycl_queue_val : The SYCL queue to which the kernel is submitted.
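To illustrate the layout of dim_bounds, the following minimal sketch derives a per-dimension iteration count from (start, stop, step) triples. The helper name work_items_per_dim is hypothetical and not part of numba_dpex. The sketch also assumes the end offset is exclusive, as in Python's range(); the documentation above does not say whether it is inclusive or exclusive.

```python
def work_items_per_dim(dim_bounds):
    """Return the number of iterations in each dimension.

    Assumption: each entry is (start, stop, step) with an exclusive stop,
    like Python's range(). This is illustrative, not documented behavior.
    """
    counts = []
    for start, stop, step in dim_bounds:
        if step <= 0:
            raise ValueError("step must be positive")
        # range() handles the rounding when the extent is not a multiple of step.
        counts.append(len(range(start, stop, step)))
    return counts


# Hypothetical bounds for a two-dimensional parfor.
bounds = [(0, 8, 1), (0, 10, 2)]
print(work_items_per_dim(bounds))
```

Because every array in a parfor shares the same shape, a single list of triples like this would be enough to describe the iteration space for all of them.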
- free_queue(sycl_queue_val)
Frees the DPCTLSyclQueueRef pointer that was used to launch the kernels.
- Args:
sycl_queue_val : The SYCL queue pointer to be freed.
- get_current_queue()
Allocates memory on the stack to store the current queue from dpctl.
A SYCL queue is needed to allocate USM memory and submit a kernel. This function gets the queue returned by the DPCTLQueueMgr_GetCurrentQueue function and stores it on the stack. The queue should be freed properly after returning from the kernel.
- Return:
An LLVM Value storing the pointer to the SYCL queue returned by DPCTLQueueMgr_GetCurrentQueue.
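The requirement that the queue be freed after the kernel returns amounts to pairing every get_current_queue with a free_queue. The sketch below shows that ordering only. FakeQueueManager and launch_with_queue are hypothetical names invented for illustration. The real KernelLaunchOps methods emit LLVM IR during lowering rather than running as ordinary Python at kernel launch time, so this is an analogy for the lifetime rule, not the library's implementation.

```python
class FakeQueueManager:
    """Hypothetical stand-in for a queue manager; not a dpctl or numba_dpex API."""

    def __init__(self):
        self.live_refs = 0  # number of queue references not yet freed

    def get_current_queue(self):
        self.live_refs += 1
        return object()  # opaque handle standing in for a queue reference

    def free_queue(self, queue_ref):
        self.live_refs -= 1


def launch_with_queue(manager, launch):
    """Acquire a queue, run the launch callable, and always free the queue."""
    queue_ref = manager.get_current_queue()
    try:
        return launch(queue_ref)
    finally:
        # Runs whether the launch succeeds or raises, so no reference leaks.
        manager.free_queue(queue_ref)


manager = FakeQueueManager()
result = launch_with_queue(manager, lambda queue_ref: "kernel finished")
print(result, manager.live_refs)
```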
- process_kernel_arg(var, llvm_arg, arg_type, index, modified_arrays, sycl_queue_val)
Creates an LLVM Value for each kernel argument.
- Args:
var : A kernel argument represented as a Numba type.
llvm_arg : Only used for array arguments; points to the LLVM value previously allocated to store the array argument.
arg_type : The Numba type for the argument.
index : The position of the argument in the list of arguments.
modified_arrays : The list of array arguments that are written to inside the kernel. The list is used to check whether an argument is read-only.
- Raises:
NotImplementedError : If an unsupported type of kernel argument is encountered.
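The role of modified_arrays and the NotImplementedError case can be sketched in plain Python. The function classify_kernel_arg and the string kinds "scalar" and "array" are hypothetical simplifications. The actual method inspects Numba types and builds LLVM values, which this sketch does not attempt to reproduce.

```python
def classify_kernel_arg(name, kind, modified_arrays):
    """Decide how a kernel argument would be treated.

    Assumption: 'kind' is a simplified label ("scalar" or "array") standing
    in for the Numba type that the real method receives.
    """
    if kind == "scalar":
        return "by-value"
    if kind == "array":
        # An array is read-only unless the kernel writes to it.
        return "read-write" if name in modified_arrays else "read-only"
    raise NotImplementedError(f"unsupported kernel argument kind: {kind!r}")


modified = ["out"]
for arg_name, arg_kind in [("n", "scalar"), ("a", "array"), ("out", "array")]:
    print(arg_name, classify_kernel_arg(arg_name, arg_kind, modified))
```

Knowing which arrays are read-only matters for a launcher like this one because only the arrays that the kernel writes to need to be copied back to the host.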