numba_dpex.core.utils.kernel_launcher module

class numba_dpex.core.utils.kernel_launcher.KernelLaunchIRBuilder(lowerer, cres)

Bases: object

Helper class to build the LLVM IR for the submission of a kernel.

The class generates LLVM IR inside the current LLVM module that is needed for submitting kernels. The LLVM Values that

allocate_kernel_arg_array(num_kernel_args)

Allocates an array to store the LLVM Value for every kernel argument.

Args:

num_kernel_args (int): The number of kernel arguments, which determines the size of the args array to allocate.

Returns: An LLVM IR value pointing to an array to store the kernel arguments.

allocate_kernel_arg_ty_array(num_kernel_args)

Allocates an array to store the LLVM Value for the typenum for every kernel argument.

Args:

num_kernel_args (int): The number of kernel arguments, which determines the size of the typenum array to allocate.

Returns: An LLVM IR value pointing to an array to store the kernel argument typenums as defined in dpctl.

build_arg(val, ty, arg_list, args_ty_list, arg_num)

Stores the kernel arguments and the kernel argument types into arrays that will be passed to DPCTLQueue_SubmitRange.

Args:

val: An LLVM IR Value that will be stored into the arguments array.

ty: A Numba type that will be converted to a DPCTLKernelArgType enum and stored into the argument types array.

arg_list: An LLVM IR Value array that stores the kernel arguments.

args_ty_list: An LLVM IR Value array that stores the DPCTLKernelArgType enum value for each kernel argument.

arg_num: The index position at which arg_list and args_ty_list need to be updated.
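
The bookkeeping behind build_arg and the two allocate_* methods can be pictured as two parallel arrays indexed by arg_num. The sketch below is a plain-Python analogy, not the LLVM IR the class emits: ArgKind is a hypothetical stand-in for DPCTLKernelArgType (its members and values are invented for illustration), and allocate_arg_arrays is a hypothetical helper that merges the two allocation steps.

```python
from enum import IntEnum


class ArgKind(IntEnum):
    """Hypothetical type tags; NOT the real DPCTLKernelArgType values."""

    INT64 = 0
    FLOAT64 = 1
    POINTER = 2


def allocate_arg_arrays(num_kernel_args):
    # Two parallel, fixed-size arrays: one for values, one for type tags.
    return [None] * num_kernel_args, [None] * num_kernel_args


def build_arg(val, kind, arg_list, args_ty_list, arg_num):
    # Write the value and its tag at the same index in both arrays.
    arg_list[arg_num] = val
    args_ty_list[arg_num] = kind


arg_list, args_ty_list = allocate_arg_arrays(2)
build_arg(42, ArgKind.INT64, arg_list, args_ty_list, 0)
build_arg(2.5, ArgKind.FLOAT64, arg_list, args_ty_list, 1)
```

Keeping the values and their tags at matching indices is what allows a single submission call to interpret each argument without further metadata.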

build_array_arg(array_val, array_rank, arg_list, args_ty_list, arg_num)

Creates a list of LLVM Values for an unpacked DpnpNdArray kernel argument.

The steps performed here mirror those in numba_dpex.core.kernel_interface.arg_pack_unpacker._unpack_array_helper.
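
As a rough illustration of what "unpacking" an array argument means, the plain-Python sketch below flattens an array description into a list of scalar arguments. The field order used here (meminfo, parent, item count, item size, data pointer, then shape and strides) is an assumption modeled on Numba's array structure and is not confirmed by this page; flatten_array_arg and its parameters are hypothetical names.

```python
def flatten_array_arg(data_ptr, shape, strides, itemsize):
    """Flatten an array description into a list of scalar kernel arguments.

    Assumed field order (an illustration, not confirmed by the docs):
    meminfo, parent, nitems, itemsize, data, shape..., strides...
    """
    if len(shape) != len(strides):
        raise ValueError("shape and strides must have the same rank")

    # Total number of elements is the product of the extents.
    nitems = 1
    for extent in shape:
        nitems *= extent

    meminfo = None  # placeholder standing in for a null pointer
    parent = None  # placeholder standing in for a null pointer
    return [meminfo, parent, nitems, itemsize, data_ptr, *shape, *strides]


# A 3x4 array of 8-byte items in row-major order.
flat = flatten_array_arg(data_ptr=0x1000, shape=(3, 4), strides=(32, 8), itemsize=8)
```

Under this assumed layout, one array of rank r expands to 5 + 2 * r kernel arguments, which is why the array rank is needed when filling arg_list and args_ty_list.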

build_complex_arg(val, ty, arg_list, args_ty_list, arg_num)

Creates a list of LLVM Values for an unpacked complex kernel argument.
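
A complex value has no single scalar slot in a flat argument list, so it has to be split into components. The sketch below assumes the split is into a real part followed by an imaginary part; that ordering and the helper name flatten_complex_arg are assumptions for illustration, not taken from the implementation.

```python
def flatten_complex_arg(z):
    # A complex value becomes two consecutive floating-point arguments.
    z = complex(z)
    return [z.real, z.imag]


parts = flatten_complex_arg(3 - 4j)
```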

free_queue(sycl_queue_val)

Frees the DPCTLSyclQueueRef pointer that was used to launch the kernels.

Args:

sycl_queue_val: The SYCL queue pointer to be freed.

get_queue(exec_queue)

Allocates memory on the stack to store a DPCTLSyclQueueRef.

Returns: An LLVM Value storing the pointer to the SYCL queue created using the filter string for the Python exec_queue (dpctl.SyclQueue).

submit_sync_kernel(sycl_queue_val, total_kernel_args, arg_list, arg_ty_list, global_range, local_range=None)

Submits the kernel to the specified queue and waits for it to complete.