numba_dpex.core.utils.kernel_launcher module
- class numba_dpex.core.utils.kernel_launcher.KernelLaunchIRBuilder(lowerer, cres)
Bases: object
Helper class to build the LLVM IR for the submission of a kernel.
The class generates LLVM IR inside the current LLVM module that is needed for submitting kernels. The LLVM Values that
- allocate_kernel_arg_array(num_kernel_args)
Allocates an array to store the LLVM Value for every kernel argument.
- Args:
num_kernel_args (int): The number of kernel arguments, which determines the size of the args array to allocate.
Returns: An LLVM IR value pointing to an array that stores the kernel arguments.
- allocate_kernel_arg_ty_array(num_kernel_args)
Allocates an array to store the LLVM Value for the typenum for every kernel argument.
- Args:
num_kernel_args (int): The number of kernel arguments, which determines the size of the argument types array to allocate.
Returns: An LLVM IR value pointing to an array that stores the kernel argument typenums as defined in dpctl.
- build_arg(val, ty, arg_list, args_ty_list, arg_num)
Stores the kernel arguments and the kernel argument types into arrays that will be passed to DPCTLQueue_SubmitRange.
- Args:
val: An LLVM IR Value that will be stored into the arguments array.
ty: A Numba type that will be converted to a DPCTLKernelArgType enum and stored into the argument types array.
arg_list: An LLVM IR Value array that stores the kernel arguments.
args_ty_list: An LLVM IR Value array that stores the DPCTLKernelArgType enum value for each kernel argument.
arg_num: The index position at which arg_list and args_ty_list need to be updated.
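The parallel-array layout described above can be illustrated with a minimal pure-Python sketch. It is not the numba_dpex implementation: plain lists stand in for the LLVM IR arrays, and the ArgType enum with its members and numeric values is a hypothetical stand-in for DPCTLKernelArgType.

```python
from enum import IntEnum


class ArgType(IntEnum):
    """Hypothetical stand-in for the DPCTLKernelArgType enum (values are made up)."""
    INT64 = 0
    FLOAT64 = 1


def allocate_arg_arrays(num_kernel_args):
    """Model of the two allocations: one slot per kernel argument in each array."""
    arg_list = [None] * num_kernel_args
    args_ty_list = [None] * num_kernel_args
    return arg_list, args_ty_list


def build_arg(val, ty, arg_list, args_ty_list, arg_num):
    """Store the value and its type tag at the same index in both arrays."""
    arg_list[arg_num] = val
    args_ty_list[arg_num] = ty


# Two kernel arguments: an integer and a float.
args, arg_tys = allocate_arg_arrays(2)
build_arg(42, ArgType.INT64, args, arg_tys, 0)
build_arg(3.5, ArgType.FLOAT64, args, arg_tys, 1)
```

The point of the sketch is only that index arg_num addresses the same argument in both arrays, so the value array and the type array stay aligned when they are handed to the submit call.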
- build_array_arg(array_val, array_rank, arg_list, args_ty_list, arg_num)
Creates a list of LLVM Values for an unpacked DpnpNdArray kernel argument.
The steps performed here are the same as in numba_dpex.core.kernel_interface.arg_pack_unpacker._unpack_array_helper
- build_complex_arg(val, ty, arg_list, args_ty_list, arg_num)
Creates a list of LLVM Values for an unpacked complex kernel argument.
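As a rough illustration of what unpacking a complex argument could look like, the sketch below splits a Python complex value into its real and imaginary parts and stores them in consecutive slots. This layout is an assumption for illustration, not confirmed by the text above; the function signature, the "float64" type tag, and the returned next index are all hypothetical.

```python
def build_complex_arg(val, arg_list, args_ty_list, arg_num):
    """Hypothetical model: store real and imaginary parts in two consecutive slots.

    Returns the next free index so the caller can keep appending arguments.
    """
    # Real part goes at arg_num, imaginary part at arg_num + 1.
    arg_list[arg_num] = val.real
    args_ty_list[arg_num] = "float64"  # assumed type tag, for illustration only
    arg_list[arg_num + 1] = val.imag
    args_ty_list[arg_num + 1] = "float64"
    return arg_num + 2


# One scalar argument followed by one complex argument needs three slots.
args = [None] * 3
arg_tys = [None] * 3
args[0], arg_tys[0] = 7, "int64"
next_index = build_complex_arg(1.5 - 2.0j, args, arg_tys, 1)
```

Under this assumed layout a single complex argument occupies two entries, so the total number of slots to allocate is larger than the number of Python-level kernel arguments.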
- free_queue(sycl_queue_val)
Frees the DPCTLSyclQueueRef pointer that was used to launch the kernels.
- Args:
sycl_queue_val: The SYCL queue pointer to be freed.
- get_queue(exec_queue)
Allocates memory on the stack to store a DPCTLSyclQueueRef.
Returns: An LLVM Value storing the pointer to the SYCL queue created using the filter string for the Python exec_queue (dpctl.SyclQueue).
- submit_sync_kernel(sycl_queue_val, total_kernel_args, arg_list, arg_ty_list, global_range, local_range=None)
Submits the kernel to the specified queue and waits for it to finish.