DPNP integration
Currently, the DPNP backend library is used.
Integration with DPNP backend library
NumPy function calls are replaced with DPNP function calls.
import numpy as np
from numba import njit
import dpctl

@njit
def foo(a):
    return np.sum(a)  # this call will be replaced with DPNP function

a = np.arange(42)

with dpctl.device_context():
    result = foo(a)

print(result)
np.sum(a) will be replaced with dpnp_sum_c<int, int>(…).
Repository map
Code for the integration mostly resides in numba_dpex/dpnp_iface.
Tests reside in numba_dpex/tests/njit_tests/dpnp.
The helper pass resides in numba_dpex/rename_numpy_functions_pass.py.
Architecture
The default Numba compiler pipeline is modified and extended with the DPPYRewriteOverloadedNumPyFunctions pass.
The main work is performed in RewriteNumPyOverloadedFunctions, which is used by the pass.
It rewrites a call to a NumPy function in the following way: np.sum(a) -> numba_dpex.dpnp.sum(a).
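The renaming idea can be illustrated outside of Numba. The sketch below is not the actual pass: it works on the Python AST rather than on Numba IR, and the names REWRITE_MAP, RewriteNumPyCalls and rewrite_source are hypothetical. It only shows how a name map can drive the replacement of np.<name>(...) calls while leaving unmapped calls untouched.

```python
import ast

# Hypothetical name map: NumPy function name -> stub name under numba_dpex.dpnp.
REWRITE_MAP = {"sum": "sum"}


class RewriteNumPyCalls(ast.NodeTransformer):
    """Rewrite np.<name>(...) into numba_dpex.dpnp.<name>(...) for mapped names."""

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "np"
            and func.attr in REWRITE_MAP
        ):
            # Build the replacement attribute chain from source text.
            target = ast.parse(
                f"numba_dpex.dpnp.{REWRITE_MAP[func.attr]}", mode="eval"
            ).body
            node.func = ast.copy_location(target, func)
        return node


def rewrite_source(source):
    """Return the source with mapped NumPy calls renamed (needs Python 3.9+)."""
    tree = RewriteNumPyCalls().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


rewritten = rewrite_source("result = np.sum(a) + np.cos(b)")
print(rewritten)  # result = numba_dpex.dpnp.sum(a) + np.cos(b)
```

Only np.sum is in the map, so np.cos is left as a regular NumPy call.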
numba_dpex.dpnp contains stub functions (defined as classes) like the following:
# numba_dpex/dpnp_iface/stubs.py - imported in numba_dpex.__init__.py

class dpnp(Stub):

    class sum(Stub):  # stub function
        pass
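To make the "function defined as a class" idea concrete, here is a minimal self-contained sketch. The Stub base class below is an assumption written for illustration, not the definition used in numba_dpex. It shows that a stub carries no behaviour of its own and serves only as an identity that an overload can be registered against.

```python
class Stub:
    """Hypothetical base class: a placeholder that must never be instantiated."""

    __slots__ = ()

    def __new__(cls):
        raise NotImplementedError(f"{cls.__name__} is a compile-time stub")


class dpnp(Stub):
    """Namespace-like stub; nested classes stand for individual functions."""

    class sum(Stub):  # stub function
        pass


# Calling the stub from plain Python fails: it has no implementation here.
try:
    dpnp.sum()
    called = True
except NotImplementedError:
    called = False
```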
For the stub function call to be lowered by the Numba compiler pipeline, there is an overload in numba_dpex/dpnp_iface/dpnp_transcendentalsimpl.py:
@overload(stubs.dpnp.sum)
def dpnp_sum_impl(a):
    ...
The overload implementation knows about DPNP functions.
It receives a DPNP function pointer from DPNP and uses the known signature from the DPNP headers.
The implementation calls the DPNP function by creating a Numba ExternalFunctionPointer.
For more details about implementing overloads, see Writing overload for stub function.
For more details about testing the integration, see Writing DPNP integration tests.
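The "raw function pointer plus known signature" pattern can be tried with the standard ctypes module. This is an analogy only: it does not use Numba or DPNP, and the names ADD_SIGNATURE, _backend_callback and add_from_pointer are hypothetical. A callback plays the role of the backend function, and the caller rebuilds a callable from nothing but an integer address and the signature.

```python
import ctypes

# Known signature, as if copied from a C header: long add(long, long)
ADD_SIGNATURE = ctypes.CFUNCTYPE(ctypes.c_long, ctypes.c_long, ctypes.c_long)


def _add(x, y):
    return x + y


# Stand-in for the backend: it owns the function and hands out a raw address.
_backend_callback = ADD_SIGNATURE(_add)  # keep a reference so it stays alive
raw_address = ctypes.cast(_backend_callback, ctypes.c_void_p).value

# Caller side: only the integer address and the signature are known.
add_from_pointer = ADD_SIGNATURE(raw_address)
result = add_from_pointer(2, 3)
print(result)  # 5
```

If the signature does not match the real function, the call is undefined behaviour. This is why the text recommends copying the declaration from the DPNP sources next to the Numba signature.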
Places to update
- numba_dpex/dpnp_iface/stubs.py: Add new class to stubs.dpnp class.
- numba_dpex/dpnp_iface/dpnp_fptr_interface.pyx: Update items in DPNPFuncName enum.
- numba_dpex/dpnp_iface/dpnp_fptr_interface.pyx: Update if statements in get_DPNPFuncName_from_str() function.
- Add @overload(stubs.dpnp.YOUR_FUNCTION) in one of the numba_dpex/dpnp_iface/*.py modules or create new.
- numba_dpex/rename_numpy_functions_pass.py: Update items in rewrite_function_name_map dict.
- numba_dpex/rename_numpy_functions_pass.py: Update imported modules in DPPYRewriteOverloadedNumPyFunctions.__init__().
- Add test in one of the numba_dpex/tests/njit_tests/dpnp test modules or create new.
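Several of these places must stay in sync, which is easy to get wrong. The following pure-Python sketch mirrors the shape of the enum, the string lookup and the rename map so that the dependency is visible. All names and members here (FuncName, func_name_from_str, rewrite_map) are hypothetical stand-ins; the real enum and lookup live in the Cython file, and the behaviour on an unknown name is an assumption of this sketch.

```python
from enum import Enum, auto


class FuncName(Enum):
    """Hypothetical stand-in for the DPNPFuncName enum."""

    DPNP_FN_SUM = auto()
    DPNP_FN_PROD = auto()


def func_name_from_str(name):
    """Hypothetical stand-in for get_DPNPFuncName_from_str()."""
    if name == "dpnp_sum":
        return FuncName.DPNP_FN_SUM
    elif name == "dpnp_prod":
        return FuncName.DPNP_FN_PROD
    # Assumption: this sketch raises; the real function may behave differently.
    raise ValueError(f"unknown DPNP function name: {name!r}")


# Hypothetical stand-in for rewrite_function_name_map: NumPy name -> stub name.
rewrite_map = {"sum": "sum", "prod": "prod"}

# Every renamed function needs a matching branch in the lookup.
resolved = {stub: func_name_from_str(f"dpnp_{stub}") for stub in rewrite_map.values()}
```

Adding an entry to the map without the matching enum item and if branch makes the lookup fail for that function.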
Writing overload for stub function
Overloads for stub functions reside in numba_dpex/dpnp_iface/*.py modules.
If you need to create a new module, name it after the corresponding DPNP module.
For example: dpnp/backend/kernels/dpnp_krnl_indexing.cpp -> numba_dpex/dpnp_iface/dpnp_indexing.py.
from numba.core.extending import overload
import numba_dpex.dpnp_iface as dpnp_lowering
...
@overload(stubs.dpnp.sum)
def dpnp_sum_impl(a):
    dpnp_lowering.ensure_dpnp("sum")
ensure_dpnp() checks that the DPNP package is available and contains the function.
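A check of this kind can be sketched with the standard library. The helper below, ensure_function, is hypothetical and is not the implementation of ensure_dpnp(). It only illustrates the two conditions named above: the package can be found, and it exposes the requested function. The math module is used as a stand-in package so the example runs anywhere.

```python
import importlib
import importlib.util


def ensure_function(module_name, function_name):
    """Hypothetical check: the module is importable and exposes function_name."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(f"{module_name} is not installed")
    module = importlib.import_module(module_name)
    if not hasattr(module, function_name):
        raise ImportError(f"{module_name} has no function {function_name!r}")


# Passes silently: the standard math module provides fsum().
ensure_function("math", "fsum")
```

Failing early in the overload gives a clear error at compile time instead of a missing symbol later.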
from numba import types
from numba.core.typing import signature
...
    # continue of dpnp_sum_impl()
    """
    dpnp source:
    https://github.com/IntelPython/dpnp/blob/0.6.1dev/dpnp/backend/kernels/dpnp_krnl_reduction.cpp#L59

    Function declaration:
    void dpnp_sum_c(void* result_out,
                    const void* input_in,
                    const size_t* input_shape,
                    const size_t input_shape_ndim,
                    const long* axes,
                    const size_t axes_ndim,
                    const void* initial,
                    const long* where)
    """
    sig = signature(
        types.void,  # return type
        types.voidptr,  # void* result_out,
        types.voidptr,  # const void* input_in,
        types.voidptr,  # const size_t* input_shape,
        types.intp,  # const size_t input_shape_ndim,
        types.voidptr,  # const long* axes,
        types.intp,  # const size_t axes_ndim,
        types.voidptr,  # const void* initial,
        types.voidptr,  # const long* where)
    )
The signature sig is based on the DPNP function signature defined in the header file.
It is recommended to provide a link to the signature in the DPNP sources and to copy it into a comment, as shown above.
For the mapping between C types and Numba types, see Types matching for Numba and DPNP.
import numba_dpex.dpnp_iface.dpnpimpl as dpnp_ext
...
    # continue of dpnp_sum_impl()
    dpnp_func = dpnp_ext.dpnp_func("dpnp_sum", [a.dtype.name, "NONE"], sig)
dpnp_ext.dpnp_func() returns a function pointer from DPNP.
It receives:
- Function name (e.g. "dpnp_sum"), which is converted to the DPNPFuncName enum in get_DPNPFuncName_from_str().
- List of input and output data type names (e.g. [a.dtype.name, "NONE"]; "NONE" means reusing the previous type name), which is converted to the DPNPFuncType enum in get_DPNPFuncType_from_str().
- Signature, which is used for creating the Numba ExternalFunctionPointer.
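The "NONE" convention can be spelled out in a few lines. This is a sketch of the rule as described above, not the conversion code itself (which lives in the Cython interface); resolve_type_names is a hypothetical helper.

```python
def resolve_type_names(type_names):
    """Replace each "NONE" entry with the type name that precedes it."""
    resolved = []
    for name in type_names:
        if name == "NONE":
            if not resolved:
                raise ValueError('"NONE" cannot be the first type name')
            name = resolved[-1]  # reuse the previous type name
        resolved.append(name)
    return resolved


print(resolve_type_names(["int64", "NONE"]))  # ['int64', 'int64']
```

For a call such as dpnp_func("dpnp_sum", [a.dtype.name, "NONE"], sig) with an int64 array, both the input and the output type would resolve to int64 under this reading.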
import numba_dpex.dpnp_iface.dpnpimpl as dpnp_ext
...
    # continue of dpnp_sum_impl()
    PRINT_DEBUG = dpnp_lowering.DEBUG

    def dpnp_impl(a):
        out = np.empty(1, dtype=a.dtype)
        common_impl(a, out, dpnp_func, PRINT_DEBUG)
        return out[0]

    return dpnp_impl
This code creates the implementation function and returns it from the overload function.
PRINT_DEBUG is used for printing debug information, which is used in tests.
Tests rely on the debug information to check that the DPNP implementation was used.
See Writing DPNP integration tests.
dpnp_impl() creates an output array with the size and data type corresponding to the DPNP function output array.
dpnp_impl() can call NumPy functions supported by Numba and other stub functions (e.g. numba_dpex.dpnp.dot()).
The implementation function usually reuses a common function like common_impl().
This approach eliminates code duplication.
Consider all the common functions available at the top of the file before creating a new one.
from numba.core.extending import register_jitable
from numba_dpex import dpctl_functions
import numba_dpex.dpnp_iface.dpnpimpl as dpnp_ext
...
@register_jitable
def common_impl(a, out, dpnp_func, print_debug):
    if a.size == 0:
        raise ValueError("Passed Empty array")

    sycl_queue = dpctl_functions.get_current_queue()
    a_usm = dpctl_functions.malloc_shared(a.size * a.itemsize, sycl_queue)  # 1
    dpctl_functions.queue_memcpy(sycl_queue, a_usm, a.ctypes, a.size * a.itemsize)  # 2

    out_usm = dpctl_functions.malloc_shared(a.itemsize, sycl_queue)  # 1

    axes, axes_ndim = 0, 0
    initial = 0
    where = 0

    dpnp_func(out_usm, a_usm, a.shapeptr, a.ndim, axes, axes_ndim, initial, where)  # 3

    dpctl_functions.queue_memcpy(
        sycl_queue, out.ctypes, out_usm, out.size * out.itemsize
    )  # 4

    dpctl_functions.free_with_queue(a_usm, sycl_queue)  # 5
    dpctl_functions.free_with_queue(out_usm, sycl_queue)  # 5

    dpnp_ext._dummy_liveness_func([a.size, out.size])  # 6

    if print_debug:
        print("dpnp implementation")  # 7
Key parts of any common function are:
1. Allocate input and output USM arrays
2. Copy input array to input USM array
3. Call dpnp_func()
4. Copy output USM array to output array
5. Deallocate USM arrays
6. Disable dead code elimination for input and output arrays
7. Print debug information used for testing
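The allocate, copy in, call, copy out, free sequence can be rehearsed without a device. The sketch below is an analogy that uses plain ctypes buffers in place of USM allocations and Python's sum() in place of the DPNP call. The function staged_sum is hypothetical and involves no SYCL queue; the liveness step (6) has no counterpart in plain Python and is omitted.

```python
import ctypes


def staged_sum(values, print_debug=False):
    """Sum values by staging them through separate buffers (illustration only)."""
    n = len(values)
    if n == 0:
        raise ValueError("Passed Empty array")

    host_in = (ctypes.c_double * n)(*values)
    host_out = (ctypes.c_double * 1)()

    # 1. Allocate staging buffers (stand-ins for USM allocations).
    dev_in = (ctypes.c_double * n)()
    dev_out = (ctypes.c_double * 1)()

    # 2. Copy the input into the staging buffer.
    ctypes.memmove(dev_in, host_in, ctypes.sizeof(host_in))

    # 3. Call the "backend" function on the staging buffers only.
    dev_out[0] = sum(dev_in)

    # 4. Copy the result back to the caller-visible output.
    ctypes.memmove(host_out, dev_out, ctypes.sizeof(dev_out))

    # 5. Release the staging buffers.
    del dev_in, dev_out

    # 7. Print the marker that tests can look for.
    if print_debug:
        print("dpnp implementation")
    return host_out[0]


total = staged_sum([1.0, 2.0, 3.5])
print(total)  # 6.5
```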
Types matching for Numba and DPNP
[const] T* -> types.voidptr
size_t -> types.intp
long -> types.int64
size_t * is mapped to void * because Numba currently has no type that represents size_t *. Both types are pointers, so as long as the compiler accepts the conversion, the storage needed to hold the pointer should be the same size.
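The mapping rules above can be applied mechanically to a C parameter list. The helper below, numba_type_name, is hypothetical and returns type names as strings so that it runs without Numba installed. The void entry is taken from the return type used in the dpnp_sum_c signature earlier, not from the table.

```python
def numba_type_name(c_type):
    """Map a C parameter type to the Numba type name suggested by the rules above."""
    c_type = c_type.strip()
    if c_type.startswith("const "):
        c_type = c_type[len("const "):].strip()  # const does not change the mapping
    if c_type.endswith("*"):
        return "types.voidptr"  # every pointer, including size_t* and long*
    scalar_map = {
        "size_t": "types.intp",
        "long": "types.int64",
        "void": "types.void",  # return type only
    }
    return scalar_map[c_type]


# Parameter types of dpnp_sum_c, in declaration order.
params = [
    "void*", "const void*", "const size_t*", "const size_t",
    "const long*", "const size_t", "const void*", "const long*",
]
mapped = [numba_type_name(p) for p in params]
```

The resulting list matches the argument types passed to signature() in the dpnp_sum_c example, which is a quick way to check a new signature by hand.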
Writing DPNP integration tests
See all DPNP integration tests in numba_dpex/tests/njit_tests/dpnp.
Usually, adding a new test is as easy as adding the function name to the corresponding list of function names.
Each item in the list is used as a parameter for the tests.
Find the tests for the category of functions similar to your function and update the list of function names, such as list_of_unary_ops or list_of_nan_ops.
@pytest.mark.parametrize("filter_str", filter_strings)
def test_unary_ops(filter_str, unary_op, input_array, get_shape, capfd):
    a = input_array  # 1
    a = np.reshape(a, get_shape)
    op, name = unary_op  # 2
    if (name == "cumprod" or name == "cumsum") and (
        filter_str == "opencl:cpu:0" or is_gen12(filter_str)
    ):
        pytest.skip()
    actual = np.empty(shape=a.shape, dtype=a.dtype)
    expected = np.empty(shape=a.shape, dtype=a.dtype)

    f = njit(op)  # 3
    with dpctl.device_context(filter_str), dpnp_debug():  # 7
        actual = f(a)  # 4
        captured = capfd.readouterr()
        assert "dpnp implementation" in captured.out  # 8

    expected = op(a)  # 5
    # Use the largest absolute difference so that errors of opposite sign
    # cannot cancel each other out.
    max_abs_err = np.max(np.abs(actual - expected))
    assert max_abs_err < 1e-4  # 6
Test function names start with test_ (see pytest docs), and all input parameters are provided by fixtures.
In the example above, unary_op contains a tuple (FUNCTION, FUNCTION_NAME); see the fixture unary_op().
Key parts of any test are:
1. Receive the input array from the fixture input_array
2. Receive the tested function from the fixture unary_op
3. Compile the tested function with njit()
4. Call the compiled tested function inside device_context and receive the actual result
5. Call the original tested function and receive the expected result
6. Compare the actual and expected results
7. Run the compiled tested function inside the debug context dpnp_debug()
8. Check that DPNP was used, as debug information was printed to output
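The two checks at the heart of such a test, "the marker was printed" and "the result matches the reference", can be tried in isolation. The sketch below uses only the standard library: contextlib.redirect_stdout stands in for the capfd fixture, and fake_compiled_sum is a hypothetical stand-in for the njit-compiled function.

```python
import contextlib
import io


def fake_compiled_sum(values, print_debug):
    """Hypothetical stand-in for the njit-compiled function under test."""
    if print_debug:
        print("dpnp implementation")
    return float(sum(values))


def check_against_reference(values):
    """Run the function with debug output on and compare with a reference."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        actual = fake_compiled_sum(values, print_debug=True)

    expected = float(sum(values))  # reference result from the original function

    # The marker proves that the intended implementation was the one executed.
    assert "dpnp implementation" in buffer.getvalue()
    assert abs(actual - expected) < 1e-4
    return actual


checked = check_against_reference([1.0, 2.0, 3.0])
```

Checking the marker matters because a silent fallback to the default implementation would still produce the expected result.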
Troubleshooting
- Do not forget to rebuild the Python extensions against the currently installed version of DPNP. The Cython files (e.g. numba_dpex/dpnp_iface/dpnp_fptr_interface.pyx) depend on DPNP headers.
- Do not forget to add each array to dpnp_ext._dummy_liveness_func([YOUR_ARRAY.size]). Dead code elimination could delete temporary variables before they are used for the DPNP function call. As a result, wrong data could be passed to the DPNP function.