Welcome to numba-dpex’s documentation!

numba-dpex is an Intel®-developed extension to the Numba JIT compiler that adds “XPU” programming capabilities to it. The XPU vision is to make it easy for programmers to write efficient and portable code for a mix of architectures across CPUs, GPUs, FPGAs, and other accelerators. To provide XPU programming capabilities, the extension relies on SYCL, an industry standard for writing cross-platform code using standard C++. Using a SYCL runtime library, the extension can launch data-parallel kernels generated directly from Python bytecode on supported data-parallel architectures. Currently, SYCL support is restricted to Intel’s DPC++ via the dpctl package. Support for other SYCL runtime libraries may be added in the future.

The main feature of the extension is to let programmers write data-parallel kernels directly in Python. Such kernels can be written in two ways: with an explicit API superficially similar to OpenCL, or with an implicit API that generates kernels from NumPy library calls, Numba’s prange statement, and other “data-parallel by construction” expressions that Numba is able to parallelize. The following two examples demonstrate each approach.

  • Defining a data-parallel kernel explicitly.

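    import numpy as np
    import numba_dpex as dppy
    import dpctl
    
    # Each work-item adds one pair of elements.
    # Named vector_sum rather than sum to avoid shadowing the Python built-in.
    @dppy.kernel
    def vector_sum(a, b, c):
        i = dppy.get_global_id(0)  # index of this work-item in dimension 0
        c[i] = a[i] + b[i]
    
    a = np.array(np.random.random(20), dtype=np.float32)
    b = np.array(np.random.random(20), dtype=np.float32)
    c = np.ones_like(a)  # output array, overwritten by the kernel
    
    # Launch 20 work-items with the default local (work-group) size.
    with dpctl.device_context("opencl:gpu"):
        vector_sum[20, dppy.DEFAULT_LOCAL_SIZE](a, b, c)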
    
  • Writing implicitly data-parallel expressions in the fashion of Numba parallel loops.

    from numba import njit
    import numpy as np
    import dpctl
    
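    @njit
    def f1(a, b):
        # Element-wise array addition is data-parallel by construction.
        c = a + b
        return c
    
    # In this example the two sizes only determine the array length N.
    global_size = 64
    local_size = 32
    N = global_size * local_size
    a = np.ones(N, dtype=np.float32)
    b = np.ones(N, dtype=np.float32)
    # Select the first OpenCL GPU device for the call below.
    with dpctl.device_context("opencl:gpu:0"):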
        c = f1(a, b)
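
To make the execution model behind the explicit kernel more concrete, here is a minimal pure-Python sketch of what a one-dimensional launch computes. It is an emulation only and does not use numba-dpex: the names `launch_1d` and `vector_add` are hypothetical, and the work-item index that a real kernel would obtain from `get_global_id(0)` is passed in as an ordinary argument. On a real device the work-items may run concurrently and in no particular order, whereas this sketch simply loops over them.

```python
# Pure-Python emulation of a 1-D data-parallel launch; NOT the numba-dpex API.
# `launch_1d` and `vector_add` are hypothetical names used only for illustration.

def launch_1d(kernel, global_size, *args):
    """Call `kernel` once per work-item, passing that work-item's global id."""
    for global_id in range(global_size):
        kernel(global_id, *args)


def vector_add(i, a, b, c):
    """Kernel body: the work-item with global id `i` adds one pair of elements."""
    c[i] = a[i] + b[i]


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
c = [0.0] * len(a)  # output list, filled in by the work-items

# One work-item per element, mirroring a global size equal to the array length.
launch_1d(vector_add, len(a), a, b, c)
print(c)  # [11.0, 22.0, 33.0, 44.0]
```

Because every work-item writes to a distinct element of `c`, the result does not depend on the order in which the work-items execute, which is what makes this kind of kernel safe to run in parallel.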
    

About

numba-dpex is developed by Intel and is part of the Intel Distribution for Python.

Contributing

Refer to the contributing guide for information on the coding style and standards used in the project.

License

The code is licensed under the Apache License 2.0, which can be found in LICENSE. All usage of and contributions to the project are subject to the terms and conditions of this license.
