Supported Address Space Qualifiers

An address space qualifier specifies the region of memory in which an object is allocated.

Numba-dpex supports three disjoint named address spaces:

  1. Global Address Space

    Global Address Space refers to memory objects that are allocated from the global memory pool and shared among all work-items. Arguments passed to any kernel are allocated in the global address space. In the example below, the arguments a, b and c are allocated in the global address space:

    # SPDX-FileCopyrightText: 2020 - 2023 Intel Corporation
    #
    # SPDX-License-Identifier: Apache-2.0
    
    import dpnp
    import numpy.testing as testing
    
    import numba_dpex as ndpx
    
    
    # Data parallel kernel implementing vector sum
    @ndpx.kernel
    def kernel_vector_sum(a, b, c):
        i = ndpx.get_global_id(0)
        c[i] = a[i] + b[i]
    
    
    # Utility function that launches the kernel and checks the result
    def driver(a, b, c, global_size):
        kernel_vector_sum[ndpx.Range(global_size)](a, b, c)
        a_np = dpnp.asnumpy(a)  # Copy dpnp array a to NumPy array a_np
        b_np = dpnp.asnumpy(b)  # Copy dpnp array b to NumPy array b_np
        c_np = dpnp.asnumpy(c)  # Copy dpnp array c to NumPy array c_np
        testing.assert_equal(c_np, a_np + b_np)
    
    
    # Main function
    def main():
        N = 10
        global_size = N
        print("Vector size N", N)
    
        # Create random vectors on the default device
        a = dpnp.random.random(N)
        b = dpnp.random.random(N)
        c = dpnp.ones_like(a)
    
        print("Using device ...")
        print(a.device)
        driver(a, b, c, global_size)
        print("Done...")
    
    
    if __name__ == "__main__":
        main()
    
  2. Local Address Space

    Local Address Space refers to memory objects that are allocated in the local memory pool and shared by all work-items of a work-group. Numba-dpex does not support passing arguments that are allocated in the local address space to @numba_dpex.kernel. Instead, users can allocate static arrays in the local address space inside a @numba_dpex.kernel function. numba_dpex.local.array(shape, dtype) is the API used to allocate a static array in the local address space.
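    The sharing behaviour described above can be illustrated on the host. The sketch below is a plain NumPy emulation, not numba-dpex code: the function name emulate_local_reverse and its parameters are hypothetical, and the per-group buffer only stands in for an array that a kernel would obtain from numba_dpex.local.array(shape, dtype). Each emulated work-item stores one element into its group's buffer, then reads an element that a different work-item of the same group stored.

```python
import numpy as np


def emulate_local_reverse(a, work_group_size):
    """Host-side emulation (hypothetical helper, not numba-dpex API).

    Reverses each work-group-sized block of `a` by passing the data
    through a buffer that is shared only within that work-group.
    """
    a = np.asarray(a)
    if a.size % work_group_size != 0:
        raise ValueError("global size must be a multiple of the work-group size")

    out = np.empty_like(a)
    for group_start in range(0, a.size, work_group_size):
        # One buffer per work-group, standing in for local memory.
        local_mem = np.empty(work_group_size, dtype=a.dtype)

        # Phase 1: every work-item stores its element at its local id.
        for local_id in range(work_group_size):
            local_mem[local_id] = a[group_start + local_id]

        # On a device, a work-group barrier would separate the two phases.

        # Phase 2: every work-item reads a slot written by another work-item
        # of the same work-group.
        for local_id in range(work_group_size):
            out[group_start + local_id] = local_mem[work_group_size - 1 - local_id]
    return out


data = np.arange(8, dtype=np.float32)
result = emulate_local_reverse(data, work_group_size=4)
print(result)
```

    With eight elements and a work-group size of four, each block of four is reversed independently, which shows that the buffer is visible to every work-item of one work-group but not to other work-groups.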

  3. Private Address Space

    Private Address Space refers to memory objects that are local to each work-item and are not shared with any other work-item. In the example below numba_dpex.private.array(shape, dtype) is the API used to allocate a static array in the private address space:

    # SPDX-FileCopyrightText: 2020 - 2023 Intel Corporation
    #
    # SPDX-License-Identifier: Apache-2.0
    
    import dpctl
    import dpctl.tensor as dpt
    import numpy as np
    
    import numba_dpex as ndpx
    
    
    def private_memory():
        """
        This example demonstrates the usage of numba_dpex's `private.array`
        intrinsic function. The function is used to create a static array
        allocated on the device's private address space.
        """
    
        @ndpx.kernel
        def private_memory_kernel(A):
            memory = ndpx.private.array(shape=1, dtype=np.float32)
            i = ndpx.get_global_id(0)
    
            # preload
            memory[0] = i
            ndpx.barrier(ndpx.LOCAL_MEM_FENCE)  # local mem fence
    
            # memory[0] holds the value written by this work-item only
            # because the array is private to each work-item.
            A[i] = memory[0] * 2
    
        N = 4
        device = dpctl.select_default_device()
    
        arr = dpt.zeros(N, dtype=dpt.float32, device=device)
        orig = np.arange(N).astype(np.float32)
    
        print("Using device ...")
        device.print_device_info()
    
        global_range = ndpx.Range(N)
        local_range = ndpx.Range(N)
        private_memory_kernel[ndpx.NdRange(global_range, local_range)](arr)
    
        arr_out = dpt.asnumpy(arr)
        np.testing.assert_allclose(orig * 2, arr_out)
        # the output should be `orig[i] * 2`, i.e. `[0, 2, 4, ...]`
        print(arr_out)
    
    
    def main():
        private_memory()
    
        print("Done...")
    
    
    if __name__ == "__main__":
        main()
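    To see why the kernel above relies on the array being private, the two phases around the barrier can be replayed on the host. This is a minimal NumPy emulation under stated assumptions, not numba-dpex code: emulate_single_slot is a hypothetical helper, and work-items are run one after another, so the shared case shows a deterministic "last writer wins" outcome, whereas on a real device it would be a data race with no guaranteed result.

```python
import numpy as np


def emulate_single_slot(n_items, private):
    """Host-side emulation (hypothetical helper, not numba-dpex API).

    Replays the kernel's two phases for `n_items` work-items, using either
    one single-element slot per work-item (private=True) or one slot shared
    by all work-items (private=False).
    """
    out = np.zeros(n_items, dtype=np.float32)
    if private:
        # Every work-item gets its own single-element array.
        slots = [np.zeros(1, dtype=np.float32) for _ in range(n_items)]
    else:
        # Every work-item refers to the same single-element array.
        shared = np.zeros(1, dtype=np.float32)
        slots = [shared] * n_items

    # Phase 1 (before the barrier): each work-item stores its global id.
    for i in range(n_items):
        slots[i][0] = i

    # Phase 2 (after the barrier): each work-item reads its slot back.
    for i in range(n_items):
        out[i] = slots[i][0] * 2
    return out


private_result = emulate_single_slot(4, private=True)
shared_result = emulate_single_slot(4, private=False)
print("private:", private_result)
print("shared: ", shared_result)
```

    With private slots every emulated work-item reads back its own id and the output is twice the index, matching the assertion in the example above. With a shared slot all work-items read the value stored by whichever one wrote last.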