dpctl.utils

dpctl.utils.get_execution_queue(qs, /)

Get execution queue from queues associated with input arrays.

Parameters:

qs (List[dpctl.SyclQueue], Tuple[dpctl.SyclQueue]) – a list or a tuple of dpctl.SyclQueue objects corresponding to arrays that are being combined.

Returns:

execution queue under the compute-follows-data paradigm, or None if the queues are not all equal.

Return type:

SyclQueue
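
Example (a minimal sketch, not from the library documentation, assuming a default SYCL device is available):

import dpctl
import dpctl.tensor as dpt
from dpctl.utils import get_execution_queue

q = dpctl.SyclQueue()
x = dpt.ones(100, sycl_queue=q)
y = dpt.zeros(100, sycl_queue=q)

# Both arrays are associated with the same queue, so a usable
# execution queue is returned; with unequal queues the result is None.
exec_q = get_execution_queue([x.sycl_queue, y.sycl_queue])
assert exec_q is not None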

dpctl.utils.get_coerced_usm_type(usm_types, /)

Get USM type of the output array for a function combining arrays of given USM types under the compute-follows-data execution model.

Parameters:

usm_types (List[str], Tuple[str]) – a list or a tuple of strings with the .usm_type attributes of the input arrays

Returns:

type of USM allocation for the output array(s), or None if any of the input strings are not recognized.

Return type:

str
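
Example (a minimal sketch; the expected outputs in the comments assume the usual compute-follows-data precedence of "device" over "shared" over "host" and should be confirmed against your dpctl version):

from dpctl.utils import get_coerced_usm_type

print(get_coerced_usm_type(["device", "shared", "host"]))  # expected: "device"
print(get_coerced_usm_type(["shared", "host"]))            # expected: "shared"
print(get_coerced_usm_type(["host"]))                      # expected: "host"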

dpctl.utils.validate_usm_type(usm_type, allow_none=True)

Raises an exception if usm_type is invalid.

Parameters:
  • usm_type

    Specification for USM allocation type. Valid specifications are:

    • "device"

    • "shared"

    • "host"

    If the allow_none keyword argument is set, a value of None is also permitted.

  • allow_none (bool, optional) – Whether usm_type value of None is considered valid. Default: True.

Raises:
  • ValueError – if usm_type is not a recognized string.

  • TypeError – if usm_type is not a string, except when usm_type is None and allow_none is True.
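
Example (a minimal sketch of the behavior described above):

from dpctl.utils import validate_usm_type

validate_usm_type("device")              # recognized, returns without raising
validate_usm_type(None)                  # permitted, since allow_none defaults to True

try:
    validate_usm_type("unified")         # not a recognized specification
except ValueError as e:
    print("ValueError:", e)

try:
    validate_usm_type(None, allow_none=False)  # None rejected when not allowed
except TypeError as e:
    print("TypeError:", e)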

dpctl.utils.onetrace_enabled()

Enable onetrace collection for kernels executed in this context.

Note

Proper working of this utility assumes that the Python interpreter has been launched by the onetrace or unitrace tool from the intel/pti-gpu project.

Example:

Launch the Python interpreter using onetrace tool:

$ onetrace --conditional-collection -v -t --demangle python app.py

Using the context manager in the Python session then enables data collection and its output for every offloaded kernel:

import dpctl.tensor as dpt
from dpctl.utils import onetrace_enabled

# onetrace output reporting on execution of the kernel
# should be seen, starting with "Device Timeline"
with onetrace_enabled():
    a = dpt.arange(100, dtype='int16')

Sample output:

>>> with onetrace_enabled():
...     a = dpt.arange(100, dtype='int16')
...
Device Timeline (queue: 0x555aee86bed0): dpctl::tensor::kernels::constructors::linear_sequence_step_kernel<short>[SIMD32 {1; 1; 1} {100; 1; 1}]<1.1> [ns] = 44034325658 (append) 44034816544 (submit) 44036296752 (start) 44036305918 (end)
>>>
dpctl.utils.intel_device_info(sycl_device)

For Intel(R) GPU devices, returns a dictionary with architectural details of the device; for other devices an empty dictionary is returned. The dictionary contains the following keys:

device_id:

32-bit device PCI identifier

gpu_eu_count:

Total number of execution units

gpu_hw_threads_per_eu:

Number of thread contexts in EU

gpu_eu_simd_width:

Physical SIMD width of EU

gpu_slices:

Total number of slices

gpu_subslices_per_slice:

Number of sub-slices per slice

gpu_eu_count_per_subslice:

Number of EUs per sub-slice

max_mem_bandwidth:

Maximum memory bandwidth in bytes/second

free_memory:

Global memory available on the device in units of bytes

Unsupported descriptors are omitted from the dictionary.

Descriptors other than the PCI identifier are supported only for SyclDevice objects with a Level-Zero backend.

Note

Environment variable ZES_ENABLE_SYSMAN may need to be set to 1 for the "free_memory" key to be reported.
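
Example (a minimal sketch; it assumes the default SYCL device is an Intel(R) GPU, otherwise the returned dictionary is empty):

import dpctl
from dpctl.utils import intel_device_info

dev = dpctl.SyclDevice()
info = intel_device_info(dev)
print(info.get("device_id"))       # PCI identifier, if reported
print(info.get("gpu_eu_count"))    # total number of EUs, if reported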

exception dpctl.utils.ExecutionPlacementError

Exception raised when the execution placement target cannot be unambiguously determined from the input arrays.

Make sure that the input arrays are associated with the same dpctl.SyclQueue, or migrate data to the same dpctl.SyclQueue using the dpctl.tensor.usm_ndarray.to_device() method, as shown in the example below.
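
Example (a minimal sketch of migrating data to a common queue; it assumes two distinct queues targeting the default device and that to_device() accepts a dpctl.SyclQueue target, as the note above suggests):

import dpctl
import dpctl.tensor as dpt

q1 = dpctl.SyclQueue()
q2 = dpctl.SyclQueue()

x = dpt.arange(100, sycl_queue=q1)
y = dpt.arange(100, sycl_queue=q2)

# Combining x and y directly would raise ExecutionPlacementError,
# since the execution queue cannot be determined unambiguously.
y_migrated = y.to_device(x.sycl_queue)

z = x + y_migrated   # both arrays now share q1, so placement is unambiguous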