dpctl.utils
- dpctl.utils.get_execution_queue(qs, /)
Get the execution queue from the queues associated with the input arrays.
- Parameters:
qs (List[dpctl.SyclQueue], Tuple[dpctl.SyclQueue]) – a list or a tuple of dpctl.SyclQueue objects corresponding to the arrays being combined.
- Returns:
The execution queue under the compute-follows-data paradigm, or None if the queues are not equal.
- Return type:
dpctl.SyclQueue or None
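The queue-equality rule can be sketched in plain Python. This is a hedged illustration of the semantics only: the function name pick_execution_queue is hypothetical, plain strings stand in for dpctl.SyclQueue objects, and the real dpctl implementation may differ.

```python
# Hypothetical sketch of the compute-follows-data rule described above:
# all queues must compare equal, otherwise there is no unambiguous
# execution queue and None is returned.
def pick_execution_queue(qs):
    """Return the common queue if all entries are equal, else None."""
    if not qs:
        return None
    first = qs[0]
    for q in qs[1:]:
        if q != first:
            return None
    return first

# Strings stand in for dpctl.SyclQueue objects in this sketch.
print(pick_execution_queue(["gpu_q", "gpu_q"]))  # gpu_q
print(pick_execution_queue(["gpu_q", "cpu_q"]))  # None
```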
- dpctl.utils.get_coerced_usm_type(usm_types, /)
Get USM type of the output array for a function combining arrays of given USM types using compute-follows-data execution model.
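The coercion rule can be illustrated with a small sketch. Hedged assumptions: the ordering host < shared < device mirrors the usual compute-follows-data convention ("device" takes precedence over "shared", which takes precedence over "host"), and coerce_usm_types is a hypothetical name, not dpctl's actual implementation.

```python
# Assumed precedence of USM allocation types (hedged; not taken from
# the dpctl source): device > shared > host.
_ORDER = {"host": 0, "shared": 1, "device": 2}

def coerce_usm_types(usm_types):
    """Return the highest-precedence USM type among the inputs."""
    return max(usm_types, key=_ORDER.__getitem__)

print(coerce_usm_types(["host", "shared"]))           # shared
print(coerce_usm_types(["device", "host", "shared"])) # device
```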
- dpctl.utils.validate_usm_type(usm_type, allow_none=True)
Raises an exception if usm_type is invalid.
- Parameters:
usm_type – Specification of the USM allocation type. Valid specifications are "device", "shared", and "host". If the allow_none keyword argument is set, a value of None is also permitted.
allow_none (bool, optional) – Whether a usm_type value of None is considered valid. Default: True.
- Raises:
ValueError – if usm_type is not a recognized string.
TypeError – if usm_type is not a string, unless it is None and allow_none is True.
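The validation rules above can be restated as a short sketch. This is a hypothetical re-implementation for illustration only (check_usm_type is an invented name); the real function is dpctl.utils.validate_usm_type and its internals may differ.

```python
def check_usm_type(usm_type, allow_none=True):
    """Hypothetical sketch of the validation rules documented above."""
    if usm_type is None and allow_none:
        return  # None is explicitly permitted in this case
    if not isinstance(usm_type, str):
        # Covers None when allow_none is False, and any non-string value.
        raise TypeError(f"Expected a string, got {type(usm_type).__name__}")
    if usm_type not in ("device", "shared", "host"):
        raise ValueError(f"Unrecognized usm_type: {usm_type!r}")

check_usm_type("device")  # valid
check_usm_type(None)      # valid, since allow_none defaults to True
```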
- dpctl.utils.onetrace_enabled()
Enable onetrace collection for kernels executed in this context.
Note
Proper working of this utility assumes that the Python interpreter has been launched by the onetrace or unitrace tool from the intel/pti-gpu project.
- Example:
Launch the Python interpreter using the onetrace tool:
$ onetrace --conditional-collection -v -t --demangle python app.py
Using the context manager in the Python session now enables data collection and its output for every offloaded kernel:
import dpctl.tensor as dpt
from dpctl.utils import onetrace_enabled

# onetrace output reporting on execution of the kernel
# should be seen, starting with "Device Timeline"
with onetrace_enabled():
    a = dpt.arange(100, dtype='int16')
Sample output:
>>> with onetrace_enabled():
...     a = dpt.arange(100, dtype='int16')
...
Device Timeline (queue: 0x555aee86bed0):
dpctl::tensor::kernels::constructors::linear_sequence_step_kernel<short>[SIMD32 {1; 1; 1} {100; 1; 1}]<1.1> [ns] = 44034325658 (append) 44034816544 (submit) 44036296752 (start) 44036305918 (end)
>>>
- dpctl.utils.intel_device_info(sycl_device)
For Intel(R) GPU devices, returns a dictionary with device architectural details; for other devices, returns an empty dictionary. The dictionary contains the following keys:
- device_id:
32-bit device PCI identifier
- gpu_eu_count:
Total number of execution units
- gpu_hw_threads_per_eu:
Number of thread contexts in EU
- gpu_eu_simd_width:
Physical SIMD width of EU
- gpu_slices:
Total number of slices
- gpu_subslices_per_slice:
Number of sub-slices per slice
- gpu_eu_count_per_subslice:
Number of EUs in subslice
- max_mem_bandwidth:
Maximum memory bandwidth in bytes/second
- free_memory:
Global memory available on the device in units of bytes
Unsupported descriptors are omitted from the dictionary.
Descriptors other than the PCI identifier are supported only for SyclDevices with the Level-Zero backend.
Note
The environment variable ZES_ENABLE_SYSMAN may need to be set to 1 for the "free_memory" key to be reported.
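Because unsupported descriptors are omitted from the dictionary, callers should not index into it unconditionally. The sketch below uses a hand-written sample dictionary purely for illustration (the values are invented, not measured); in real use the dictionary comes from dpctl.utils.intel_device_info(sycl_device).

```python
# Hypothetical sample of what intel_device_info might return; the
# values here are invented for illustration only.
sample_info = {
    "device_id": 0x9A49,
    "gpu_eu_count": 96,
    "gpu_eu_simd_width": 8,
}

# Since unsupported descriptors are omitted, prefer dict.get with a
# default over direct indexing.
eu_count = sample_info.get("gpu_eu_count", 0)
free_mem = sample_info.get("free_memory")  # None if not reported
print(eu_count, free_mem)
```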
- exception dpctl.utils.ExecutionPlacementError
Exception raised when the execution placement target cannot be unambiguously determined from the input arrays.
Make sure that input arrays are associated with the same dpctl.SyclQueue, or migrate data to the same dpctl.SyclQueue using the dpctl.tensor.usm_ndarray.to_device() method.
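The catch-and-migrate pattern can be sketched without dpctl installed. Everything here is a hedged pure-Python analogue: PlacementError, FakeArray, and binary_op are invented stand-ins for ExecutionPlacementError, dpctl.tensor.usm_ndarray, and a compute-follows-data operation; only the to_device() migration pattern mirrors the real API.

```python
class PlacementError(ValueError):
    """Stand-in for dpctl.utils.ExecutionPlacementError."""

class FakeArray:
    """Stand-in for dpctl.tensor.usm_ndarray; a string models the queue."""
    def __init__(self, data, queue):
        self.data, self.sycl_queue = data, queue
    def to_device(self, queue):
        # Models usm_ndarray.to_device(): a copy bound to another queue.
        return FakeArray(self.data, queue)

def binary_op(x, y):
    # Refuse to guess when inputs carry different queues.
    if x.sycl_queue != y.sycl_queue:
        raise PlacementError("inputs bound to different queues")
    return FakeArray(x.data + y.data, x.sycl_queue)

a = FakeArray(1, "q0")
b = FakeArray(2, "q1")
try:
    result = binary_op(a, b)
except PlacementError:
    # Migrate b to a's queue, then retry.
    result = binary_op(a, b.to_device(a.sycl_queue))
print(result.data)  # 3
```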