DLPack exchange of USM allocated arrays¶
DLPack overview¶
DLPack is a commonly used, C-ABI-compatible data structure that allows data exchange between major frameworks. DLPack strives to be minimal and intentionally leaves allocator and device APIs out of scope.
Data shared via DLPack are owned by the producer, who provides a deleter function stored in the DLManagedTensor, and are only accessed by the consumer. The Python semantics of using the structure are explained in the DLPack documentation.
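For reference, this ownership protocol is captured by the DLManagedTensor struct, abbreviated here from dlpack.h:

typedef struct DLManagedTensor {
  DLTensor dl_tensor;   // description of the shared data (see below)
  void *manager_ctx;    // opaque handle used by the producer's deleter
  // Called by the consumer once it no longer needs the data;
  // until then the producer keeps the allocation alive
  void (*deleter)(struct DLManagedTensor *self);
} DLManagedTensor;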
DLPack specifies data location in memory via the void *data field of the DLTensor struct, and via the DLDevice device field. The DLDevice struct has two members: an enumeration device_type and an integer device_id.
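The relevant members, also abbreviated from dlpack.h:

typedef struct {
  DLDeviceType device_type;  // e.g. DLDeviceType::kDLOneAPI
  int32_t device_id;         // interpretation depends on device_type
} DLDevice;

typedef struct {
  void *data;        // for kDLOneAPI: a USM allocation pointer
  DLDevice device;
  int32_t ndim;
  DLDataType dtype;
  int64_t *shape;
  int64_t *strides;  // may be NULL, meaning a compact row-major layout
  uint64_t byte_offset;
} DLTensor;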
DLPack recognizes the enumeration value DLDeviceType::kDLOneAPI, reserved for sharing SYCL USM allocations. It is not named kDLSycl since importing a USM-allocated tensor with this device type relies on the oneAPI SYCL extensions sycl_ext_oneapi_filter_selector and sycl_ext_oneapi_default_platform_context to operate.
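A minimal sketch of what these two extensions provide, assuming a DPC++ compiler that implements both; the index string "0" is an arbitrary example:

// sycl_ext_oneapi_filter_selector: select a root device by its position
// in sycl::device::get_devices(); here, the device with index 0
sycl::device dev{sycl::ext::oneapi::filter_selector{"0"}};
// sycl_ext_oneapi_default_platform_context: each platform exposes a single
// default context, so producer and consumer can agree on the context
// without passing it through DLPack
sycl::context ctx = dev.get_platform().ext_oneapi_get_default_context();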
Exporting USM allocation to DLPack¶
When sharing a USM allocation (of any sycl::usm::kind) with void *ptr bound to sycl::context ctx:
// Input: void *ptr:
// USM allocation pointer
// sycl::context ctx:
// context the pointer is bound to
// Get device where allocation was originally made
// Keep in mind, the device may be a sub-device
const sycl::device &ptr_dev = sycl::get_pointer_device(ptr, ctx);
#if SYCL_EXT_ONEAPI_DEFAULT_CONTEXT
const sycl::context &default_ctx = ptr_dev.get_platform().ext_oneapi_get_default_context();
#else
static_assert(false, "ext_oneapi_default_context extension is required");
#endif
// Assert that ctx is the default platform context, or throw
if (ctx != default_ctx) {
throw pybind11::type_error(
"Can not export USM allocations not "
"bound to default platform context."
);
}
// Find parent root device if ptr_dev is a sub-device
const sycl::device &parent_root_device = get_parent_root_device(ptr_dev);
// find position of parent_root_device in sycl::get_devices
const auto &all_root_devs = sycl::device::get_devices();
auto beg = std::begin(all_root_devs);
auto end = std::end(all_root_devs);
auto selector_fn = [parent_root_device](const sycl::device &root_d) -> bool {
return parent_root_device == root_d;
};
auto pos = std::find_if(beg, end, selector_fn);
if (pos == end) {
throw pybind11::type_error("Could not produce DLPack: failed finding device_id");
}
std::ptrdiff_t dev_idx = std::distance(beg, pos);
// check that dev_idx can fit into int32_t if needed
int32_t device_id = static_cast<int32_t>(dev_idx);
// populate DLTensor with DLDeviceType::kDLOneAPI and computed device_id
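The population step itself is not shown above. A minimal sketch of it, assuming ptr and device_id are the values computed above, and that ndim, dtype, shape, strides, and byte_offset are filled in from the producer's own array metadata:

DLManagedTensor *dlm_tensor = new DLManagedTensor{};
dlm_tensor->dl_tensor.data = ptr;
dlm_tensor->dl_tensor.device.device_type = DLDeviceType::kDLOneAPI;
dlm_tensor->dl_tensor.device.device_id = device_id;
// ndim, dtype, shape, strides and byte_offset come from the producer's
// array metadata (not shown here)
dlm_tensor->manager_ctx = nullptr;  // or a handle keeping the producer's array alive
dlm_tensor->deleter = [](DLManagedTensor *self) {
// release whatever keeps the USM allocation alive, then free the struct
delete self;
};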
Importing DLPack with device_type == kDLOneAPI¶
// Input: ptr = dlm_tensor->dl_tensor.data
// device_id = dlm_tensor->dl_tensor.device.device_id
// Get root_device from device_id
const auto &device_vector = sycl::device::get_devices();
const sycl::device &root_device = device_vector.at(device_id);
// Check if the backend of the device is supported by consumer
// Perhaps for certain backends (CUDA, hip, etc.) we should dispatch
// different dlpack importers
// alternatively
// sycl::device root_device = sycl::device(
// sycl::ext::oneapi::filter_selector{ std::to_string(device_id)}
// );
// Get default platform context
#if SYCL_EXT_ONEAPI_DEFAULT_CONTEXT
const sycl::context &default_ctx = root_device.get_platform().ext_oneapi_get_default_context();
#else
static_assert(false, "ext_oneapi_default_context extension is required");
#endif
// Check that pointer is known in the context
const sycl::usm::kind alloc_type = sycl::get_pointer_type(ptr, default_ctx);
if (alloc_type == sycl::usm::kind::unknown) {
throw pybind11::type_error(
"Data pointer in DLPack is not bound to the "
"default platform context of specified device"
);
}
// Perform check that USM allocation type is supported by consumer if needed
// Get sycl::device where the data was allocated
const sycl::device &ptr_dev = sycl::get_pointer_device(ptr, default_ctx);
// Create object of consumer's library from ptr, ptr_dev, default_ctx
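What this last step looks like is specific to the consumer library. A minimal sketch of one possibility, which creates a queue bound to the shared context and device so the consumer can operate on the data in place:

sycl::queue q{default_ctx, ptr_dev};
// Kernels submitted to q may now read the shared USM allocation, or copy it
// into memory the consumer owns, e.g. q.memcpy(own_buffer, ptr, nbytes).wait();
// Ownership stays with the producer: the consumer must keep dlm_tensor alive
// while using ptr and call dlm_tensor->deleter(dlm_tensor) when done.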
Support of DLPack with kDLOneAPI device type¶
dpctl supports DLPack v0.8. Exchange of USM allocations made using the Level-Zero backend is supported with torch.Tensor(device='xpu') for PyTorch when intel-extension-for-pytorch is used, as well as for TensorFlow when intel-extension-for-tensorflow is used.