dpctl.memory
Data Parallel Control Memory
dpctl.memory provides Python objects that act as untyped containers of bytes in USM memory, one for each kind of USM pointer: shared pointers, device pointers, and host pointers.
Shared and host pointers are accessible from both the host and a device, while device pointers are accessible only from the device.
Python objects corresponding to shared and host pointers implement the Python simple buffer protocol. It is therefore possible to use these objects to manipulate USM memory using NumPy or the bytearray, memoryview, or array.array classes.
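The buffer-protocol access described above can be sketched as follows. This is a minimal illustration, not part of dpctl: the helper name fill_bytes is hypothetical, and the bytearray fallback is only a stand-in so the snippet still runs where dpctl or a SYCL device is unavailable.

```python
def fill_bytes(buf, value):
    """Overwrite every byte of a writable buffer-protocol object with value."""
    view = memoryview(buf)  # zero-copy view of the underlying memory
    view[:] = bytes([value]) * view.nbytes
    return view


try:
    import dpctl.memory as dpmem

    # Shared USM memory is host-accessible and exposes the buffer protocol.
    buf = dpmem.MemoryUSMShared(8)
except Exception:
    # Assumption: dpctl or a SYCL device is missing, so use a plain
    # bytearray, which offers the same buffer interface.
    buf = bytearray(8)

view = fill_bytes(buf, 0x2A)
print(view.tobytes())
```

The same helper should not be expected to work with MemoryUSMDevice, since device pointers are not accessible from the host.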
Classes
class dpctl.memory.MemoryUSMDevice
MemoryUSMDevice(nbytes, alignment=0, queue=None, copy=False) allocates nbytes of USM device memory.
Non-positive alignments are ignored (malloc_device is used instead). If queue=None, the queue returned by dpctl.get_current_queue() is used to allocate memory.
MemoryUSMDevice(usm_obj) creates an instance from usm_obj, which is expected to implement the __sycl_usm_array_interface__ protocol and to expose a contiguous block of USM memory of USM device type. Use copy=True to perform a copy if the USM type is other than ‘device’.
- copy_from_device(): Copy SYCL memory underlying the argument object into the memory of the instance.
- copy_from_host(): Copy the content of the Python buffer provided by obj into the instance’s memory.
- copy_to_host(): Copy the content of the instance’s memory into the memory of obj, or allocate a NumPy array if obj is None.
- get_usm_type()
- nbytes
- reference_obj
- size
- tobytes(): Construct a bytes object populated with a copy of the USM memory.
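Because device memory is not accessible from the host, data has to be moved explicitly with the copy methods listed above. The following is a minimal sketch of a host-to-device-to-host round trip; the helper name roundtrip_through_device is hypothetical, and the guarded usage at the end assumes dpctl is installed and a SYCL device is present.

```python
try:
    import dpctl.memory as dpmem  # requires dpctl and a SYCL runtime
except ImportError:
    dpmem = None


def roundtrip_through_device(payload, mem_cls):
    """Copy payload into a fresh USM allocation and read it back as bytes."""
    mem = mem_cls(len(payload))   # allocate len(payload) bytes of USM memory
    mem.copy_from_host(payload)   # host buffer -> USM memory
    return mem.tobytes()          # USM memory -> new bytes object on the host


if dpmem is not None:
    try:
        print(roundtrip_through_device(b"hello USM", dpmem.MemoryUSMDevice))
    except Exception as exc:  # for example, no SYCL device is available
        print("skipped:", exc)
```

The helper takes the memory class as a parameter, so the same round trip can be tried with MemoryUSMHost or MemoryUSMShared.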
class dpctl.memory.MemoryUSMHost
MemoryUSMHost(nbytes, alignment=0, queue=None, copy=False) allocates nbytes of USM host memory.
Non-positive alignments are ignored (malloc_host is used instead). If queue=None, the queue returned by dpctl.get_current_queue() is used to allocate memory.
MemoryUSMHost(usm_obj) creates an instance from usm_obj, which is expected to implement the __sycl_usm_array_interface__ protocol and to expose a contiguous block of USM memory of USM host type. Use copy=True to perform a copy if the USM type is other than ‘host’.
- copy_from_device(): Copy SYCL memory underlying the argument object into the memory of the instance.
- copy_from_host(): Copy the content of the Python buffer provided by obj into the instance’s memory.
- copy_to_host(): Copy the content of the instance’s memory into the memory of obj, or allocate a NumPy array if obj is None.
- get_usm_type()
- nbytes
- reference_obj
- size
- tobytes(): Construct a bytes object populated with a copy of the USM memory.
class dpctl.memory.MemoryUSMShared
MemoryUSMShared(nbytes, alignment=0, queue=None, copy=False) allocates nbytes of USM shared memory.
Non-positive alignments are ignored (malloc_shared is used instead). If queue=None, the queue returned by dpctl.get_current_queue() is used to allocate memory.
MemoryUSMShared(usm_obj) creates an instance from usm_obj, which is expected to implement the __sycl_usm_array_interface__ protocol and to expose a contiguous block of USM memory of USM shared type. Use copy=True to perform a copy if the USM type is other than ‘shared’.
- copy_from_device(): Copy SYCL memory underlying the argument object into the memory of the instance.
- copy_from_host(): Copy the content of the Python buffer provided by obj into the instance’s memory.
- copy_to_host(): Copy the content of the instance’s memory into the memory of obj, or allocate a NumPy array if obj is None.
- tobytes(): Construct a bytes object populated with a copy of the USM memory.
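The two ways of moving data between USM allocations described for these classes, constructing from usm_obj with copy=True and calling copy_from_device on an existing allocation, can be sketched as below. The helper names convert_usm_type and clone_into are hypothetical, and the guarded usage assumes dpctl is installed and a SYCL device is available.

```python
try:
    import dpctl.memory as dpmem  # requires dpctl and a SYCL runtime
except ImportError:
    dpmem = None


def convert_usm_type(target_cls, usm_obj):
    """Build a target_cls instance from usm_obj, copying if USM types differ."""
    return target_cls(usm_obj, copy=True)


def clone_into(target_cls, src):
    """Allocate a new target_cls buffer of the same size and copy src into it."""
    dst = target_cls(src.nbytes)
    dst.copy_from_device(src)  # USM memory -> USM memory
    return dst


if dpmem is not None:
    try:
        shared = dpmem.MemoryUSMShared(16)
        memoryview(shared)[:] = bytes(range(16))  # fill through the buffer protocol
        device = convert_usm_type(dpmem.MemoryUSMDevice, shared)
        host = clone_into(dpmem.MemoryUSMHost, device)
        print(host.tobytes() == bytes(range(16)))
    except Exception as exc:  # for example, no SYCL device is available
        print("skipped:", exc)
```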
Comparison with Rapids Memory Manager (RMM)
RMM implements DeviceBuffer, a Cython native class that wraps a C++ class called device_buffer, which is similar to std::vector<unsigned char, custom_cuda_allocator> with an allocator that calls the resource manager.
DeviceBuffer stores a unique pointer to an instance of this class. DeviceBuffer implements __cuda_array_interface__. Direct constructors always allocate new memory and copy the provided inputs into the newly allocated array.
Zero-copy construction is possible from a unique_ptr<device_buffer>, with the ownership being moved to the Cython extension instance.
DeviceBuffer provides a __reduce__ method to support pickling (which works by copying the content of the device buffer to the host) and provides the following set of routines, among others:
- copy_to_host(host_buf_obj) to copy the content of the underlying device_buffer to a host buffer
- copy_from_host(host_buf_obj) to copy the content of the host buffer into the memory of the underlying device_buffer
- copy_from_device(cuda_ary_obj) to copy the device memory underlying cuda_ary_obj, a Python object implementing __cuda_array_interface__, to the memory underlying the DeviceBuffer instance
RMM’s methods are declared nogil.