What is oneAPI
oneAPI is an open standard for a unified application programming interface (API) that delivers a common developer experience across accelerator architectures, including multi-core CPUs, GPUs, and FPGAs.
Toolkits
An implementation of the standard is freely available through Intel(R) oneAPI Toolkits. The Intel(R) oneAPI Base Toolkit features an industry-leading C++ compiler that implements SYCL*, an evolution of C++ for heterogeneous computing. It also includes a suite of performance libraries, such as Intel(R) oneAPI Math Kernel Library (oneMKL), as well as Intel(R) Distribution for Python*.
DPC++ is an LLVM-based compiler project that implements compiler and runtime support for the SYCL* language. It is being developed in the sycl branch of the LLVM project fork github.com/intel/llvm. The project publishes daily builds for Linux.
The Intel(R) oneAPI DPC++ compiler is a proprietary product that builds on the open-source DPC++ project. It is part of the Intel(R) compiler suite, which has completed the adoption of LLVM infrastructure and is available in oneAPI toolkits. In particular, the Intel(R) Fortran compiler is freely available on all supported platforms in the Intel(R) oneAPI HPC Toolkit.
DPC++ leverages standard toolchain runtime libraries, such as glibc and libstdc++ on Linux and wincrt on Windows. This makes it possible to use Intel C/C++ compilers, including DPC++, to compile Python native extensions compatible with CPython and the rest of the Python stack.
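Compatibility of a native extension depends on the C runtime and ABI the interpreter itself was built against. As a minimal sketch using only the Python standard library, the following reports those details for the running interpreter. The helper name describe_c_runtime is hypothetical and chosen for illustration only; on platforms that do not use glibc, the libc fields may come back as "unknown".

```python
import platform
import sysconfig


def describe_c_runtime():
    """Report C toolchain details of the running Python interpreter."""
    # libc_ver() returns empty strings when it cannot identify the C library
    # (for example on Windows), so fall back to "unknown" in that case.
    libc_name, libc_version = platform.libc_ver()
    return {
        "implementation": platform.python_implementation(),
        "libc": libc_name or "unknown",
        "libc_version": libc_version or "unknown",
        # Compiler that was used to build the interpreter itself.
        "built_with": platform.python_compiler(),
        # File name suffix expected for native extension modules.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }


if __name__ == "__main__":
    for key, value in describe_c_runtime().items():
        print(f"{key}: {value}")
```

A native extension built with any compiler, DPC++ included, has to match the reported extension suffix and link against a compatible C runtime in order to be importable.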
In order to enable cross-architecture programming for CPUs and accelerators, the DPC++ runtime adopted a layered architecture. Software concepts are mapped to a hardware abstraction layer by a user-specified SYCL backend, which programs the specific hardware in use.
Compute runtime
An integral part of this layered architecture is provided by Intel(R) Compute Runtime. A oneAPI application is a fat binary consisting of device code in a standardized intermediate form, SPIR-V, and host code that orchestrates tasks such as querying the heterogeneous system it is running on, selecting accelerator(s), compiling (jitting) device code in the intermediate representation for the selected device, managing device memory, and submitting compiled device code for execution. The host code performs these tasks by using the DPC++ runtime, which maps them to the hardware abstraction layer, which in turn talks to hardware-specific drivers.
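The first of these host-side tasks, querying the system for available devices, can be sketched from Python. This example assumes the dpctl package, which is not named in the text above and is treated here as an optional dependency; the helper name list_sycl_devices is hypothetical. When dpctl cannot be imported, the helper returns an empty list rather than failing.

```python
def list_sycl_devices():
    """Return (name, backend, device_type) for each device the SYCL runtime reports.

    Returns an empty list when dpctl (an assumed, optional dependency)
    is not installed or cannot be loaded.
    """
    try:
        import dpctl  # assumption: Python bindings to the SYCL runtime
    except ImportError:
        return []
    return [
        (dev.name, str(dev.backend), str(dev.device_type))
        for dev in dpctl.get_devices()
    ]


if __name__ == "__main__":
    devices = list_sycl_devices()
    if not devices:
        print("No SYCL devices found (or dpctl is not available).")
    for name, backend, device_type in devices:
        print(f"{name}: backend={backend}, type={device_type}")
```

Each entry pairs a device with the backend that exposes it, which loosely mirrors the layering described above: the same physical device may appear more than once if several backends can program it.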
Additional information
The Data Parallel C++ book is an excellent resource for getting familiar with programming heterogeneous systems using C++ and SYCL*.
Intel(R) DevCloud hosts base training material that can be executed on a variety of Intel(R) hardware using preinstalled oneAPI toolkits.
Julia supports oneAPI through the oneAPI.jl package: github.com/JuliaGPU/oneAPI.jl.