Stepping

Stepping lets you advance through the program one source line or one machine instruction at a time.

Consider the following examples.

numba_dpex/examples/debug/simple_sum.py:

 5  import dpnp as np
 6
 7  import numba_dpex as ndpx
 8
 9
10  @ndpx.kernel(debug=True)
11  def data_parallel_sum(item, a, b, c):
12      i = item.get_id(0)
13      c[i] = a[i] + b[i]  # Condition breakpoint location
14
15
16  global_size = 10
17  N = global_size
18
19  a = np.array(np.random.random(N), dtype=np.float32)
20  b = np.array(np.random.random(N), dtype=np.float32)
21  c = np.ones_like(a)
22
23  ndpx.call_kernel(data_parallel_sum, ndpx.Range(global_size), a, b, c)
24
25  print("Done...")

An example with a nested function, numba_dpex/examples/debug/simple_dpex_func.py:

 5  import dpnp as np
 6
 7  import numba_dpex as ndpx
 8
 9
10  @ndpx.device_func(debug=True)
11  def func_sum(a_in_func, b_in_func):
12      result = a_in_func + b_in_func  # breakpoint location
13      return result
14
15
16  @ndpx.kernel(debug=True)
17  def kernel_sum(item, a_in_kernel, b_in_kernel, c_in_kernel):
18      i = item.get_id(0)
19      c_in_kernel[i] = func_sum(a_in_kernel[i], b_in_kernel[i])
20
21
22  global_size = 10
23  a = np.arange(global_size, dtype=np.float32)
24  b = np.arange(global_size, dtype=np.float32)
25  c = np.empty_like(a)
26
27  ndpx.call_kernel(kernel_sum, ndpx.Range(global_size), a, b, c)
28
29  print("Done...")
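
When stepping through kernel_sum and func_sum, it can help to know which values each work-item should produce. The following is a minimal host-side sketch in plain Python, not part of the numba_dpex examples: it uses lists instead of dpnp arrays, and the name kernel_sum_reference is hypothetical. Each loop iteration stands in for one work-item.

```python
# Plain-Python reference for kernel_sum; no numba_dpex or device is involved.
def func_sum(a_in_func, b_in_func):
    result = a_in_func + b_in_func
    return result


def kernel_sum_reference(a, b):
    # One list element per work-item index i, mirroring c_in_kernel[i].
    return [func_sum(a[i], b[i]) for i in range(len(a))]


global_size = 10
# Stand-ins for np.arange(global_size, dtype=np.float32) in the example.
a = [float(i) for i in range(global_size)]
b = [float(i) for i in range(global_size)]
c = kernel_sum_reference(a, b)
print(c)
```

Under these assumptions, the value written for work-item i is 2 * i, which is what you would expect to see for result when inspecting func_sum in the debugger.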

step

Run the debugger and use the following commands:

$ NUMBA_OPT=0 gdb-oneapi -q python
(gdb) set breakpoint pending on
(gdb) break simple_sum.py:12
(gdb) run simple_sum.py
...
Thread 2.2 hit Breakpoint 1, with SIMD lanes [0-7], __main__::data_parallel_sum () at simple_sum.py:12
12          i = item.get_id(0)
(gdb) step
[Switching to Thread 1.1073742080 lane 0]
Thread 2.3 hit Breakpoint 1, with SIMD lanes [0-1], __main__::data_parallel_sum () at simple_sum.py:12
12          i = item.get_id(0)
(gdb) step
13          c[i] = a[i] + b[i]  # Condition breakpoint location
(gdb) continue
...
Done...

You can also use step to enter a nested function, as in the following session:

$ NUMBA_OPT=0 gdb-oneapi -q python
(gdb) set breakpoint pending on
(gdb) break simple_dpex_func.py:19
(gdb) run simple_dpex_func.py
...
Thread 2.2 hit Breakpoint 1, with SIMD lanes [0-7], __main__::kernel_sum () at simple_dpex_func.py:19
19          c_in_kernel[i] = func_sum(a_in_kernel[i], b_in_kernel[i])
(gdb) step
[Switching to Thread 1.1073742080 lane 0]
Thread 2.3 hit Breakpoint 1, with SIMD lanes [0-1], __main__::kernel_sum () at simple_dpex_func.py:19
19          c_in_kernel[i] = func_sum(a_in_kernel[i], b_in_kernel[i])
(gdb) step
__main__::func_sum () at simple_dpex_func.py:12
12          result = a_in_func + b_in_func  # breakpoint location
(gdb) continue
...
Done...

stepi

The stepi command advances by one machine instruction. To print the instruction that is about to be executed, you can additionally run x/i $pc.

$ NUMBA_OPT=0 gdb-oneapi -q python
(gdb) set breakpoint pending on
(gdb) break simple_dpex_func.py:19
(gdb) run simple_dpex_func.py
...
Thread 2.2 hit Breakpoint 1, with SIMD lanes [0-7], __main__::kernel_sum () at simple_dpex_func.py:19
19          c_in_kernel[i] = func_sum(a_in_kernel[i], b_in_kernel[i])
(gdb) stepi
0x00000000fffeb630      19          c_in_kernel[i] = func_sum(a_in_kernel[i], b_in_kernel[i])
(gdb) stepi
[Switching to Thread 1.1073742080 lane 0]
Thread 2.3 hit Breakpoint 1, with SIMD lanes [0-1], __main__::kernel_sum () at simple_dpex_func.py:19
19          c_in_kernel[i] = func_sum(a_in_kernel[i], b_in_kernel[i])
(gdb) continue
...
Done...

next

The next command behaves like step, but it steps over nested function calls instead of entering them.

$ NUMBA_OPT=0 gdb-oneapi -q python
(gdb) set breakpoint pending on
(gdb) break simple_dpex_func.py:19
(gdb) run simple_dpex_func.py
...
Thread 2.2 hit Breakpoint 1, with SIMD lanes [0-7], __main__::kernel_sum () at simple_dpex_func.py:19
19          c_in_kernel[i] = func_sum(a_in_kernel[i], b_in_kernel[i])
(gdb) next
[Switching to Thread 1.1073742080 lane 0]
Thread 2.3 hit Breakpoint 1, with SIMD lanes [0-1], __main__::kernel_sum () at simple_dpex_func.py:19
19          c_in_kernel[i] = func_sum(a_in_kernel[i], b_in_kernel[i])
(gdb) next
...
Done...
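
The difference between entering a nested call and stepping over it can also be illustrated outside the debugger. The sketch below is only an analogy in plain Python, not a description of how gdb-oneapi works internally: it uses the standard sys.settrace hook to record which function each executed line belongs to. The names trace_lines and kernel_body are hypothetical and do not exist in numba_dpex or gdb-oneapi.

```python
import sys


def func_sum(a_in_func, b_in_func):
    result = a_in_func + b_in_func
    return result


def kernel_body(a, b):
    c = func_sum(a, b)
    return c


def trace_lines(target, args, step_into):
    """Run target(*args) and return the function name of every traced line.

    step_into=True reports lines in nested calls (like step);
    step_into=False reports only lines of target itself (like next).
    """
    visited = []

    def tracer(frame, event, arg):
        if event == "call":
            nested = frame.f_code is not target.__code__
            if nested and not step_into:
                return None  # do not trace lines inside the nested frame
            return tracer
        if event == "line":
            visited.append(frame.f_code.co_name)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        target(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return visited


print(trace_lines(kernel_body, (1.0, 2.0), step_into=True))
# ['kernel_body', 'func_sum', 'func_sum', 'kernel_body']
print(trace_lines(kernel_body, (1.0, 2.0), step_into=False))
# ['kernel_body', 'kernel_body']
```

In both modes the nested function still runs; the only difference is whether its lines are reported as stops.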

set scheduler-locking step

In the sessions above, the breakpoint on the kernel line is reported twice. This happens because the kernel runs on multiple threads, and each thread that reaches the breakpoint raises its own event. This is the default behavior, but you can change it to make stepping more predictable. To ensure that only the current thread runs while you step over a single line, set scheduler-locking to on or step:

$ NUMBA_OPT=0 gdb-oneapi -q python
(gdb) set breakpoint pending on
(gdb) break simple_dpex_func.py:19
(gdb) run simple_dpex_func.py
...
Thread 2.2 hit Breakpoint 1, with SIMD lanes [0-7], __main__::kernel_sum () at simple_dpex_func.py:19
19          c_in_kernel[i] = func_sum(a_in_kernel[i], b_in_kernel[i])
(gdb) set scheduler-locking step
(gdb) step
__main__::func_sum () at simple_dpex_func.py:12
12          result = a_in_func + b_in_func  # breakpoint location
(gdb) step
13          return result
(gdb) continue
...
[Switching to Thread 1.1073742080 lane 0]
Thread 2.3 hit Breakpoint 1, with SIMD lanes [0-1], __main__::kernel_sum () at simple_dpex_func.py:19
19          c_in_kernel[i] = func_sum(a_in_kernel[i], b_in_kernel[i])
(gdb) continue
...
Done...
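
The two breakpoint reports in these sessions, one with SIMD lanes [0-7] and one with lanes [0-1], are consistent with the 10 work-items being split across two threads of 8 lanes each. The following plain-Python sketch models that split. The SIMD width of 8 is an assumption read off the session output, the function name split_into_simd_threads is hypothetical, and the actual mapping of work-items to threads depends on the device and the compiler.

```python
def split_into_simd_threads(global_size, simd_width):
    """Return the (first_lane, last_lane) range that is active in each thread."""
    lane_ranges = []
    remaining = global_size
    while remaining > 0:
        # The last thread may be only partially filled.
        active = min(simd_width, remaining)
        lane_ranges.append((0, active - 1))
        remaining -= active
    return lane_ranges


# global_size = 10 as in the examples; simd_width = 8 is assumed.
print(split_into_simd_threads(10, 8))  # [(0, 7), (0, 1)]
```

Under this model each thread reaches the breakpoint independently, which is why the same source line is reported once per thread unless scheduler-locking restricts execution to the current thread.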

See also: