Debugging Local Variables

Two factors affect how local variables appear in the debugger:

  1. Optimization Level for LLVM

  2. Local Variables Lifetime in Numba IR

Optimization Level for LLVM

Numba provides the environment variable NUMBA_OPT for configuring the LLVM optimization level. The default level is 3; refer to the Numba documentation for details. Debugging with NUMBA_OPT=0 is recommended. The optimization level affects local variables as follows:

  • NUMBA_OPT=0 (no optimization): all local variables are available.

  • NUMBA_OPT=1 or higher: some variables may be optimized out.
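
As an alternative to setting the variable in the shell or in gdb, it can be set from Python itself. The following is a minimal sketch, assuming that Numba reads NUMBA_OPT from the environment when it is first imported; set_numba_opt is a hypothetical helper, not part of Numba or numba-dpex, and it only accepts the integer levels 0 to 3.

```python
import os
import sys
import warnings


def set_numba_opt(level: int = 0) -> str:
    """Set NUMBA_OPT for the current process (hypothetical helper).

    Assumption: Numba picks the value up at import time, so this must run
    before the first `import numba` or `import numba_dpex`.
    """
    if not 0 <= level <= 3:
        raise ValueError("this sketch only accepts levels 0 to 3")
    if "numba" in sys.modules:
        # Too late for the import-time read; warn rather than fail.
        warnings.warn("numba is already imported; NUMBA_OPT may be ignored")
    os.environ["NUMBA_OPT"] = str(level)
    return os.environ["NUMBA_OPT"]


set_numba_opt(0)
# import numba_dpex as ndpx  # import only after the variable is set
```

Placing the call at the very top of the script keeps the setting next to the kernel being debugged, at the cost of having to remove it again for production runs.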

Example

Source code numba_dpex/examples/debug/sum_local_vars.py:

@ndpx.kernel(debug=True)
def data_parallel_sum(item, a, b, c):
    i = item.get_id(0)
    l1 = a[i] + 2.5
    l2 = b[i] * 0.3
    c[i] = l1 + l2

Debug session with NUMBA_OPT=0:

$ gdb-oneapi -q python
...
(gdb) set environment NUMBA_OPT 0
(gdb) break sum_local_vars.py:26
...
(gdb) run numba_dpex/examples/debug/sum_local_vars.py
...
Thread 2.1 hit Breakpoint 1, with SIMD lanes [0-7], __main__::data_parallel_sum (a=..., b=..., c=...) at sum_local_vars.py:26
26          c[i] = l1 + l2
(gdb) info locals
i = 0
l1 = 2.9795852899551392
l2 = 0.22986688613891601

All local variables are printed with their values.

Debug session with NUMBA_OPT=1:

$ gdb-oneapi -q python
...
(gdb) set environment NUMBA_OPT 1
(gdb) break sum_local_vars.py:26
...
(gdb) run numba_dpex/examples/debug/sum_local_vars.py
...
Thread 2.1 hit Breakpoint 1, with SIMD lanes [0-7], ?? () at sum_local_vars.py:26 from /tmp/kernel_11059955544143858990_e6df1e.dbgelf
26          c[i] = l1 + l2
(gdb) info locals
No locals.

At this optimization level, the local variables i, l1 and l2 are optimized out.

Local Variables Lifetime in Numba IR

The lifetime of a Python variable differs from the lifetime of the corresponding variable in compiled code. Numba analyzes variable lifetimes and tries to optimize them. The debugger can still show a variable's value, but the value may read as zero once the variable has been explicitly deleted at the end of its scope.

See Numba variable policy.

Numba provides the environment variable NUMBA_EXTEND_VARIABLE_LIFETIMES, which extends the lifetime of variables to the end of the block in which their lifetime ends. See the Numba documentation for details. The default is 0.

Debugging with NUMBA_EXTEND_VARIABLE_LIFETIMES=1 is recommended.
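
The two recommended settings, NUMBA_OPT=0 and NUMBA_EXTEND_VARIABLE_LIFETIMES=1, can also be handed to the debugger through its environment, since the debugged Python process inherits it. The sketch below is one possible way to do that; debug_environment and launch_debugger are hypothetical helper names, and the command line is the same gdb-oneapi -q python invocation used in the sessions on this page.

```python
import os
import subprocess


def debug_environment(base=None):
    """Return a copy of an environment with the recommended debug settings.

    `base` defaults to the current process environment and is never mutated.
    """
    env = dict(os.environ if base is None else base)
    env["NUMBA_OPT"] = "0"
    env["NUMBA_EXTEND_VARIABLE_LIFETIMES"] = "1"
    return env


def launch_debugger():
    """Start an interactive gdb-oneapi session (requires gdb-oneapi on PATH)."""
    return subprocess.run(["gdb-oneapi", "-q", "python"], env=debug_environment())
```

Calling launch_debugger() then drops into the usual (gdb) prompt without any set environment commands.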

Example 1 - Using NUMBA_EXTEND_VARIABLE_LIFETIMES

Source code numba_dpex/tests/debugging/test_info.py:

def common_loop_body(param_a, param_b):
    param_c = param_a + numba.float32(10)  # Set breakpoint here
    param_d = param_b * numba.float32(0.5)
    result = param_c + param_d
    return result

Debug session with NUMBA_EXTEND_VARIABLE_LIFETIMES=1:

$ gdb-oneapi -q python
...
(gdb) set environment NUMBA_EXTEND_VARIABLE_LIFETIMES 1
(gdb) break side-by-side.py:28
...
(gdb) run numba_dpex/examples/debug/side-by-side.py --api=numba-dpex-kernel
...
Thread 2.1 hit Breakpoint 1, with SIMD lanes [0-7], __main__::common_loop_body (param_a=0, param_b=0) at side-by-side.py:28
28          return result
(gdb) info locals
param_c = 10
param_d = 0
result = 10

Both param_c and param_d are shown with their values.

Debug session with NUMBA_EXTEND_VARIABLE_LIFETIMES=0:

$ gdb-oneapi -q python
...
(gdb) set environment NUMBA_EXTEND_VARIABLE_LIFETIMES 0
(gdb) break side-by-side.py:28
...
(gdb) run numba_dpex/examples/debug/side-by-side.py --api=numba-dpex-kernel
...
Thread 2.1 hit Breakpoint 1, with SIMD lanes [0-7], __main__::common_loop_body (param_a=0, param_b=0) at side-by-side.py:28
28          return result
(gdb) info locals
param_c = 0
param_d = 0
result = 10

With the default setting, param_c reads 0 instead of 10 because the variable had already been deleted when the breakpoint was hit.

Example 2 - Using NUMBA_DUMP_ANNOTATION

Source code numba_dpex/examples/debug/sum_local_vars.py:

@ndpx.kernel(debug=True)
def data_parallel_sum(item, a, b, c):
    i = item.get_id(0)
    l1 = a[i] + 2.5
    l2 = b[i] * 0.3
    c[i] = l1 + l2

Run this code with the environment variable NUMBA_DUMP_ANNOTATION=1 to see where Numba inserts del statements for variables.

 1-----------------------------------ANNOTATION-----------------------------------
 2# File: numba_dpex/examples/debug/sum_local_vars.py
 3# --- LINE 20 ---
 4
 5@numba_dpex.kernel(debug=True)
 6
 7# --- LINE 21 ---
 8
 9def data_parallel_sum(a, b, c):
10
11    # --- LINE 22 ---
12    # label 0
13    #   a = arg(0, name=a)  :: array(float32, 1d, C)
14    #   b = arg(1, name=b)  :: array(float32, 1d, C)
15    #   c = arg(2, name=c)  :: array(float32, 1d, C)
16    #   $2load_global.0 = global(dpex: <module 'numba_dpex' from '.../numba-dpex/numba_dpex/__init__.py'>)  :: Module(<module 'numba_dpex' from '.../numba-dpex/numba_dpex/__init__.py'>)
17    #   $4load_method.1 = getattr(value=$2load_global.0, attr=get_global_id)  :: Function(<function get_global_id at 0x7f82b8bae430>)
18    #   del $2load_global.0
19    #   $const6.2 = const(int, 0)  :: Literal[int](0)
20    #   i = call $4load_method.1($const6.2, func=$4load_method.1, args=[Var($const6.2, sum_local_vars.py:22)], kws=(), vararg=None, target=None)  :: (uint32,) -> int64
21    #   del $const6.2
22    #   del $4load_method.1
23
24    i = dpex.get_global_id(0)
25
26    # --- LINE 23 ---
27    #   $16binary_subscr.6 = getitem(value=a, index=i, fn=<built-in function getitem>)  :: float32
28    #   del a
29    #   $const18.7 = const(float, 2.5)  :: float64
30    #   l1 = $16binary_subscr.6 + $const18.7  :: float64
31    #   del $const18.7
32    #   del $16binary_subscr.6
33
34    l1 = a[i] + 2.5
35
36    # --- LINE 24 ---
37    #   $28binary_subscr.11 = getitem(value=b, index=i, fn=<built-in function getitem>)  :: float32
38    #   del b
39    #   $const30.12 = const(float, 0.3)  :: float64
40    #   l2 = $28binary_subscr.11 * $const30.12  :: float64
41    #   del $const30.12
42    #   del $28binary_subscr.11
43
44    l2 = b[i] * 0.3
45
46    # --- LINE 25 ---
47    #   $40binary_add.16 = l1 + l2  :: float64
48    #   del l2
49    #   del l1
50    #   c[i] = $40binary_add.16  :: (array(float32, 1d, C), int64, float64) -> none
51    #   del i
52    #   del c
53    #   del $40binary_add.16
54    #   $const48.19 = const(NoneType, None)  :: none
55    #   $50return_value.20 = cast(value=$const48.19)  :: none
56    #   del $const48.19
57    #   return $50return_value.20
58
59    c[i] = l1 + l2

That is, variable a is used for the last time in LINE 23, so Numba inserts del a there, as shown on line 28 of the annotated code. As a result, a shows the value 0 when you set a breakpoint at LINE 24.
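
Reading a long annotation dump by eye is error-prone, so the del statements can be collected with a short script. The following is a minimal sketch, assuming the dump uses the "# --- LINE n ---" markers and "#   del name" statements seen above; deletions_by_line is a hypothetical helper, not part of Numba, and the sample text is copied from that dump.

```python
import re
from collections import defaultdict

# Patterns taken from the annotation dump format shown on this page.
LINE_MARKER = re.compile(r"#\s*--- LINE (\d+) ---")
DEL_STMT = re.compile(r"#\s+del\s+(\S+)")


def deletions_by_line(annotation, include_temporaries=False):
    """Map each source line number to the variables deleted on that line."""
    result = defaultdict(list)
    current = None
    for raw in annotation.splitlines():
        marker = LINE_MARKER.search(raw)
        if marker:
            current = int(marker.group(1))
            continue
        stmt = DEL_STMT.search(raw)
        if stmt and current is not None:
            name = stmt.group(1)
            # Names starting with "$" are compiler temporaries.
            if include_temporaries or not name.startswith("$"):
                result[current].append(name)
    return dict(result)


sample = """\
# --- LINE 23 ---
#   $16binary_subscr.6 = getitem(value=a, index=i, fn=<built-in function getitem>)  :: float32
#   del a
#   $const18.7 = const(float, 2.5)  :: float64
#   l1 = $16binary_subscr.6 + $const18.7  :: float64
#   del $const18.7
#   del $16binary_subscr.6
# --- LINE 25 ---
#   $40binary_add.16 = l1 + l2  :: float64
#   del l2
#   del l1
"""

print(deletions_by_line(sample))  # {23: ['a'], 25: ['l2', 'l1']}
```

Any variable listed for a line is no longer reliable in the debugger once execution has moved past that line.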

As a workaround, you can extend the lifetime of a variable by using it at the end of the function, for example by passing it to a dummy function revive(). Numba then does not insert del a until the end of the function.

import dpnp as np

import numba_dpex as ndpx


@ndpx.device_func
def revive(x):
    return x


@ndpx.kernel(debug=True)
def data_parallel_sum(item, a, b, c):
    i = item.get_id(0)
    l1 = a[i] + 2.5
    l2 = b[i] * 0.3
    c[i] = l1 + l2
    revive(a)  # pass variable to dummy function


global_size = 10
N = global_size

a = np.array(np.random.random(N), dtype=np.float32)
b = np.array(np.random.random(N), dtype=np.float32)
c = np.ones_like(a)

ndpx.call_kernel(data_parallel_sum, ndpx.Range(global_size), a, b, c)

print("Done...")

The annotation for this version shows that del a now appears only after the call to revive(a):

 1-----------------------------------ANNOTATION-----------------------------------
 2# File: numba_dpex/examples/debug/sum_local_vars_revive.py
 3# --- LINE 24 ---
 4
 5@numba_dpex.kernel(debug=True)
 6
 7# --- LINE 25 ---
 8
 9def data_parallel_sum(a, b, c):
10
11    # --- LINE 26 ---
12    # label 0
13    #   a = arg(0, name=a)  :: array(float32, 1d, C)
14    #   b = arg(1, name=b)  :: array(float32, 1d, C)
15    #   c = arg(2, name=c)  :: array(float32, 1d, C)
16    #   $2load_global.0 = global(dpex: <module 'numba_dpex' from '.../numba-dpex/numba_dpex/__init__.py'>)  :: Module(<module 'numba_dpex' from '.../numba-dpex/numba_dpex/__init__.py'>)
17    #   $4load_method.1 = getattr(value=$2load_global.0, attr=get_global_id)  :: Function(<function get_global_id at 0x7fcdf7e8c4c0>)
18    #   del $2load_global.0
19    #   $const6.2 = const(int, 0)  :: Literal[int](0)
20    #   i = call $4load_method.1($const6.2, func=$4load_method.1, args=[Var($const6.2, sum_local_vars_revive.py:26)], kws=(), vararg=None, target=None)  :: (uint32,) -> int64
21    #   del $const6.2
22    #   del $4load_method.1
23
24    i = dpex.get_global_id(0)
25
26    # --- LINE 27 ---
27    #   $16binary_subscr.6 = getitem(value=a, index=i, fn=<built-in function getitem>)  :: float32
28    #   $const18.7 = const(float, 2.5)  :: float64
29    #   l1 = $16binary_subscr.6 + $const18.7  :: float64
30    #   del $const18.7
31    #   del $16binary_subscr.6
32
33    l1 = a[i] + 2.5
34
35    # --- LINE 28 ---
36    #   $28binary_subscr.11 = getitem(value=b, index=i, fn=<built-in function getitem>)  :: float32
37    #   del b
38    #   $const30.12 = const(float, 0.3)  :: float64
39    #   l2 = $28binary_subscr.11 * $const30.12  :: float64
40    #   del $const30.12
41    #   del $28binary_subscr.11
42
43    l2 = b[i] * 0.3
44
45    # --- LINE 29 ---
46    #   $40binary_add.16 = l1 + l2  :: float64
47    #   del l2
48    #   del l1
49    #   c[i] = $40binary_add.16  :: (array(float32, 1d, C), int64, float64) -> none
50    #   del i
51    #   del c
52    #   del $40binary_add.16
53
54    c[i] = l1 + l2
55
56    # --- LINE 30 ---
57    #   $48load_global.19 = global(revive: <numba_dpex.compiler.DpexFunctionTemplate object at 0x7fce12e5cc40>)  :: Function(<numba_dpex.compiler.DpexFunctionTemplate object at 0x7fce12e5cc40>)
58    #   $52call_function.21 = call $48load_global.19(a, func=$48load_global.19, args=[Var(a, sum_local_vars_revive.py:26)], kws=(), vararg=None, target=None)  :: (array(float32, 1d, C),) -> array(float32, 1d, C)
59    #   del a
60    #   del $52call_function.21
61    #   del $48load_global.19
62    #   $const56.22 = const(NoneType, None)  :: none
63    #   $58return_value.23 = cast(value=$const56.22)  :: none
64    #   del $const56.22
65    #   return $58return_value.23
66
67    revive(a)  # pass variable to dummy function

Run with the environment variables NUMBA_DUMP_ANNOTATION=1 and NUMBA_EXTEND_VARIABLE_LIFETIMES=1 to see that Numba inserts the del statements at the end of the block:

 1-----------------------------------ANNOTATION-----------------------------------
 2# File: numba_dpex/examples/debug/sum_local_vars.py
 3...
 4def data_parallel_sum(a, b, c):
 5    ...
 6    # --- LINE 26 ---
 7    #   $40binary_add.16 = l1 + l2  :: float64
 8    #   c[i] = $40binary_add.16  :: (array(float32, 1d, C), int64, float64) -> none
 9    #   $const48.19 = const(NoneType, None)  :: none
10    #   $50return_value.20 = cast(value=$const48.19)  :: none
11    #   del $2load_global.0
12    #   del $const6.2
13    #   del $4load_method.1
14    #   del a
15    #   del $const18.7
16    #   del $16binary_subscr.6
17    #   del b
18    #   del $const30.12
19    #   del $28binary_subscr.11
20    #   del l2
21    #   del l1
22    #   del i
23    #   del c
24    #   del $40binary_add.16
25    #   del $const48.19
26    #   return $50return_value.20
27
28    c[i] = l1 + l2

Example 3 - Using info locals

Source code sum_local_vars.py:

import dpnp as np

import numba_dpex as ndpx


@ndpx.kernel(debug=True)
def data_parallel_sum(item, a, b, c):
    i = item.get_id(0)
    l1 = a[i] + 2.5
    l2 = b[i] * 0.3
    c[i] = l1 + l2


global_size = 10
N = global_size

a = np.array(np.random.random(N), dtype=np.float32)
b = np.array(np.random.random(N), dtype=np.float32)
c = np.ones_like(a)

ndpx.call_kernel(data_parallel_sum, ndpx.Range(global_size), a, b, c)

print("Done...")

Run the debugger with NUMBA_OPT=0:

$ NUMBA_OPT=0 gdb-oneapi -q python
(gdb) set breakpoint pending on
(gdb) break sum_local_vars.py:22
(gdb) run sum_local_vars.py
...
Thread 2.2 hit Breakpoint 1, with SIMD lanes [0-7], __main__::data_parallel_sum () at sum_local_vars.py:22

Run the info locals command. Sample output at the “no optimization” level (NUMBA_OPT=0) is as follows:

(gdb) info locals
a = '\000' <repeats 55 times>
b = '\000' <repeats 55 times>
c = '\000' <repeats 55 times>
i = 0
l1 = 0
l2 = 0
__ocl_dbg_gid0 = 0
__ocl_dbg_gid1 = 0
__ocl_dbg_gid2 = 0
__ocl_dbg_lid0 = 0
__ocl_dbg_lid1 = 0
__ocl_dbg_lid2 = 0
__ocl_dbg_grid0 = 0
__ocl_dbg_grid1 = 0
__ocl_dbg_grid2 = 0
(gdb) next
Thread 2.3 hit Breakpoint 1, with SIMD lanes [0-1], __main__::data_parallel_sum () at sum_local_vars.py:22
22          i = dpex.get_global_id(0)
(gdb) next
23          l1 = a[i] + 2.5
(gdb) next
24          l2 = b[i] * 0.3
(gdb) info locals
a = '\000' <repeats 55 times>
b = '\000' <repeats 16 times>, "\n\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000\240\016XUU\000\000\n\000\000\000\000\000\000\000\004\000\000\000\000\000\000"
c = '\000' <repeats 16 times>, "\n\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000@\256WUU\000\000\n\000\000\000\000\000\000\000\004\000\000\000\000\000\000"
i = 8
l1 = 2.5931931659579277
l2 = 0
__ocl_dbg_gid0 = 0
__ocl_dbg_gid1 = 0
__ocl_dbg_gid2 = 0
__ocl_dbg_lid0 = 42949672970
__ocl_dbg_lid1 = 0
__ocl_dbg_lid2 = 93825037590528
__ocl_dbg_grid0 = 4612811918334230528
__ocl_dbg_grid1 = 0
__ocl_dbg_grid2 = 0
(gdb) next
25          c[i] = l1 + l2

Until the line that assigns l1 (line 23) has executed, the debugger reports its value as 0. The true value of l1 appears after stepping past that line, to line 24.

(gdb) info locals
a = '\000' <repeats 55 times>
b = '\000' <repeats 55 times>
c = '\000' <repeats 16 times>, "\n\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\000@\256WUU\000\000\n\000\000\000\000\000\000\000\004\000\000\000\000\000\000"
i = 8
l1 = 2.5931931659579277
l2 = 0.22954882979393004
__ocl_dbg_gid0 = 0
__ocl_dbg_gid1 = 8
__ocl_dbg_gid2 = 8
__ocl_dbg_lid0 = 93825034429928
__ocl_dbg_lid1 = 0
__ocl_dbg_lid2 = 93825034429936
__ocl_dbg_grid0 = 4599075939470750515
__ocl_dbg_grid1 = 0
__ocl_dbg_grid2 = 0
(gdb) print a
$1 = '\000' <repeats 55 times>

When the debugger reaches the last line of the kernel, the info locals command returns all the local variables with their values.