This is a short writeup/tutorial on low-level debugging a Pixhawk with Ardupilot. It’s based on an issue discussed over at Build Library for Pixhawk 4 - serial outputs can't be displayed, where one of the code examples (UART_test) would cause a system crash of the flight controller.
The tutorial should work with any STM32-based flight controller that has a JTAG port (i.e. Pixhawk).
Hardware Setup
Required Hardware:
- Flight controller with JTAG port
- ST-Link v2 with 10-pin adapter
The Flight controller may require a small plug to solder on to the JTAG port.
Plug the flight controller and ST-Link to each other via the JTAG port (noting that the little notch on the JTAG connector points towards the SD card on the Pixhawk). Plug both devices’ USB cables to your computer.
Software setup
A Linux system is recommended. Windows may be possible, but I’ve not tested that.
It is assumed that the arm-gcc compiler has already been installed. If not, see http://ardupilot.org/dev/docs/building-setup-linux.html
OpenOCD is to be installed next via this command:
sudo apt-get install libusb-1.0-0-dev libusb-1.0-0 openocd
OpenOCD (with the ST-Link) is the bridge between GDB and the STM32 microcontroller on the Pixhawk.
Debugging UART_test
For this example, I used the UART_test example (as of 13/11/2019). This example produced the system crash described below. This code is in ./ardupilot/libraries/AP_HAL/examples/UART_test
Build and upload the debug variant of the UART_test example (noting that the --board
may be different if you’re using a different board)
./waf configure --board=Pixhawk1-1M --debug --enable-asserts
./waf --target=examples/UART_test --upload
Before OpenOCD and GDB are run, their configuration files need to be copied to the build folder. Note that the build
folder’s name is the same at the board’s name in the waf configuration above.
- Go to the
./ardupilot/Tools/debug
folder - Copy
openocd.cfg
to./ardupilot/build/Pixhawk1-1M/bin
- Copy
.gdbinit
to./ardupilot/build/Pixhawk1-1M/examples
- Edit
~/.gdbinit
to have the following text:set auto-load safe-path /
Next open up two terminals to run openOCD and GDB.
cd ./ardupilot/build/Pixhawk1-1M/bin
openocd
cd ./ardupilot/build/Pixhawk1-1M/examples
arm-none-eabi-gdb ./UART_test
The details of the system crash will be shown in GDB:
0x0800542e in HardFault_Handler ()
at ../../libraries/AP_HAL_ChibiOS/system.cpp:94
94 save_fault_watchdog(__LINE__, faultType, faultAddress);
Breakpoint 1 at 0x800549c: file ../../libraries/AP_HAL_ChibiOS/system.cpp, line 172.
Breakpoint 2 at 0x8005404: file ../../libraries/AP_HAL_ChibiOS/system.cpp, line 69.
Breakpoint 3 at 0x801d4cc: file ../../modules/ChibiOS/os/rt/src/chsys.c, line 19
Now, that’s just the watchdog handler. Let’s check the threads via the info threads
command in GDB:
9 Thread 536952680 (Name: IOMCU, State: CURRENT) 0x080076fe in HardFault_Handler () at ../../libraries/AP_HAL_ChibiOS/system.cpp:94
So there is something in the IOMCU thread that is causing the error.
Use the i loc
command to view the code that, when executed, caused the crash:
lr_thd = 0x800bae9 <AP_Logger::WriteV(char const*, char const*, char const*, char const*, char const*, std::__va_list, bool)+28>,
pc = 0x800ba5c <AP_Logger::msg_fmt_for_name(char const*, char const*, char const*, char const*, char const*)+8>,
The IOMCU thread is trying to write to the logfile (AP_Logger) and encountering an error.
Looking back at the UART_test code, note that AP_Logger isn’t initialised anywhere. That’s why the system is crashing when trying to refer to a (non-existent) AP_Logger.
So, IOMCU assumes that there’s a valid AP_Logger. That’s fine for a full vehicle, but not the UART_test example.
We will edit IOMCU to check if there’s a valid logger first. Make the following changes to AP_IOMCU.cpp
to check for AP_Logger befure trying to write:
if (now - last_log_ms >= 1000U) {
last_log_ms = now;
if (AP_Logger::get_singleton()) {
AP::logger().Write("IOMC", "TimeUS,Mem,TS,NPkt,Nerr,Nerr2,NDel", "QHIIIII",
AP_HAL::micros64(),
reg_status.freemem,
reg_status.timestamp_ms,
reg_status.total_pkts,
total_errors,
reg_status.num_errors,
num_delayed);
}
So we upload the changed firmware, debug again and see what happens…
Reading symbols from ./UART_test...done.
0x0800542e in HardFault_Handler ()
at ../../libraries/AP_HAL_ChibiOS/system.cpp:94
94 save_fault_watchdog(__LINE__, faultType, faultAddress);
Breakpoint 1 at 0x800549c: file ../../libraries/AP_HAL_ChibiOS/system.cpp, line 172.
Breakpoint 2 at 0x8005404: file ../../libraries/AP_HAL_ChibiOS/system.cpp, line 69.
Breakpoint 3 at 0x801d4d8: file ../../modules/ChibiOS/os/rt/src/chsys.c, line 198.
(gdb) i loc
ctx = {
r0 = 0x0,
r1 = 0x802c928,
r2 = 0x802c8e4,
r3 = 0x0,
r12 = 0x0,
lr_thd = 0x8008b39 <AP_Logger::WriteV(char const*, char const*, char const*, char const*, char const*, std::__va_list, bool)+28>,
pc = 0x8008aac <AP_Logger::msg_fmt_for_name(char const*, char const*, char const*, char const*, char const*)+8>,
xpsr = 0x21000000,
s0 = 0x55555555,
s1 = 0x55555555,
s2 = 0x0,
s3 = 0x55555555,
s4 = 0x55555555,
s5 = 0x55555555,
s6 = 0x55555555,
s7 = 0x55555555,
s8 = 0x55555555,
s9 = 0x55555555,
s10 = 0x55555555,
s11 = 0x55555555,
s12 = 0x55555555,
s13 = 0x55555555,
s14 = 0x55555555,
s15 = 0x55555555,
fpscr = 0x55555555,
reserved = 0x55555555
}
faultAddress = <optimized out>
(gdb) info threads
[New Thread 536923896]
[New Thread 536939280]
[New Thread 536906352]
[New Thread 536912640]
[New Thread 536907424]
[New Thread 536905280]
[New Thread 536910032]
Id Target Id Frame
2 Thread 536923896 (Name: idle, State: READY) 0x080051a4 in _port_switch_from_isr ()
3 Thread 536939280 (Name: apm_uart, State: WTOREVT) chVTIsArmedI (
vtp=0x20010ac4) at ../../modules/ChibiOS/os/rt/include/chvt.h:243
4 Thread 536906352 (Name: apm_monitor, State: CURRENT) 0x0800542e in HardFault_Handler () at ../../libraries/AP_HAL_ChibiOS/system.cpp:94
5 Thread 536912640 (Name: apm_timer, State: SLEEPING) chVTIsArmedI (
vtp=0x2000a2c4 <_timer_thread_wa+2452>)
at ../../modules/ChibiOS/os/rt/include/chvt.h:243
6 Thread 536907424 (Name: apm_rcin, State: SLEEPING) chVTIsArmedI (
vtp=0x20008e64 <_rcin_thread_wa+916>)
at ../../modules/ChibiOS/os/rt/include/chvt.h:243
7 Thread 536905280 (Name: apm_io, State: SLEEPING) chVTIsArmedI (
vtp=0x200085fc <_io_thread_wa+2444>)
at ../../modules/ChibiOS/os/rt/include/chvt.h:243
8 Thread 536910032 (Name: apm_storage, State: SLEEPING) chVTIsArmedI (
vtp=0x20009894 <_storage_thread_wa+2452>)
at ../../modules/ChibiOS/os/rt/include/chvt.h:243
Another unchecked AP_Logger reference! This time in the apm_monitor
thread.
We will make a similar edit to check if there’s a valid logger first. Make the following changes to ./libraries/AP_HAL_Chibios/Scheduler.cpp
:
if (AP_Logger::get_singleton()) {
AP::logger().Write("MON", "TimeUS,LDelay,Task,IErr,IErrCnt,MavMsg,MavCmd,SemLine,SPICnt,I2CCnt", "QIbIIHHHII",
AP_HAL::micros64(),
loop_delay,
pd.scheduler_task,
pd.internal_errors,
pd.internal_error_count,
pd.last_mavlink_msgid,
pd.last_mavlink_cmd,
pd.semaphore_line,
pd.spi_count,
pd.i2c_count);
}
So we upload the changed firmware, debug again and see what happens…
Reading symbols from ./UART_test...done.
_idle_thread (p=0x0) at ../../modules/ChibiOS/os/rt/src/chsys.c:72
72 static void _idle_thread(void *p) {
Breakpoint 1 at 0x800549c: file ../../libraries/AP_HAL_ChibiOS/system.cpp, line 172.
Breakpoint 2 at 0x8005404: file ../../libraries/AP_HAL_ChibiOS/system.cpp, line 69.
Breakpoint 3 at 0x801d4e4: file ../../modules/ChibiOS/os/rt/src/chsys.c, line 198.
(gdb) i loc
No locals.
(gdb) info threads
[New Thread 536923896]
[New Thread 536939280]
[New Thread 536906352]
[New Thread 536912640]
[New Thread 536907424]
[New Thread 536905280]
[New Thread 536910032]
Id Target Id Frame
2 Thread 536923896 (Name: idle, State: CURRENT) _idle_thread (p=0x0)
at ../../modules/ChibiOS/os/rt/src/chsys.c:72
3 Thread 536939280 (Name: apm_uart, State: WTOREVT) chVTIsArmedI (
vtp=0x20010ac4) at ../../modules/ChibiOS/os/rt/include/chvt.h:243
4 Thread 536906352 (Name: apm_monitor, State: SLEEPING) chVTIsArmedI (
vtp=0x200089cc <_monitor_thread_wa+812>)
at ../../modules/ChibiOS/os/rt/include/chvt.h:243
5 Thread 536912640 (Name: apm_timer, State: SLEEPING) chVTIsArmedI (
vtp=0x2000a2c4 <_timer_thread_wa+2452>)
at ../../modules/ChibiOS/os/rt/include/chvt.h:243
6 Thread 536907424 (Name: apm_rcin, State: SLEEPING) chVTIsArmedI (
vtp=0x20008e64 <_rcin_thread_wa+916>)
at ../../modules/ChibiOS/os/rt/include/chvt.h:243
7 Thread 536905280 (Name: apm_io, State: SLEEPING) chVTIsArmedI (
vtp=0x200085fc <_io_thread_wa+2444>)
at ../../modules/ChibiOS/os/rt/include/chvt.h:243
8 Thread 536910032 (Name: apm_storage, State: SLEEPING) chVTIsArmedI (
vtp=0x20009894 <_storage_thread_wa+2452>)
at ../../modules/ChibiOS/os/rt/include/chvt.h:243
No crashes this time, so this UART_test should now run fine.
Finally, build and upload a non-debug version of UART_test:
./waf configure --board=Pixhawk1-1M
./waf --target=examples/UART_test --upload
And the expected output:
So, with the help of low-level debugging, UART_test is fixed.