On Demand `main_loop_stuck` with MAVLink Serial Control Packet

Summary: MAVLink serial control packet with large timeout for device with _rx_initialised set to false can lead to main_loop_stuck. Is this mismatched expectation with stream->available() inside while loop?

Details:

// libraries/GCS_MAVLink/GCS_serial_control.cpp
more_data:
    // sleep for the timeout
    while (packet.timeout != 0 && 
           stream->available() < (int16_t)sizeof(packet.data)) {
        hal.scheduler->delay(1);
        packet.timeout--;
    }
  • while loop continuously checks stream->available() < (int16_t)sizeof(packet.data). stream->available() will always return 0 when device has _rx_initialised set to false (refer here). Therefore, main_loop is stuck for specified timeout even for devices that will not respond.

  • Relevant fields of MAVLink packet:

MAVLink Protocol (98)
    Header
        Message id: SERIAL_CONTROL (126)
    Payload: SERIAL_CONTROL (126)
        device (SERIAL_CONTROL_DEV): SERIAL_CONTROL_DEV_TELEM2 (1)
        flags (SERIAL_CONTROL_FLAG): 2
            .... ..1. SERIAL_CONTROL_FLAG_RESPOND: true
        timeout (uint16_t): 43690

Firmware Version:

  • ArduCopter v4.4.0 (commit 502702d)

Logs:

  • WatchDog reset
AP: WDG: T39 SL0 FL0 FT0 FA0 FTP0 FLR0 FICSR0 MM126 MC0 IE32768 IEC14 TN:
  • Backtrace
(gdb) bt
#0  GCS_MAVLINK::handle_serial_control (this=0x2001afc0, msg=...) at ../../libraries/GCS_MAVLink/GCS_serial_control.cpp:149
#1  0x0806cb54 in GCS_MAVLINK::handle_common_message (this=this@entry=0x2001afc0, msg=...) at ../../libraries/GCS_MAVLink/GCS_Common.cpp:3881
#2  0x0801511e in GCS_MAVLINK_Copter::handleMessage (this=0x2001afc0, msg=...) at ../../ArduCopter/GCS_Mavlink.cpp:1421
#3  0x0806b56e in GCS_MAVLINK::packetReceived (this=0x2001afc0, status=..., msg=...) at ../../libraries/GCS_MAVLink/GCS_Common.cpp:1662
#4  0x08013e6a in GCS_MAVLINK_Copter::packetReceived (this=<optimized out>, status=..., msg=...) at ../../ArduCopter/GCS_Mavlink.cpp:605
#5  0x0806a786 in GCS_MAVLINK::update_receive (this=0x2001afc0, max_time_us=max_time_us@entry=1000) at ../../libraries/GCS_MAVLink/GCS_Common.cpp:1713
#6  0x0806d30a in GCS::update_receive (this=0x20004d50 <copter+10416>) at ../../libraries/GCS_MAVLink/GCS_Common.cpp:2372
#7  0x08012876 in Functor<void>::method_wrapper<GCS, &GCS::update_receive> (obj=<optimized out>) at ../../libraries/AP_HAL/utility/functor.h:88
#8  0x0805d396 in Functor<void>::operator() (this=0x80c1028 <Copter::scheduler_tasks+740>) at ../../libraries/AP_HAL/utility/functor.h:52
#9  AP_Scheduler::run (this=this@entry=0x200024f8 <copter+88>, time_available=2048) at ../../libraries/AP_Scheduler/AP_Scheduler.cpp:255
#10 0x0805d8a6 in AP_Scheduler::loop (this=this@entry=0x200024f8 <copter+88>) at ../../libraries/AP_Scheduler/AP_Scheduler.cpp:383
#11 0x0805f8b4 in AP_Vehicle::loop (this=0x200024a0 <copter>) at ../../libraries/AP_Vehicle/AP_Vehicle.cpp:308
#12 0x0808f720 in main_loop () at ../../libraries/AP_HAL_ChibiOS/HAL_ChibiOS_Class.cpp:289
#13 0x0808f760 in HAL_ChibiOS::run (this=<optimized out>, argc=<optimized out>, argv=<optimized out>, callbacks=0x2000c7b8 <AP_HAL::get_HAL()::hal_chibios>) at ../../libraries/AP_HAL_ChibiOS/HAL_ChibiOS_Class.cpp:341
#14 0x080136e8 in main (argc=<optimized out>, argv=<optimized out>) at ../../ArduCopter/Copter.cpp:797
#15 0x08011396 in endinitloop ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)