Suggestions for diagnosing occasional Link 1 Down (RPi, Pixhawk)

A Raspberry Pi 4B and a Pixhawk 6C are connected over USB, from a USB-A port on the Pi to the USB-C port on the Pixhawk. MAVProxy runs on the RPi. Both the RPi and the Pixhawk are powered from the same DC-DC converter, which steps pack voltage (50 V) down to 5 V.

Connecting with:

mavproxy.py --state-basedir=$MAVPROXYLOGDIR --logfile=$MAVLOG --append-log --non-interactive --master=/dev/$PORT --baudrate=921600 --out=tcpin:0.0.0.0:14550 --out=tcpin:0.0.0.0:14551 --out=tcpin:0.0.0.0:14552 --out=tcpin:0.0.0.0:14553 --out=udp:127.0.0.1:14558 --default-modules=arm,mode

We generally run MAVProxy 24 hours a day. It is usually stable for hours at a time, but occasionally, perhaps every 4-12 hours, I get a Link 1 Down error. It always recovers within a few seconds, printing this to stdout:

[image: screenshot of the MAVProxy console output]

@stephendade and others, do you have any suggestions as to where to look for the root cause of this? Here is what we have checked or considered so far:

  • Lowering baud rates doesn’t eliminate this (in fact I don’t think it affects this at all)
  • This happens when the rover is sitting stock still, so we do not suspect a loose wire.
  • The autopilot is not out of memory, nor is it out of disk space (dataflash log space) nor is its load ever reported to be > 10%.
  • The RPi isn’t reporting errors in the system log (dmesg, etc.).
  • There are other USB devices on the RPi (an Intel RealSense and a USB-connected LTE modem); we could try disconnecting them to see if this problem goes away.

The “no link” message appears when MAVProxy hasn’t received data over the serial port for some period of time, so we are focusing on instrumenting activity on the serial port, /dev/ttyACM0.
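
One cheap piece of instrumentation is to check whether the /dev/ttyACM0 device node itself goes away or is recreated around the time of a dropout, which would point at a USB re-enumeration rather than a quiet but still-present port. The following is only a sketch under assumptions: the function names are made up for illustration, it polls rather than listening for kernel events (so a very fast disconnect could be missed), and it treats a changed inode number as a sign that the node was recreated, which is a heuristic and not a guarantee.

```python
import os
import time


def node_inode(path, stat=os.stat):
    """Return the inode number of the device node, or None if it is missing."""
    try:
        return stat(path).st_ino
    except FileNotFoundError:
        return None


def watch_node(path, samples, interval=0.2, stat=os.stat,
               sleep=time.sleep, clock=time.time):
    """Poll `path` `samples` times and return a list of (timestamp, event).

    Events are "disappeared", "appeared", or "replaced" (inode changed,
    assumed here to mean the node was removed and recreated between polls).
    The stat/sleep/clock arguments are injectable so the logic can be tested
    without real hardware.
    """
    events = []
    last = node_inode(path, stat)
    for _ in range(samples):
        sleep(interval)
        current = node_inode(path, stat)
        if current != last:
            if current is None:
                event = "disappeared"
            elif last is None:
                event = "appeared"
            else:
                event = "replaced"
            events.append((clock(), event))
            last = current
    return events
```

For example, watch_node("/dev/ttyACM0", samples=300) polls for about a minute; for an overnight run the sample count would need to be much larger, and the returned timestamps can then be compared against the times of the Link 1 Down messages.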

Any thoughts as to what to try or where to look would be much appreciated!

Thank you!

I think you should troubleshoot the Pi side. Perhaps there is a power management module (or similar) enabled.
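
As a starting point for the power-management idea, here is a rough sketch that lists which USB devices on the Pi are allowed to runtime-suspend. It assumes the standard Linux sysfs layout, where each device under /sys/bus/usb/devices has a power/control file containing "auto" (autosuspend allowed) or "on" (kept awake); the function names are illustrative only, and whether autosuspend is actually involved in this problem is unverified.

```python
from pathlib import Path


def usb_power_settings(sysfs_root="/sys/bus/usb/devices"):
    """Map each USB device entry to (product name, power/control value)."""
    settings = {}
    for device in sorted(Path(sysfs_root).iterdir()):
        control = device / "power" / "control"
        if not control.is_file():
            continue  # entry without a power/control file
        product_file = device / "product"
        product = product_file.read_text().strip() if product_file.is_file() else "?"
        settings[device.name] = (product, control.read_text().strip())
    return settings


def autosuspend_candidates(settings):
    """Return {entry: product} for devices whose power/control is 'auto'."""
    return {name: product
            for name, (product, mode) in settings.items()
            if mode == "auto"}
```

Calling autosuspend_candidates(usb_power_settings()) on the Pi should show whether the Pixhawk's entry is among the devices set to "auto". If it is, writing "on" to that device's power/control file, or booting with usbcore.autosuspend=-1, would be one way to test whether the dropouts change.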

Sorry I don’t have a more specific course of action to take.

I am curious as to the resolution for sure!

As I mentioned on Discourse, a tlog of the packets received by the RPi would help in diagnosing the issue.
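
Once a tlog is available, one way to locate the dropouts in it is to look for gaps between record timestamps. The sketch below assumes the usual tlog layout, where each record is an 8-byte big-endian timestamp in microseconds followed by one raw MAVLink 1 or MAVLink 2 frame. It is a minimal stdlib-only parser written for illustration, not pymavlink's reader: it raises an error on any bytes it does not recognise, and it does not distinguish received packets from any other traffic in the log, so a gap here means no logged traffic at all.

```python
import struct


def tlog_timestamps(data):
    """Yield the timestamp in seconds of each record in raw tlog bytes."""
    offset = 0
    while offset + 10 <= len(data):
        (usec,) = struct.unpack_from(">Q", data, offset)
        magic = data[offset + 8]
        payload_len = data[offset + 9]
        if magic == 0xFE:
            # MAVLink 1: 6-byte header + payload + 2-byte checksum
            frame_len = payload_len + 8
        elif magic == 0xFD:
            # MAVLink 2: 10-byte header + payload + 2-byte checksum,
            # plus a 13-byte signature if the signed flag is set
            if offset + 11 > len(data):
                break  # truncated final record
            signed = data[offset + 10] & 0x01
            frame_len = payload_len + 12 + (13 if signed else 0)
        else:
            raise ValueError(
                f"unexpected start byte 0x{magic:02x} at offset {offset + 8}")
        yield usec / 1e6
        offset += 8 + frame_len


def find_gaps(timestamps, threshold=1.0):
    """Return (start_time, duration) for each gap longer than `threshold` s."""
    gaps = []
    previous = None
    for stamp in timestamps:
        if previous is not None and stamp - previous > threshold:
            gaps.append((previous, stamp - previous))
        previous = stamp
    return gaps
```

With the file contents read as bytes, find_gaps(tlog_timestamps(data), threshold=2.0) lists the moments where nothing was logged for more than two seconds, which can be lined up with the Link 1 Down messages. If the parser trips over corrupt bytes, pymavlink's own log reader is likely the more tolerant choice.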

There aren’t many people who run that configuration for 24+ hours continuously (I usually run for 1–2 hours), so there’s not much I can offer at this time.

My gut feeling says it might be a software issue somewhere. I would suggest trying:

  1. Connect the flight controller over USB to Mission Planner (to check it’s not an issue at the flight controller end)
  2. Try using mavlink-router instead of MAVProxy and see if that helps

Thanks for the suggestions; will keep you posted.