DDS connecting but not publishing topics

Hello,

As per @rfriedman 's comment, I’m starting a new topic for my DDS issue.

To sum up my problem, I am able to install and configure everything correctly (as far that I am aware), but my flight controller (Matek H743 Slim V3) doesn’t seem to publish the topics to my companion computer (Raspberry Pi 4).

If I have the ROS2 micro_ros_agent running and I reboot my FC (either through the Mavproxy reboot command or cycling the FC’s power), the Pi receives topics from the FC for about 7 or 8 seconds (if I have ros2 topic echo /ap/time running, I can see the FC’s time being published and updating until 7-8 seconds), and then it stops publishing. It will restart publishing for 7 seconds if I reboot again. Also, if I terminate the ROS2 micro_ros_agent on my Pi, I’ll see the DDS: No ping response, disconnecting message on console, and if I restart the ROS2 micro_ros_agent before receiving DDS No ping response, exiting, then the same phenomenon will occur (the FC will publish topics, but only for 7-8 seconds.

I’ve tried running the ROS2 micro_ros_agent both with and without mavproxy running. No difference.

In the AP logs, near boot time, I have a bunch of messages DDS: Topic/Sub/Reader session pass for index 'XX' and DDS: Service/Replier session pass for index 'X' and DDS: Initialization passed.

This is what the ROS2 micro_ros_agent terminal looks like, when the flight controller is already turned on and connected:

And here it is after I reboot the FC:

And like a said, topics only get published for 7-8 seconds.

Details on my setup

  • Raspberry Pi 4, Ubuntu Desktop 22.04.5 64bit, ROS2 Humble
    • Serial connection on GPIO 14 and 15. I’ve disabled bluetooth (dtoverlay=disable-bt in /boot/firmware/config.txt), placing UART0 on those pins and addressed as ttyAMA0 (According to RPi docs).
  • Matek H743 Slim V3
    • Serial connection on Rx3, Tx3. On this board, this is SERIAL4 using USART3. I don’t know if there’s an important distinction between UART and USART.
    • Nothing else is connected to the FC. No compass, no GPS, no receiver. Only SERIAL4 and USB (for console). But I’ve tried supplying power via a USB power adapter instead of plugging it into the Pi’s USB port, but the same result occurs.
    • I tried to use the OTG2 (SERIAL8) port for DDS, but I get something similar, but worse. The 2 lines from the first picture appear, but when I reboot, I get connection errors and then only running..., and no published topics, like so:
[1729489148.498787] info     | TermiosAgentLinux.cpp | init                     | Serial port not found. | device: /dev/serial/by-id/usb-ArduPilot_MatekH743_360035001951333231333730-if00, error 2, waiting for connection...
[1729489148.795670] info     | TermiosAgentLinux.cpp | init                     | running...             | fd: 3

Steps to reproduce:
What follows is the record of all terminal commands performed on a fresh Ubuntu install on my Pi. No errors occurred during installations or builds. I’ve done this whole process 3 times (with minor variations, depending on the documentation followed) and I end up with the same issue at the end.

Use Raspberry Pi Imager v1.8.5 to flash microSD card with Ubuntu Desktop 22.04.5 LTS (64BIT) for Raspberry Pi 4
Boot Raspberry Pi, set location as New York (for correct locale settings for ROS2).

sudo apt update
sudo apt upgrade

To enable UART connection on ttyAMA0 (UART0 on GPIO14 and 15)
Add dtoverlay=disable-bt to /boot/firmware/config.txt

sudo systemctl disable hciuart
sudo reboot

Clone Ardupilot repo

git clone --recurse-submodules https://github.com/ArduPilot/ardupilot.git
cd ardupilot
Tools/environment_install/install-prereqs-ubuntu.sh -y
. ~/.profile
sudo reboot

Install ROS 2 Humble

sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository universe
sudo apt update && sudo apt install curl -y
sudo curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key -o /usr/share/keyrings/ros-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] http://packages.ros.org/ros2/ubuntu $(. /etc/os-release && echo $UBUNTU_CODENAME) main" | sudo tee /etc/apt/sources.list.d/ros2.list > /dev/null
sudo apt update 
sudo apt install ros-humble-ros-base ros-dev-tools -y
sudo apt install ros-humble-geographic-msgs
echo "source /opt/ros/humble/setup.bash" >> ~/.bashrc
sudo reboot

Import AP ros2 dependencies (this is from the AP ros2 readme, but I’ve also followed the instruction from AP documentation during another installation trial).

mkdir -p ~/ros2_ws/src && cd ~/ros2_ws/src
wget https://raw.githubusercontent.com/ArduPilot/ardupilot/master/Tools/ros2/ros2.repos
vcs import --recursive < ros2.repos
cd ~/ros2_ws
sudo apt update
sudo rosdep init
rosdep update
rosdep install --rosdistro humble --from-paths src --ignore-src

Install Micro-XRCE-DDS-Gen

sudo apt install default-jre
cd ~/ros2_ws
git clone --recurse-submodules https://github.com/ardupilot/Micro-XRCE-DDS-Gen.git
cd Micro-XRCE-DDS-Gen
./gradlew assemble
echo "export PATH=\$PATH:$PWD/scripts" >> ~/.bashrc

Build colcon

cd ~/ros2_ws
colcon build --cmake-args -DBUILD_TESTING=ON

But on another installation trial, I also performed:

colcon build --packages-up-to ardupilot_dds_tests

Then I connect the FC via USB to the Pi (The only thing configured on the FC is the frame selection and accelerometer calibration), build and upload AP firmware to the FC.

cd ~/ardupilot
./waf configure --board MatekH743 --enable-dds
./waf copter --upload
mavproxy.py --console
param set DDS_ENABLE 1
param set SERIAL4_PROTOCOL 45
param set SERIAL4_BAUD 115
param set LOG_DISARMED 1
reboot

And then finally, I run the ros2 micro_ros_agent and I experience what I’ve described above

cd ~/ros2_ws
source install/setup.bash
cd src/ardupilot/libraries/AP_DDS
ros2 run micro_ros_agent micro_ros_agent serial -b 115200 -D /dev/ttyAMA0

LOGS
I have the outputs for both

colcon test --executor sequential --parallel-workers 0 --base-paths src/ardupilot --event-handlers=console_cohesion+

and

colcon test-result --all --verbose 

colcon test.txt (86.3 KB)
colcon test-results.txt (4.9 KB)

colcon test-result shows that there are no errors, but 65 skipped tests (out of 263). Don’t know if that’s normal.
But in the colcon test --executor sequential ..., there is a warning when testing ardupilot_dds_tests

Starting >>> ardupilot_dds_tests
/home/bobzwik/.local/lib/python3.10/site-packages/setuptools/_distutils/dist.py:261: UserWarning: Unknown distribution option: 'tests_require'
  warnings.warn(msg)
--- output: ardupilot_dds_tests                    
 
----------------------------------------------------------------------
Ran 0 tests in 0.000s
 
OK
---
--- stderr: ardupilot_dds_tests
 
----------------------------------------------------------------------
Ran 0 tests in 0.000s
 
OK
---
Finished <<< ardupilot_dds_tests [1.26s]
 
Summary: 3 packages finished [53.4s]
  1 package had stderr output: ardupilot_dds_tests

I also have a AP log: 18 1979-12-31 7-00-00 PM.bin - Google Drive

Let me know if there’s any other relevant information that you’d need to help me debug this.

Tagging @sstroobants as requested. Their issue seems similar to mine.

@rhys might also be interested.

If you have a USB to serial TTL adapter, can you plug that into your laptop and skip the Pi? I’d like to see if you can isolate which hardware is having the issue.

This article may also help with tracking down issues. It uses the serial link for PPP and then NET_P1 for DDS via UDP. This is my preferred approach for setting up DDS on a companion computer.

Sure! I must have a CP2102 somewhere in the lab, and then I’ll try DDS and PPP/DDS with my laptop (thanks @rhys for the PPP suggestion, I had the misconception that PPP required either an ETH port on the FC or a BotBlox board). However, do you think the fact that I will use WSL will matter, or should I create a Ubuntu partition to dual-boot from?

I’ll try to test this in WSL if you don’t get around it before me. I know for a fact that WSL doesn’t allow access to your webcam unless you recompile the kernel (I ran into this today trying to test some vision opencv stuff with Ardupilot).

As long as you can access the /dev for the CP2102 in WSL it should be ok. The baudrate can’t be fully maxed out without flow control, but it’s still significantly faster than the DDS serial link.

A botblox switch is only required if you want to network other devices. For example I’m networking a RPi4 with a PPP connection to a FC to a Herelink and SIYI A8 (so the PPP link has NET ports set up for DDS, MAVLink, and a UDP client for the SIYI).

Alright. So I found a spare CP2102. I’ve tried 4 things:

  • I switched my SERIAL4 on my FC to run to use Mavlink, and from my RPi 4 UART port (not using the CP2102 in this case), I am able to connect to the FC with MavProxy. So the same hardware config as my initial post, but doing Mavlink instead of DDS
  • Now using the CP2102, on the same SERIAL4 port, I can connect Mission Planner on my Windows PC to the FC.
  • After sharing the CP2102 to WSL, I was also able to connect to the SERIAL4 FC with Mavproxy, on WSL on my Windows PC.
  • From a fresh Ubuntu WSL install, I ran the exact same commands as in my initial post. I switched SERIAL4 back to DDS, and when I ran the micro_ros_agent, the same thing happens as when I am using my RPi4. Rebooting the RPi4 allows for 2-ish seconds of topic publishing.

So at least I’m able to confirm that the FC’s serial port works (at least for Mavlink). But I still can’t confirm if my DDS issue is coming from my FC or from the series of commands I run on Ubuntu. I do obtain a few warnings when building colcon (which I also had on my RPi4), I’ll eventually rebuild and post the warnings if you’re interested.

Otherwise, I’ll try running ROS2/DDS with SITL, now that I’ve setup the environment on WSL. That should tell me if my ROS2 install is faulty or not. And then eventually I’ll look into PPP.

One thing I forgot to mention in my first post:
I added the command sudo rosdep init before running rosdep update, as I got an error trying to run rosdep update first. sudo rosdep init was not mentioned in the DDS guides I followed.
Otherwise, I am unsure if I am properly using the commands:

rosdep install --rosdistro humble --from-paths src --ignore-src

and

colcon build --cmake-args -DBUILD_TESTING=ON

I know you’re probably still in Kaga/Japan, hope you’re enjoying your time there!

Thanks for the info. Back home today:)

I agree on your approach to

  • try running ROS2/DDS with SITL, now that I’ve setup the environment on WSL. That should tell me if my ROS2 install is faulty or not

The build warnings likely won’t matter if it compiled successfully (it did because you are running it).

The tutorials on rosdep cover rosdep init: Managing Dependencies with rosdep — ROS 2 Documentation: Humble documentation
Since the warning is clear when you forget it, I think the AP warnings are clear enough.

I’ll try on my PI and flight controller tomorrow.

1 Like

So I can confirm that DDS works with SITL on my Ubuntu WSL. SITL publishes all the default topics correctly. Both with UDP and virtual UART.
UDP:

cd ~/ros2_ws/
colcon build --packages-up-to ardupilot_sitl
source install/setup.bash
ros2 launch ardupilot_sitl sitl_dds_udp.launch.py transport:=udp4 synthetic_clock:=True wipe:=False model:=quad speedup:=1 slave:=0 instance:=0 defaults:=$(ros2 pkg prefix ardupilot_sitl)/share/ardupilot_sitl/config/default_params/copter.parm,$(ros2 pkg prefix ardupilot_sitl)/share/ardupilot_sitl/config/default_params/dds_udp.parm sim_address:=127.0.0.1 master:=tcp:127.0.0.1:5760 sitl:=127.0.0.1:5501

UART:

socat -d -d pty,raw,echo=0 pty,raw,echo=0
cd ros2_ws
source install/setup.bash
ros2 launch ardupilot_sitl micro_ros_agent.launch.py transport:=serial baudrate:=115200 device:=/dev/pts/3
cd ros2_ws
source install/setup.bash
ros2 launch ardupilot_sitl sitl.launch.py synthetic_clock:=True wipe:=True model:=quad speedup:=1 slave:=0 instance:=0 serial1:=uart:/dev/pts/4 defaults:=$(ros2 pkg prefix ardupilot_sitl)/share/ardupilot_sitl/config/default_params/copter.parm,$(ros2 pkg prefix ardupilot_sitl)/share/ardupilot_sitl/config/default_params/dds_serial.parm sim_address:=127.0.0.1

cd ros2_ws
source install/setup.bash
ros2 launch ardupilot_sitl mavproxy.launch.py master:=tcp:127.0.0.1:5760 sitl:=127.0.0.1:5501

Using the ros2 launch for the creation of the virtual ports as (per instructions here) gives me a Permission denied error, so I used a different socat command and virtual ports

So… If everything is correctly setup in WSL, that should be also be the case on the RPi4 (since I ran exactly the same commands on WSL and my RPi4, from fresh Ubuntu installs)? And since both WSL and my RPi4 can establish DDS communication with the FC, but publishing stops after 2-6 seconds, then the problem would be the flight controller?

Does the FC need to be fully setup for DDS to work? From a fresh set of parameters, I’ve

  • enabled DDS and configured the serial port (SERIAL4),
  • selected the frame and AHRS orientation,
  • done the accel and compass calibration,
  • enabled log_disarmed.

There is no RC connected nor battery monitoring, so I am obviously getting pre-arm warnings. But the topic publishing stops a long time before receiving the first Mavlink pre-arm warning. The duration of the topic publishing varies every time i reboot the FC.

Maybe I’ll try the SERIAL1 port? So Rx7 and Tx7 on the MatekH743.

So I tried DDS again with my RPi4, but this time over the OTG2 port (SERIAL8 on the MatekH743) and a USB cable. And it works! Everything is published normally.

So DDS does in fact run on my FC, but for some reason, it doesn’t when I use SERIAL4. Either:

  • there is a limitation with SERIAL4 (surely using a USART port shouldn’t be an issue?)
    or
  • the serial port on the RPi4, and the CP2102 link to WSL are both badly configured.

I’ll try SERIAL1 soon (which is a UART port, not USART)

Can you check your serial bandwidth usage on the FC while DDS is running? Feel free to upload logs (LOG_DISARMED=1, LOG_REPLAY=1)

Here is a link to a folder with 6 logs, all with Replay enabled:
https://drive.google.com/drive/folders/1-RmADn2ORyPXVyGTAln3wwXYpDN8L_pz?usp=sharing

  • Using SERIAL4, 115200 baud, with Mavproxy running via USB (SERIAL0)
  • Using SERIAL4, 115200 baud, without Mavproxy
  • Using SERIAL4, 921600 baud, with Mavproxy
  • Using SERIAL4, 921600 baud, without Mavproxy
  • Using SERIAL1, 921600 baud, with Mavproxy
  • Using SERIAL1, 921600 baud, without Mavproxy

I don’t know why the difference, but on SERIAL4, the FC kept publishing for between 9 and 18 seconds. But on SERIAL1, I got 23 and 35 seconds of publishing (I had unfortunately celebrated too soon when it was still publishing after 20 seconds).

I’m not sure what the units are, but I have 12x10^3 (so 12 KB/s? kb/s?) with a 115200 baud and 68x10^3 with 921600.

Note:
Downloading logs with the MatekH743 seems faster than with my CubeOrange+, via USB and Mission Planner. But I get warnings for “PreArm: Main loop slow (323Hz < 400 Hz)”. This often reboots the FC and cancels the download. But always, during download, the FC stops and starts logging repeatedly (due to LOG_DISARMED), creating multiple 8 to 40 kb logs. Is this normal behavior, and can it be linked to my issue?
I however do not see any Long Loops in the logs, CPU usage is under 200, and obviously, the FC doesn’t reboot during my DDS trials, nor does the logging stop and start.

I added a 7th log, I had forgotten to make one using SERIAL8 (OTG2), which works. I let it run about 1.5 minutes before killing the micro_ros_agent on the RPi4.

I was able to reproduce my issue with a spare Cube Black FC on SERIAL1. It only publishes topics for 2 seconds before stopping. Hopefully you can also reproduce it.

I followed your guide, and using PPP solves my issue. I’m able to publish topics without stopping now, through SERIAL1 on my Matek FC to a UART port on the RPi4.
(For anyone using a Raspberry Pi, I also had to sudo apt install linux-modules-extra-raspi to be able to use pppd)

Quick question @rhys. Is your choice of IPs 181 and 60 random, or do they have a meaning? I’ve never touched networking, so I’m pretty sure I’ll be scratching my head for the foreseeable future (I am going to have to learn about network routing soon enough). I’m also pretty new to ROS. I’ve worked around colleagues that use ROS, but since I have a mechanical engineering background, I haven’t tinkered with it myself.

Now, time to learn how to disable/enable topics and their frequency (I saw that you talked a bit about that at the conference @rfriedman)

If you do find a solution for my issue of running DDS without PPP, I’ll be happy to help try it out.

Thanks for the help!

Hi @bobzwik, great news that PPP is working and solves the DDS stability for you. There may be a residual problem with the serial support, I’ll need to switch one of my builds to investigate in more detail, however with the availablility of PPP and a new generation of adapter / switches coming out that support PPP and networking I wonder whether the serial interface will continue to get much use? (see tridges talk on day one of the conference for a great update on the state of core AP systems and connectivity support).

Pretty much arbitrary. 181 is the IP assigned to my macbook on home wifi, and 60 for the FC is in a range I’ve allocated for static IPs on the home router. You can choose what you like. There are some restrictions on the subnet for certain devices. For example my copter has a Herelink v1.1 AirUnit, and that requires the subnet on the vehicle to be 192.168.144.x/24 (the guide was written before that was set up in my case, so the subnet is different).

1 Like

See the bottom 3 blocks on how to do it. I have a wiki coming in soon that will document this.

1 Like

I’m continuing with PPP, due to it’s advantages, but I continued to debug a couple other things with DDS over serial.

  • I disabled the Linux serial console, which is by default enabled to run on the Pi’s primary UART (I think, at least in Rasbian, but not sure in Ubuntu). I used sudo systemctl disable serial-getty@ttyAMA0.service. But it certainly seemed to improve the serial connection, as I could reach between 50 and 200 seconds of topic publishing. But it would eventually always stop.
  • I talked to @lanealucy, and she suggested I use flow control (which she uses with her FTDI, but for DDS over PPP). Took me a while to figure out how to properly set up flow control on the RPi4, but when I did, DDS over serial works perfectly!! I’m currently running a long duration test, and I’m up to 40 minutes without a disconnection. Thanks @lanealucy!

And flow control also works with PPP. However, I expected a data rate improvement when using flow control. I was interested in a data rate improvement, because, when using the default DDS topic and delay parameters, the topics are not published at the desired rate. For example, /ap/clock is published at 83 Hz instead of the desired configured 100 Hz, and /ap/imu/experimental/data at 145 Hz instead of 200 Hz.

A few tests for the publish rate of /ap/imu/experimental/data

  • PPP, 921600 Baud, with or without flow control: 139-141 Hz
  • PPP, 2000000 Baud, with or without flow control: 144-146 Hz
  • Serial, 2000000 Baud, with flow control: 160-162 Hz

I am certainly not saturating the serial connection. bmon shows a usage of 64 KiB/s (65.5 KB/s) for PPS, far from the theoretical max of 200 KB/s for a 2 000 000 baud. So I was just wondering if the bottleneck is more on Ardupilot’s side, and if it is not capable of sustaining publishing topics with the default DDS config?

EDIT:
Doubling the baudrate (to 4MBaud) did not change the publishing frequencies.

Hi @bobzwik

That’s good information.

Doubling the baudrate (to 4MBaud) did not change the publishing frequencies.

I don’t think the bandwidth / baudrate of the PPP or serial link is the limiting factor when publishing DDS topics from ArduPilot. It’s the scheduler priority (and the fact that all topics are published on the same thread).

We have not done a lot of testing of desired vs achieved rates, so this is something we can look into further.

2 Likes