Reducing telemetry latency, telemetry scheduling?

paul_arthur · February 12, 2024, 4:00am

This may be useful.

bobzwik · February 12, 2024, 5:20pm

Hi @Georacer !
I don’t think that _jitter.correct_offboard_timestamp_msec(packet.time_boot_ms, AP_HAL::millis()); corrects for transport latency, but only for jitter. The inputs to this function are:

“time since boot” (in ms) of the target Pixhawk when it sent the GLOBAL_POSITION_INT Mavlink message,
and the “time since boot” (in ms) of the Pixhawk running the Follow mode.

Since no “absolute time” is being used, the following Pixhawk can’t know that there is a 160 ms latency between the two Pixhawks. It can, however, smooth out the jitter, which is when the difference between the two millisecond measurements (packet.time_boot_ms and AP_HAL::millis()) keeps changing. This jitter correction turned out to be really important: it really smoothed out the received target location. But on top of this variable jitter, there is a constant latency. Most of this latency comes from the low Mavlink priority: the target Pixhawk estimates its position, but then sends it 100-130 ms later. Even the low baud rate of the RFD900 (57600) added a significant delay (which was reduced by 25 ms using different radios at 921600 baud).

Well, that’s at least my understanding of _jitter.correct_offboard_timestamp_msec. Please correct me if I’m wrong!
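
To illustrate the point, here is a minimal Python sketch. It is not ArduPilot's jitter correction code, just a simplified stand-in that tracks the smallest observed difference between the two boot clocks. The class name, the 5 s clock offset and the jitter values are all made up for the example.

```python
class MinOffsetFilter:
    """Simplified stand-in for a jitter filter (hypothetical, not ArduPilot code)."""

    def __init__(self):
        self.offset_ms = None  # smallest (local - remote) difference seen so far

    def correct(self, remote_boot_ms, local_boot_ms):
        diff = local_boot_ms - remote_boot_ms
        if self.offset_ms is None or diff < self.offset_ms:
            self.offset_ms = diff
        # Remote timestamp expressed in the local boot clock
        return remote_boot_ms + self.offset_ms


CLOCK_OFFSET_MS = 5000               # assumed: local clock is 5 s ahead of the remote one
LATENCY_MS = 160                     # constant transport latency
JITTER_MS = [12, 0, 7, 3, 20, 0, 9]  # made-up variable part of the delay

filt = MinOffsetFilter()
errors_ms = []
for i, jitter in enumerate(JITTER_MS):
    remote_ms = 1000 + 100 * i                        # target's time_boot_ms at send
    true_send_local_ms = remote_ms + CLOCK_OFFSET_MS  # unknown to the follower
    arrival_local_ms = true_send_local_ms + LATENCY_MS + jitter
    corrected_ms = filt.correct(remote_ms, arrival_local_ms)
    errors_ms.append(corrected_ms - true_send_local_ms)

print(errors_ms)
```

Once a sample with zero jitter has been seen, the variable part disappears from the error, but the constant 160 ms stays. With only two boot clocks, the filter can't tell that constant part apart from the clock offset.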

Without taking this delay into account, which was thankfully constant throughout our testing, the drone would fly further and further behind the vehicle. At 50 km/h, a 160 ms delay causes a 2.2 m offset! But once we took that delay into account, the drone always landed on target. This only worked when the vehicle kept a constant velocity. Reducing the latency to 25 ms helped with acceleration/deceleration and turns, though the controller could still be improved to be more aggressive in fast forward flight.
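
For reference, the offset is just speed multiplied by delay. A quick Python check (the function name is only for illustration):

```python
def lag_offset_m(speed_kmh, delay_ms):
    """Distance the target travels during the delay, in metres."""
    return speed_kmh / 3.6 * delay_ms / 1000.0


print(round(lag_offset_m(50, 160), 2))  # 2.22
print(round(lag_offset_m(50, 25), 2))   # 0.35
```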

For our setup, we use a CUAV C-RTK 2 as our base station, which transmits “RTK corrections” to the drone, which relays them to the target. The drone has a CUAV C-RTK 9Ps (UART) and the target has 2 of the same GPS modules. We use 2 to get the heading, since the compasses are affected by the vehicle’s steel.

It’s pretty incredible that everything worked out smoothly: the GPS modules on the target are in a “moving-baseline” configuration to get the heading, but are also doing RTK from the base station data.

I’ve had issues using the C-RTK 2 unit on the drone and 9Ps on the target, detailed here. I even found out later that, due to a badly coded UAVCAN node (by CUAV), the GPS velocity output was being delayed by 200 ms compared to the GPS position output. This really messed with the drone’s EKF when flying at high velocity. Our switch to the 9Ps on our drone fixed these issues, since it is a UART module, not DroneCAN/UAVCAN.

We also lowered the lower limits of the EK3 GPS accuracy parameters, to allow the EKF to trust the RTK GPS more for position and velocity estimates, which improved the drone’s flight.

bobzwik · February 12, 2024, 7:33pm

Here’s a visual example of our latency.

There’s a lot in this graph:

The red zone indicates the drone is in Follow Mode
The green zone indicates the drone is in our Follow and Land Mode
The blue line represents the “ground truth” distance between the drone and the target. The POS log structures from the logs of both Pixhawks are synchronized using the logged GPS time.
The red line represents the distance between the drone and the target using the received and uncorrected data. This is calculated using the POS log of the drone and the FOLL.Lat/FOLL.Lon from the FOLL log of the drone.
The yellow line represents the distance between the drone and the target using the corrected received data from the target. This correction only includes the jitter correction. This is calculated using the POS log of the drone and the FOLL.LatE/FOLL.LonE from the FOLL log of the drone.
The purple line represents the distance between the target and virtual target, which in this case is 0.8 m behind the target (FOLL_OFS_X = -0.8). This is calculated by the drone in the Follow Mode and added to the FOLL log structure.
The green dotted line represents the moment the drone landed on the vehicle.
The red markers indicate the start and stop of the vehicle’s motion.
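
For anyone wanting to reproduce the blue “ground truth” line, here is a rough Python sketch of the idea: put both logs on the GPS timebase, interpolate the target position at each drone sample, then take the horizontal distance. The dict layout and function names are my own for this example (not the actual log format), and the flat-earth distance is an approximation that should be fine over the short distances involved.

```python
import math
from bisect import bisect_left

EARTH_RADIUS_M = 6371000.0


def interp(t, times, values):
    """Linear interpolation of values at time t (times must be sorted)."""
    i = bisect_left(times, t)
    if i <= 0:
        return values[0]
    if i >= len(times):
        return values[-1]
    t0, t1 = times[i - 1], times[i]
    w = (t - t0) / (t1 - t0)
    return values[i - 1] + w * (values[i] - values[i - 1])


def horizontal_distance_m(lat1, lon1, lat2, lon2):
    """Flat-earth (equirectangular) distance between two lat/lon points."""
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    return EARTH_RADIUS_M * math.hypot(dlat, dlon)


def ground_truth_distance(drone, target):
    """drone/target: dicts with 'gps_ms', 'lat', 'lon' lists on the GPS timebase."""
    out = []
    for t, lat, lon in zip(drone["gps_ms"], drone["lat"], drone["lon"]):
        tlat = interp(t, target["gps_ms"], target["lat"])
        tlon = interp(t, target["gps_ms"], target["lon"])
        out.append(horizontal_distance_m(lat, lon, tlat, tlon))
    return out


# Made-up data: hovering drone, target creeping north
drone = {"gps_ms": [1000, 1100], "lat": [45.0, 45.0], "lon": [-73.0, -73.0]}
target = {"gps_ms": [950, 1050, 1150],
          "lat": [45.0, 45.00001, 45.00002],
          "lon": [-73.0, -73.0, -73.0]}
distances = ground_truth_distance(drone, target)
print([round(d, 2) for d in distances])
```

The same distance function can be applied to the FOLL.Lat/FOLL.Lon and FOLL.LatE/FOLL.LonE fields to get the red and yellow lines, since those are already logged on the drone's own timebase.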

So Follow mode is enabled before the vehicle starts to move, and once the vehicle starts, we clearly see the jitter. The jitter correction performs wonderfully, but the corrected distance starts to deviate from the ground truth. At impact, the drone thinks it is pretty much on target (purple line near 0), but as the vehicle slows down and comes to a stop, the distance to the virtual target increases by nearly a meter, indicating that the drone landed 0.8-1 m behind the target. At 19 km/h, that distance is equivalent to about 160 ms.
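
The conversion from landing offset back to delay is simply distance divided by speed. A quick Python check (function name is only for illustration):

```python
def implied_delay_ms(offset_m, speed_kmh):
    """Delay that would produce the given along-track offset at this speed."""
    return offset_m / (speed_kmh / 3.6) * 1000.0


for offset_m in (0.8, 1.0):
    print(offset_m, round(implied_delay_ms(offset_m, 19)))
```

This gives roughly 150 to 190 ms for the 0.8-1 m range, which brackets the 160 ms figure.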

We added our own code to deal with the latency, and added a parameter for the delay (FOLL_DELAY). Here is an example of the same graph of a test using this delay compensation.

In this case, the yellow line includes the jitter correction and our latency compensation, and it tracks the ground truth pretty well. And after impact, the distance to the virtual target stays unchanged once the vehicle slows down and stops.
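
The exact compensation code isn't shown in this thread. As a rough idea of what a delay compensation can look like, here is a Python sketch that projects the received target position forward along its reported velocity, assuming constant velocity over the delay. The function name and signature are hypothetical; only the FOLL_DELAY parameter name comes from the post above.

```python
import math

EARTH_RADIUS_M = 6371000.0


def compensate_delay(lat_deg, lon_deg, vel_north_ms, vel_east_ms, delay_s):
    """Project a reported position forward by delay_s, assuming constant velocity."""
    north_m = vel_north_ms * delay_s
    east_m = vel_east_ms * delay_s
    new_lat = lat_deg + math.degrees(north_m / EARTH_RADIUS_M)
    new_lon = lon_deg + math.degrees(
        east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return new_lat, new_lon


# Target driving due north at 50 km/h, with a delay of 160 ms (FOLL_DELAY = 0.160 s)
lat, lon = compensate_delay(45.0, -73.0, 50 / 3.6, 0.0, 0.160)
shift_m = math.radians(lat - 45.0) * EARTH_RADIUS_M
print(round(shift_m, 2))
```

A constant-velocity projection like this is only valid while the target isn't accelerating or turning, which matches the straight-line, constant-speed trials described above.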

We are so thankful that the jitter correction was already part of the code. Without it we would have been stumped for much longer!

Georacer · February 12, 2024, 7:52pm

Very cool graphs! Is that Matlab?

On a similar note, the moving landing code we used was more or less this one, which uses the PrecLand library to build the final target estimate.
There, the parameter PLND_LAG is used to correct for the exact same reason. So we did have access to that parameter and we did verify that it has an effect. However, as we didn’t have access to the logs of the target, it was hard to carry out the comparisons you did.

Did you also make use of the PrecLand module? This is just an FYI, I’m not proposing anything concrete here.

bobzwik · February 13, 2024, 3:41pm

Oh interesting, we knew of the PrecLand library, but decided to proceed with our own landing code. We thought the PrecLand library was aimed more towards visual sensors and lower velocities, and we didn’t need all the complexity of safety checks and failure/retry management.

Which means our code is far from PR worthy, and really only applicable to the trials we were conducting (driving in a straight line at constant velocity). Once landing was triggered, there was no going back.

But we didn’t know about the PLND_LAG parameter. Looking at it now, it seems more like the delay of the visual sensor, but if there is none and the landing relies only on received messages, then that parameter represents the communication latency? A similar parameter could be implemented in Follow mode by default, or maybe the jitter correction could be done using GMS instead of ms_since_boot (but that would require sending an additional message from the target to the drone).
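
To sketch the GMS idea in Python: if both autopilots have GPS time and the target sent its GPS week and milliseconds-of-week along with its position (the hypothetical additional message), the latency would fall out as a plain difference on a shared clock. Function names here are made up for the example.

```python
MS_PER_WEEK = 7 * 24 * 3600 * 1000


def gps_abs_ms(week, ms_of_week):
    """Milliseconds since the GPS epoch, so week rollovers are handled."""
    return week * MS_PER_WEEK + ms_of_week


def transport_latency_ms(sent_week, sent_ms, recv_week, recv_ms):
    """Latency between the target's send time and the drone's receive time."""
    return gps_abs_ms(recv_week, recv_ms) - gps_abs_ms(sent_week, sent_ms)


print(transport_latency_ms(2301, 100000, 2301, 100160))
print(transport_latency_ms(2300, MS_PER_WEEK - 60, 2301, 100))
```

This assumes both GPS time estimates are accurate to well below the latency being measured, which I haven't verified.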

Our custom Land mode basically reuses the Follow mode, but modified to perform a simple maneuver before impact and to detect impact.

And yep, we graph with Matlab! I work in a university research lab. I’m gradually porting my own code to Python, but when making graphs for the prof, it’s Matlab

amilcarlucas · February 13, 2024, 3:50pm

Have any of you tested AC_PrecLand: Use sensor timestamp to match inertial frame corrections by amilcarlucas · Pull Request #18548 · ArduPilot/ardupilot · GitHub?

Georacer · February 13, 2024, 6:01pm

Might I toot my own horn and ask if you used Ardupilog?

Georacer · February 13, 2024, 10:00pm

I didn’t know of this PR. Do I get it right that the PrecLand backend should implement _backend->los_meas_time_ms();?

Perhaps it’s not very relevant to the original topic of this thread, but definitely interesting for Copter: Feature: Ship Landing by KosmX · Pull Request #24720 · ArduPilot/ardupilot · GitHub.
Not plug-and-play, since the .lua script doesn’t provide the measurement timestamp (I think), but nothing that can’t be fixed.

amilcarlucas · February 14, 2024, 4:22pm

I rebased and improved the PR today, fixed a couple of bugs in corner cases.

bobzwik · February 17, 2024, 5:16pm

Yep! I have used Ardupilog in the past! I didn’t realize that you were the creator!

I had to modify your code to add a datetime array to each log structure to be able to synchronize two flight logs. I was, however, hitting errors where TimeS or TimeMS had a huge jump, putting the datetime values millennia in the future, and I couldn’t find the source of the error. I was also using 30+ minute flight logs, with LOG_REPLAY enabled, which made the conversion a bit long. So I eventually wrote my own code to modify the .mat log generated by MP to use a structure format like Ardupilog.
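
In case it helps someone doing the same in Python, here is a rough sketch of the two pieces: turning GPS week and milliseconds-of-week into a datetime, and flagging suspicious jumps in a time column before converting it. This is not the Matlab code I used; the function names and the 1 s threshold are arbitrary choices for the example, and leap seconds are ignored, so the result is on the GPS timescale rather than UTC.

```python
from datetime import datetime, timedelta

GPS_EPOCH = datetime(1980, 1, 6)


def gps_to_datetime(week, ms_of_week):
    """GPS week + ms-of-week to a naive datetime on the GPS timescale."""
    return GPS_EPOCH + timedelta(weeks=week, milliseconds=ms_of_week)


def find_time_jumps(time_ms, max_step_ms=1000):
    """Indices i where time_ms[i] goes backwards or jumps ahead by more than max_step_ms."""
    return [i for i in range(1, len(time_ms))
            if not 0 <= time_ms[i] - time_ms[i - 1] <= max_step_ms]


print(gps_to_datetime(2301, 0))
print(find_time_jumps([0, 100, 200, 4_000_000_000, 300, 400]))
```

Both logs can then be aligned on the resulting datetimes, after dropping or repairing the flagged samples.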