GPS: Inconsistent handling in Ardupilot and Zubax/Uavcan/AP

In the course of implementing a GPS node for my Uavcan-For-Hobbiests project I found the handling of the fix status and gnss timestamping quite unclear, and thus spend a whole day just to inspect the zubax, AP-Uavcan and AP native GPS codes for these parts.

I’ve noted some inconsistencies, which I want to bring to attention. I lack the knowledge to judge on whether these are of practical relevance or not; I suspect they might lead to different behavior in e.g. GPS outages. I describe what I see from looking at the code, which implies that I might have read it incorrectly.

I refer to a ublox m8 gps, and compare its behavior when either connected natively (AP_GPS_UBLOX.cpp) or when a zubax gps is connected via uavcan (zubax_gnss…\ublox.cpp and AP_UAVCAN.cpp)

The inconsistencies are related to how the fields “valid” and “flags” of the ublox ubx NAV-PVT message are handled, or not handled. I’ll discuss that separately for the fix and the timestamp issue.

I refer to the document "“u-blox 8 / u-blox M8 Receiver Description Including Protocol Specification”.

FIX

zubax
runs the ubx packets NAV-PVT, NAV-SOL, NAV-DOP, NAV-SAT, NAV-TIMEGPS; fix state is taken from NAV-PVT.
ardupilot
for m8 it runs the ubx packets NAV-SOL, NAV-PVT, NAV-DOP; fix state is taken from NAV-PVT.

NAV-SOL, NAV-PVT, NAV-DOP all have a navigation flag field “flags”.

For NAV-STATUS the ublox docu says: “GPSfix Type, this value does not qualify a fix as valid and within the limits. See note on flag gpsFixOk below.” Such a comment is not found for NAV-SOL and NAV-PVT. However, one would think that this comment equally applies to NAV-PVT and NAV-SOL, especially as both also have a flags field.

I note this:
Neither the zubax code nor the AP code evaluate the flags field in determining the fix state.
(zubax: https://github.com/Zubax/zubax_gnss/blob/master/firmware/src/board/ublox.cpp#L374-L394)
(ap: https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_GPS/AP_GPS_UBLOX.cpp#L956-L987)

However, for the NAV-SOL and NAV-STATUS messages the AP code DOES evaluate the flags field! That is, for a non-PVT (e.g. m6) gps, where NAV-SOL is used, the AP code behaves differently than for a m8 gps.
(NAV-SOL: https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_GPS/AP_GPS_UBLOX.cpp#L875-L890)
(NAV-STATUS: https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_GPS/AP_GPS_UBLOX.cpp#L914-L929)

My 2 cents: I would think that that the “flags” field should also be considered for NAV-PVT.

GNSS TIME

zubax
runs the ubx packets NAV-PVT, NAV-SOL, NAV-DOP, NAV-SAT, NAV-TIMEGPS; gnss timestamp is taken from NAV-PVT (only UTC used)
ardupilot
for m8 it runs the ubx packets NAV-SOL, NAV-PVT, NAV-DOP; gnss timestamp is taken from NAV-SOL (week) and NAV-PVT (TOW)

AP seems to store the GPS time in the variables state.time_week and state.time_week_ms, which are filled from NAV-SOL and NAV-PVT, respectively, for a natively connected m8.

NAV-PVT has a Validity Flag “valid”, but not so NAV-SOL. The “valid” field indicates the validity of the time and date.

I note this:
zubax sets gnss.timestamp only if (pvt.valid & UtcValidFlags) == UtcValidFlags, that is, it evaluates the valid flag. In the AP UAVCAN module the state.time_week and state.time_week variables are set only if status == STATUS_NO_FIX and gnss_time_standard == GNSS_TIME_STANDARD_UTC. Thus, effectively, in setting the GPS time both the valid flag and the fix status are evaluated.
(zubax: https://github.com/Zubax/zubax_gnss/blob/master/firmware/src/board/ublox.cpp#L404-L422)
(AP_UAVCAN: https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_UAVCAN/AP_UAVCAN.cpp#L64-L84)

In contrast, for a m8 gps connected natively neither the valid flag nor the fix state are considered.
(week: https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_GPS/AP_GPS_UBLOX.cpp#L906-L913)
(TOW: https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_GPS/AP_GPS_UBLOX.cpp#L1016)

However, for a non-PVT (e.g. m6) gps, when AP extracts the time completely from NAV-SOL, both variables are set only if next_fix >= AP_GPS::GPS_OK_FIX_2D.
(https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_GPS/AP_GPS_UBLOX.cpp#L934-L938)

One thus obtains different behavior for the three cases, zubax, m8 natively, m6 natively.

My 2 cents: AP obviously must already be able to handle “jumps” in the gnss time. I would thus think that the correct gnss time (=derived from a gnss) should be used as soon as it’s available, i.e., when the valid flag suggests this, but would not add additional dependencies such as on the fix state.

Comment: Also the case when the gnss time is rejected/invalid should be considered. I note this:
zubax: If the UTC is not yet valid, it seems to deliver a “self-made” UTC time.
AP UAVCAN: If the timestamp is not of type UTC, it doesn’t seem to accept any time stamp. Not sure what time it when uses. IS THIS A BUG?

Very confusing, all this. To me it looks as if no one with sufficient knowledge (= not me) ever went over this consistently.

That’s what I concluded, might be correct or not.
Cheers, Olli

Actually this is simply not implemented yet.

Did not understand this at all, sorry. The time stamp is saved if there is ANY fix available. If there is no fix - time stamp is not evaluated.
But as you stated, Zubax generates only UTC time. So for now as it is the only one more or less available UAVCAN GNSS receiver, functionality was not extended.

Alright, given the number of incremental add on passes that have happened I can easily believe inconsistencies have crept in. I’ve only skimmed the post so far but a couple of off the cuff comments:

@peterbarker and I were discussing (and pretty sure I’m going to work up the patch set soon) to remove NAV-SOL from ArduPilot if you have PVT support to fix a race condition on GPS week (and save the bandwidth).

Regarding the flags. This is a bit interesting, what it specifically means in this case is that the navigation filters on the GPS. If you notice we set the elevation mask to -100 (unless the user has overridden that with a parameter, which I’ve never actually seen a log where this was true). We don’t load and DOP limits. This actually means the fix ok flag is undefined (as we don’t know what DOP limits are being applied). The goodnes though is that it turns out to not matter. All that type of filtering is already being done within ArduPilot (inside the EKF for instance which is always evaluating if the GPS is good enough to be used).

GNSS Time is a disaster area at the moment. It’s not just the u-blox that handles this poorily, all the GPS’s are configured differently for this, and pull data from different sources. Using UTC time is actually problematic as we can easily be missing the correct conversion information, and you potentially have up to 12.5 minutes before you get the data. Obviously waiting that long isn’t feasible, so we need to handle this better. The past discussions have revolved around the concept of rolling back towards using GPS time and applying the leap seconds from the code, as it should minimize error, and almost every GNSS user will be using GPS at a minimum. But thats definetly not a full/correct solution, and has weaknesses. I welcome proposals that would allow us to unify the time handling sanely for all platforms. (Of particular note, iTOW’s should not be used for calculating the system time, but that usage is rampant within the u-blox driver.)

@EShamaev:
yes, this is so for the chain m8/zubax->uavcan->ardupilot, but is not so for the chain m8->uart->ardupilot, see the code links
remind you, I’m pointing out differences in the behavior between these two scenarios, I’m not commenting on whether a behavior is correct or not :slight_smile:

I do, btw, have a prototype gps uavcan node on my workbench, which I think is fully working (no flight test yet), which allows choosing between gps time and utc time stamping, and would have preferred gps time … :smiley:

@WickedShell:
as mentioned before I can’t comment on whether something is correct or not, that’s beyond my competences, I just see differences. However, for one and the same gps receiver (e.g. m8) I would expect one and the same behavior, independent on how it is connected to ardupilot. That’s at least what makes sense to me, and the above should be seen in this light.
Not sure which race condition you mean. I got the feeling that one can trust that NAV-SOL is coming before NAV-PVT, for one solution. With that one should be able to resolve any race conditions (i.e. be able to identify which week belongs to which TOW). For curiosity, where from would you take week instead?stupid me, NAV-TIMEGPS of course … wouldn’t this be the approach to resolve any race conditions anyhow
Are you implying that AP’s behavior is essentially independent on the value of the fix? To me it looks that this is consumed at several places for several uses, which would lead to different behavior even if the EKF would not be using it.
Yes, I have seen that 12.5 min being mentioned too, and wondered why one would use UTC, but it is apparently used by zubax, and zubax is used, so …

It is a nice coincidence that you know what’s inside Zubax and have code to compare. And another coincidence that it is also uBlox, same IC as generally used as a standalone GNSS receiver with AP.
However I prefer to treat all devices connected with UAVCAN to AP as black boxes, as this will be the situation in future as I expect when number of those devices grow. So my main task now is to support UAVCAN standard messages and their interpretation as per standard.
If you would like to add some features, I most definitively welcome you to do so and will be glad if you can help with extending the support for CAN.

well, the “coincidence” I found actually quite time consuming … writing the base code took a day, the above just on fix and gnss timestamps another day … if there would have been just a datasheet I would have been done in 1/2 h, and it’s not (yet) clear it would have been worse … :slight_smile:

I understand well that you first want to get the standard implemented, that’s certainly the appropriate approach. However, it’s a double-sided sword … the standards effectively boil down to the least common denominator, which means the full potential of a device is not exploited properly. They also tend to duplicate programming time. There had been a related discussion also in the gps get_lag() thread. I btw have a quite clear opinion on this matter LOL.

As regards features, I would have a style suggestion: I do not like this very deeply nested code, IMHO it’s less clear. It seems AP doesn’t also, I note that the style in the UAVCAN parts appears to be noticeably different form what I have seen else in the AP codes. Maybe you would want to write e.g. if( !blabla ){ return; } instead of if( blabla ){ long nested code; }. As a very brief example, magnetic-cb() could look like this:

static void magnetic_cb(const uavcan::ReceivedDataStructure<uavcan::equipment::ahrs::MagneticFieldStrength>& msg)
{
    if (hal.can_mgr == nullptr) {
        return;
   }

   AP_UAVCAN *ap_uavcan = hal.can_mgr->get_UAVCAN();

   if (ap_uavcan == nullptr) {
        return;
    }

   AP_UAVCAN::Mag_Info *state = ap_uavcan->find_mag_node(msg.getSrcNodeID().get());

   if (state != nullptr) {
       state->mag_vector[0] = msg.magnetic_field_ga[0];
       state->mag_vector[1] = msg.magnetic_field_ga[1];
       state->mag_vector[2] = msg.magnetic_field_ga[2];

       // after all is filled, update all listeners with new data
      ap_uavcan->update_mag_state(msg.getSrcNodeID().get());
    }
}

As said, looks much more AP-style to me. :slight_smile:

All this discussion on how programming time is doubled or full potential is not exploited is only because now we have only one available GNSS CAN device which is open source and happened to use same hardware as AP. I fully agree with Pavel’s opinion to have one strictly defined protocol for communication.
I am sure that in future there will be more GNSS available and in case they are not open source and open hardware - there will be nothing you can do. But if they comply with UAVCAN standard - you will be have a working system.
But if someone will come up with message definitions of UART/SPI/I2C transfer, I will add it to AP. But I still would prefer it to be available only for developers as I see a major safety issue with this concept.

As for coding style, I agree to your point, however generally I am against having many exit points scattered in a function. I prefer to have single entry/single exit concept where possible/reasonable.

we all have our own coding habits and preferences … it was just that when I looked at the code the first time I was struck by how obviously different it is compared to other AP codes I had visited before … that’s all I wanted to say, not very relevant … if you devs are happy, we all are happy, right :slight_smile:

I’m not sure I correctly understand “I am sure that in future there will be more GNSS available and in case they are not open source and open hardware - there will be nothing you can do.”

If such future GNSSes do their own uavcan messages, they of course can’t expect to work with ArduPilot right out of the box, or expect to have a “right” to be added to ArduPilot, but I would think they would have a decent chance to be added to ArduPilot … I mean, you have code for ublox m6, ublox m8 (both not OS nor OH), I don’t know how many accelerometers, magnetometers, leds, etc. pp. … there are even two codes for the STorM32 controller (how curious is that LOL) … so, I would hope that there is something one could do also for those future GNSSes :slight_smile:

If the intention/decision is to not apply that practice to uavcan devices, I think you (= AP folks) should release a very clear statement somewhere on that. I guess it could avoid potential irritations.

BTW: I’ve never specifically referred to UART/SPI/I2C transfer, but stayed more general, to allow for vendor-specific messages, e.g. AP messages which others are free to support or not … the UART/SPI/I2C transfer could have been one consideration, but if you read back there had been other suggestions too. Maybe that didn’t come across clearly enough.

As for coding style it is not a habit actually, but this standards are not used in AP, so I did relax my coding style already quite a bit.

As for GNSS I did not mean other messages. I mean that world does not end on uBlox, there are much more professional solutions around. And most probably devices that might be based on those will support UAVCAN just as per standard. And if we want to use it - chances are we have to live with how it is done without much influence.

Vendor specific messages are supported in UAVCAN and anyone can add anything. The other question is if that is really necessary or meaningful.

Just like I said before - you are more than welcome to contribute to AP and most probably to UAVCAN (I don’t think Pavel will mind that at all).

“As for coding style it is not a habit actually,” … LOL … well … I guess this would be a discussion with open end, like “Is Windows better than Unix or vice versa?” …

I hear you as regards more actively contributing to AP, but I realize that within my given time that’s not really possible, and the hill for code contributions is too high for me (I’ve started more than once reading the “How to contribute code” but never finished, being overwhelmed, way over the top of my head). Thx for the invitation though. I’ll sometimes make suggestions, you’ll do whatever you want, I think that’s not too bad a scheme. :slight_smile:

It’s not. There are coding standards for mission critical systems which are MISRA, DO-178 to name few. And they have very severe limitations and standardization of coding style.

as you say, there are (> 1) coding standards for mission critical systems (!= all systems) …

I surely get your point though, and it’s admirable that you code according to such standards

my code qualifies to the univers-al OW-42 standard, which I’m not sure is a superset of MISRA, and might be even disjunct LOL

besides that: I still think it could be good to clarify how or if at all the various flags discussed in the above should be used. Should one follow Zubax, should one follow AP, should anyone follow whomever?

As I said, I relaxed my coding a lot compared to standards. But with something I prefer to stay with it.

Flags that make reason will be supported.

ähm, it deems to me that I wasn’t sufficiently clear, I meant the fields “valid” and “flags” in the ubx message(s) …

Is there a final consensus, among you/AP and zubax/Pavel, on how these should be evaluated/considered?

I still do not understand what you mean.

NO_FIX, STATUS_TIME_ONLY, FIX_2D and FIX_3D are already considered. If there is a valid time - it will be processed with any of flags and will be used by AP.
Coordinates are considered valid if there is a FIX.
So considering time processing, UAVCAN has the best solution so far.
Also WickedShell reported that NAV-SOL will soon be removed from AP.

So the final consensus was from the very start and it is in UAVCAN DSDL.

???
you leave me confused, I don’t see how that is related to DSDL, it seems the first post didn’t clarify what I think could need a clarification … not sure I can do better
anyway, for me it was an insightful discussion anyway, so thx for that :slight_smile: