AP_GPS_UBLOX driver produces invalid and missing RAWH and RAWS log messages

When GPS_RAW_DATA is enabled (and UBLOX_RXM_RAW_LOGGING is defined, which is by default the case), the AP_GPS_UBLOX driver will attempt, and in my case fail, to decode U-blox RXM-RAWX messages (“Multi-GNSS raw measurements”). The driver will normally write values from the RAWX message to the DataFlash log using AP_Logger. For each UBX-RXM-RAWX message decoded, one RAWH and one or more RAWS messages will appear in the DataFlash log.

The AP_GPS_UBLOX driver performs a length check when receiving a message. It looks at the length field decoded from the U-blox message, and compares that with the size of a union containing all possible U-blox messages. This ensures that it will automatically reject anything bigger than the largest structure in the union. The union also doubles as the receive buffer. This check is needed to prevent an overflow when receiving a U-blox message.

However, in the struct seen below, the field uint8_t numMeas determines the amount of entries in the svinfo field. In the document “u-blox F9 HPG L1L5 1.40 - Interface description” there is no mention of a maximum. In the AP_GPS_UBLOX driver, the preprocessor define UBLOX_MAX_RXM_RAWX_SATS, which is arguably named incorrectly, is set to 32.

When using an U-blox F9P receiver with helical antenna, when outdoors, numMeas hovered around 60 during tests. This has caused essentially all RAWX messages to be rejected by the message decoder in the AP_GPS_UBLOX driver.

if (_payload_length > sizeof(_buffer)) {
    Debug("large payload %u", (unsigned)_payload_length);
    // assume any payload bigger then what we know about is noise
    _payload_length = 0;
    _step = 0;
    goto reset;
}
struct PACKED ubx_rxm_rawx {
    double rcvTow;
    uint16_t week;
    int8_t leapS;
    uint8_t numMeas;
    uint8_t recStat;
    uint8_t reserved1[3];
    PACKED struct ubx_rxm_rawx_sv {
        double prMes;
        double cpMes;
        float doMes;
        uint8_t gnssId;
        uint8_t svId; 
        uint8_t reserved2; //<-- Should be sigId, more about this second bug below
        uint8_t freqId;
        uint16_t locktime;
        uint8_t cno;
        uint8_t prStdev;
        uint8_t cpStdev;
        uint8_t doStdev;
        uint8_t trkStat;
        uint8_t reserved3;
    } svinfo[UBLOX_MAX_RXM_RAWX_SATS]; // <--Currently 32, turns out to to be too low for F9P receiver
};
union PACKED {
	ubx_nav_posllh posllh;
	//(...) Many other message types here; redacted for your reading comfort
	ubx_rxm_rawx rxm_rawx; // <-- The largest struct in the union
} _buffer;

PS: The naming UBLOX_MAX_RXM_RAWX_SATS implies that this is about individual sattelites, however, as far as I understand it, due to this receiver being marketed as dual-band, it can pick up multiple measurements from a single sattelite.

In my tests I increased UBLOX_MAX_RXM_RAWX_SATS to 256, the maximum possible with the uint8_t numMeas field. After this change, the messages came through reliably. I have flown with this change and have not had any crashes (neither kinetic nor firmware related!).

With this change, the size of the ubx_rxm_rawx struct increases by 7168 bytes, from 1040 to 8208. The same goes for the receive buffer (_buffer), since the ubx_rxm_rawx struct is the largest struct in the union, even with UBLOX_MAX_RXM_RAWX_SATS at 32, which is currently the case.

Sadly, even with this change, attempting to perform PPK produced completely incorrect GNSS solutions. It turns out that there is a second bug. The reserved2 field, which was at some point presumably indeed unused by the U-blox protocol, is nowadays used for the sigId field. See the aforementioned interface description document. In my testing, I have added the field to the DFLog log structure, and set this field to the value of the reserved2 field, which holds the sigId value.


struct log_GPS_RAWS pkt = {
    LOG_PACKET_HEADER_INIT(LOG_GPS_RAWS_MSG),
    time_us    : now,
    prMes      : raw.svinfo[i].prMes,
    cpMes      : raw.svinfo[i].cpMes,
    doMes      : raw.svinfo[i].doMes,
    gnssId     : raw.svinfo[i].gnssId,
    svId       : raw.svinfo[i].svId,
    sigId      : raw.svinfo[i].reserved2,  // <-- Added this line
    freqId     : raw.svinfo[i].freqId,
    locktime   : raw.svinfo[i].locktime,
    cno        : raw.svinfo[i].cno,
    prStdev    : raw.svinfo[i].prStdev,
    cpStdev    : raw.svinfo[i].cpStdev,
    doStdev    : raw.svinfo[i].doStdev,
    trkStat    : raw.svinfo[i].trkStat
};
AP::logger().WriteBlock(&pkt, sizeof(pkt));

In my testing, I noticed that PPK solutions were no longer obviously wrong once this field was present.

For my testing, I wrote a tool which reconstructs the raw U-blow datastream from a DataFlash log file (only the RAWX messages for now). In my testing, I also captured a raw U-blox datastream from an F9P receiver, after letting it be configured by ArduPilot (to ensure exactly identical configuration). Both the artifically reconstructed U-blox raw data, as well as the captured U-blox raw data was fed into RTKLIB version demo5_b34i, (from a fork made by a user called rtklibexplorer). Probably “vanilla” RTKLIB 2.4.3 would have yielded the same results. In my case I perfomed post-processing in Kinematic mode, with base station data from a VRS (network PPK).

If there is any interest, I can share the tool I wrote to reconstruct the U-blox raw datastream from a DataFlash binary log file.

Shall I proceed and write a PR for these 2 issues?

PS Based on my findings, it appears that not many are using the GPS_RAW_DATA parameter these days to perform PPK. My guess is that people use other or older receivers, or they use network-RTK, perhaps by using an NTRIP feed on the base station and feeding that to ArduPilot over the telemetry link, avoiding the need for these RAWX messages to be collected. However, there is a case to be made for PPK when doing mapping. Intuitively, it makes more sense to me to collect all raw data during flight, and perform all processing afterwards.

Screenshot from 2024-03-28 11-47-21
Position data produced by ArduPilot collected without RTK or PPK. Me walking in rectangles with vehicle in hand.
Screenshot from 2024-03-28 11-47-11
Position data reconstructed with only RAWX observations after the 2 bugfixes. U-blox onboard GNSS solutions are completely ignored. This is purely based on psuedorange and other variables for each signal + VRS PPK corrections. GNSS solution calculated by RTKLib.

Screenshot from 2024-03-28 11-46-52
This is exactly the same as above, but with the RAWX sigId field left at 0, which was the case with stock Ardupilot, which did not include that field in the RAWS DataFlash messages.

Hello @joelmeijering , thanks for your work. I am interested in your tool to reconstruct U-blox raw. For the missing info in the log I have created a PR already AP_GPS: add uBlox dual-band RAW logging for PPK by mirkix · Pull Request #25941 · ArduPilot/ardupilot · GitHub .

Regarding your PR, I am surprised to see no mention of UBLOX_MAX_RXM_RAWX_SATS in your PR and commit, did you also have to increase it for the messages to appear in the DFLog?

Regarding the UBX extractor, see GitHub - joelmeijering/dflog2ubx: ArduPilot DataFlash binary log (.bin) to U-blox UBX converter

1 Like

@joelmeijering No, this is a very good catch from your side, I can add UBLOX_MAX_RXM_RAWX_SATS to my PR with credits to you, or you can create a PR for that, what do you thing?

@mirkix

Although after adding the sigId field and increasing UBLOX_MAX_RXM_RAWX_SATS, I was able to produce a Float fix with RTKLIB, which was noticeably better than without the sigId field, I am still in the process of evaluating the entire system in regards to PPK GNSS accuracy and any possible regressions.

For example, my GPS rate dropped substantially with RAWX enabled, which in turn seems to cause navigation problems for Ardupilot. And I am yet to obtain a proper PPK fix. Currently I am testing this throughly.

During testing, more problems may show up that involve code changes. So it may be too early to ask any of the core developers to look at the PR, because we may find more issues, in which case they would have to review it twice.

That being said, I am also not sure about the implications of the 7168 bytes of extra RAM usage that this change will cause. I don’t know if running the full Ardupilot test suite would find all possible memory related implications on all boards. I have not looked into this at all, with certain (build-time or run-time) configurations, it may cause issues.

I noticed there is a preprocessor variable for all GPS_RAW functionality, which by default is enabled. After this increase, for certain boards, it may be needed to build ardupilot with the RAWX support disabled. I don’t know wether this will be needed.

In any case, it would be a good thing if you could add the UBLOX_MAX_RXM_RAWX_SATS increase to your PR. I would suggest 256 as the value. I’ll report back once I have confirmed that I can get a good PPK fix with this change, and that the accuracy is good and that there are no unexpected side-effects. I’m still testing.