GPS glitch leading to flyaway; many EKF errors

Rev: ArduCopter V3.6.7 (38b8af17)
Airframe: 420-class quad
FC: Pixhawk 2

I’d dearly like to post the dataflash log from this flight, but my employer won’t permit it. At any rate, the flight began fairly normally (flying between two altitudes above the same location, using a T & RH sensor to record an atmospheric profile). The sensor also had its own GPS and active antenna. Current suspicion is on the low-noise amplifier in the active antenna, which we believe started oscillating (not uncommon for low-current LNAs) and trashed the Pixhawk’s GPS receiver (a uBlox NEO-M8).

At the time the anomaly started, the Pixhawk’s GPS receiver seemed to bounce between “good fix” and “no fix”, because after the initial “Err: GPS-1” message, the rest of the flight was accompanied by alternating “Err: EKF_PRIMARY-1” and “Err: EKF_PRIMARY-0” messages with an interval of about 3-4 seconds between them. During this time the HDOP bounced between 1.0 and 100 in the same way, and I have a hunch that this was the source of the EKF craziness. The FC had been in Auto mode at the time, but it never did answer to my switching to Stabilize, something which upset me a lot, because I thought that was my “dead-man switch” for this sort of emergency.
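One rough way to put numbers on that flapping, sketched in plain Python: count the moments HDOP crosses a "good" threshold and look at the spacing between crossings. The threshold (`hdop_good`) and the sample data are made up for illustration; they are not taken from the firmware or from the flight log.

```python
def count_fix_flaps(samples, hdop_good=2.0):
    """Return the times at which HDOP crosses the 'good' threshold.

    samples is a list of (time_s, hdop) pairs.
    hdop_good is an assumed threshold, not a firmware value.
    """
    flaps = []
    prev_good = None
    for t, hdop in samples:
        good = hdop <= hdop_good
        if prev_good is not None and good != prev_good:
            flaps.append(t)  # fix quality changed state at this sample
        prev_good = good
    return flaps


# Made-up samples imitating HDOP bouncing between 1.0 and 100 every few seconds.
samples = [(0, 1.0), (1, 1.0), (3, 100.0), (4, 100.0), (7, 1.0), (10, 100.0), (14, 1.0)]
flap_times = count_fix_flaps(samples)
intervals = [b - a for a, b in zip(flap_times, flap_times[1:])]
```

If the intervals line up with the 3-4 second spacing of the EKF_PRIMARY messages, that would support the hunch that the two are related.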

The only thing different with this flight was the introduction of the sensor and its GPS receiver/antenna. Prior to this, the same airframe had flown roughly 8 cumulative hours of the same flight profile without incident. Props are well-balanced, with vibe numbers never exceeding 3.

I understand that most people want a flight controller which will resume using a GPS fix once a glitch clears, but plainly the firmware can’t anticipate such an extreme situation as I describe here. I was left wondering, though, whether there might be a parameter I can tweak to make the code slower to resume trusting the GPS data.
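To make the request concrete, the behaviour in question would look something like the hold-off below: GPS is only trusted again after it has stayed healthy for a continuous stretch. This is a sketch of the idea only. `GpsTrustGate` and `hold_s` are hypothetical names, and nothing here represents an actual ArduPilot parameter or the EKF's real logic.

```python
class GpsTrustGate:
    """Trust GPS only after it has been continuously healthy for hold_s seconds."""

    def __init__(self, hold_s=10.0):
        self.hold_s = hold_s      # assumed hold-off time, for illustration
        self.good_since = None    # time the current healthy streak began

    def update(self, t, gps_healthy):
        if not gps_healthy:
            self.good_since = None  # any dropout resets the streak
            return False
        if self.good_since is None:
            self.good_since = t
        return (t - self.good_since) >= self.hold_s


# A fix that toggles every 3.5 s never stays healthy long enough to be trusted.
gate = GpsTrustGate(hold_s=10.0)
flapping = [gate.update(t, int(t // 3.5) % 2 == 0) for t in range(0, 30)]

# A steady fix is trusted once the hold-off has elapsed.
steady_gate = GpsTrustGate(hold_s=10.0)
steady = [steady_gate.update(t, True) for t in range(0, 15)]
```

With a 10 second hold-off, a fix that flaps every few seconds is never trusted again, while a genuinely recovered fix is accepted after a short delay.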

A log is needed to try to diagnose this.

Sorry about that, I wasn’t hoping for a diagnosis. Also, the log is not forthcoming due to my employer’s wishes. I thought I made it clear that this was a narrative of a negative outcome and a plea for suggestions on how to make the next flight less awful.

Understood regarding the logs. So we’ll need some additional information from you instead. Regarding switching to stabilize mode, what happened instead? What did the copter do? Did the log show the mode switch changing? Did the log show mode change errors? Did the log show anything at all while you were trying to retake control? Are you sure the receiver wasn’t out of range or being adversely affected the same way the GPS was? You should be able to look at the RCin log data to see if it was actually receiving those mode change commands.
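As a sketch of that RCin check: on Copter the flight mode channel is normally channel 5 (FLTMODE_CH), so the values of interest would be RCIN.C5, assuming defaults. The PWM bands below are the documented flight mode slot ranges; the slot-to-mode assignments and the sample data are hypothetical and would need to be replaced with the FLTMODE1 to FLTMODE6 values from the actual parameter list.

```python
def flight_mode_slot(pwm):
    """Map a mode-channel PWM value (microseconds) to an ArduCopter mode slot 1-6."""
    upper_bounds = [1230, 1360, 1490, 1620, 1749]  # upper edge of slots 1-5
    for slot, upper in enumerate(upper_bounds, start=1):
        if pwm <= upper:
            return slot
    return 6  # anything from 1750 upward


# Hypothetical FLTMODEn assignments; read the real ones from the parameter list.
slot_to_mode = {1: "Auto", 2: "Auto", 3: "Stabilize", 4: "Stabilize", 5: "RTL", 6: "RTL"}

# Made-up (time_s, pwm) samples from the mode channel of RCIN.
rcin = [(100.0, 1100), (101.0, 1100), (102.0, 1900), (103.0, 1900)]

changes = []
prev = None
for t, pwm in rcin:
    slot = flight_mode_slot(pwm)
    if slot != prev:
        changes.append((t, slot, slot_to_mode[slot]))  # record each requested mode change
        prev = slot
```

Comparing the times in `changes` against the MODE messages in the log shows whether the switch position reached the flight controller and which mode it actually requested.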

My initial suspicion is that the receiver wasn’t sending the mode change to the FC.

Again, I apologize for not being able to post the log. Actually, just now looking at the RCIN value for the mode input, I see that I put it into RTL rather than Stabilize, so I was wrong to say the firmware didn’t respond to the mode change.

I’m not trying to bust on Ardupilot, and I realize diagnosis under the conditions my employer is requiring here limits what can be achieved. The fact that the aircraft made a more-or-less controlled descent to the ground while faced with a crippling set of inputs is testimony to the work that’s gone into the code. My chief intent was to mention the phenomenon of LNA oscillation (which I’ve seen now on two separate and different occasions), and to seek input on parameters which might reduce the coupling between the EKFs and (possibly-compromised) GPS data.

There’s an article floating around the internet with a study of our run-of-the-mill 25x25 mm ceramic antenna and the issues it faces in the current GNSS environment. The article states that, while it is the perfect solution in weight and cost for GPS (and Galileo, as it shares the band), in a GPS/Glonass environment it struggles and is actually “saved” by the sensitivity of the receiver, LNAs and filters notwithstanding.

I’ve experienced multiple GPS Glitch events during the same flight, and a screenshot of the behaviour is posted on the forums here. Basically, when the drone faced north and leaned north while accelerating, dipping the front-mounted GPS antenna and raising the body behind it, NumSats started flickering by 3-4, HDOP by 0.3-0.4, and a parameter called VZ took a beating, falling sharply, so my drone soared to 3-4x the programmed altitude.
Folks blamed my vibe levels, but I redid my mission with the front facing south, and it completed without a single glitch, so I kinda knew it was the GPS. In the meantime I’ve appropriated a DJI A2 GPS, with its Tallysman dual-band dual-feed antenna, soldered out the proprietary CAN bus and exposed regular UART/I2C GPS/mag pads, and it has flown like a champ ever since.
I still experience some sat/hdop loss facing and accelerating north, but it’s not sharp enough to generate a GPS Glitch event.

After putting the aircraft back together, I flew four successive vertical profiles today with no anomalies, with the same payload, but with their GPS antenna modules removed. I’ll use the barometer to get altitude anyway, so I’m not losing anything by doing so. I’m pondering also getting a USB-based spectrum analyzer to sniff around new payloads near the GPS L1 frequency.

Incidentally, I’ve been working on issues related to the EKF’s ability to handle GPS problems. It turns out there are a couple of bugs and some less-than-ideal behavior that can be improved (the issue has no responses, but work is being done in the background). For example, there’s a bug with the GPS glitch state clearing which can cause the drone to zoom off in some direction. If you describe how the drone physically responded to the error, I might be able to guess whether you experienced any of the failure modes I’m familiar with, and how to deal with them.

more-or-less controlled descent to the ground

This sounds like Land-without-GPS-mode, which is an EKF failsafe that triggers in the event of bad GPS. This mode behaves essentially like alt-hold, except that the drone will keep descending. It will only leave this state if you switch to a non-GPS flight mode.

suspicion is on the low-noise amplifier in the active antenna […] I’m pondering also getting a USB-based spectrum analyzer to sniff around new payloads near the GPS L1 frequency.

Interesting theory. My advice if you use a spectrum analyzer is to make sure you perform your examinations with everything turned on and operating as it would in flight (motors excluded), and to physically move the payload around. In my GPS jamming tests, most devices seem completely benign but may become a problem when set near another device at a particular distance or orientation. It’s rare, but I’ve observed a few interesting interactions.
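For what it's worth, a sweep exported from such an analyzer could be screened with something like the sketch below, which flags bins near GPS L1 (1575.42 MHz) that stand well above the sweep's median level. The window width, the margin, and the sample sweep are all assumptions chosen for illustration, not measured values.

```python
GPS_L1_MHZ = 1575.42  # GPS L1 centre frequency


def suspicious_bins(sweep, span_mhz=10.0, margin_db=10.0):
    """Return (freq_mhz, power_dbm) bins near L1 that exceed the sweep median by margin_db.

    span_mhz and margin_db are assumed values, not recommendations.
    """
    powers = sorted(p for _, p in sweep)
    median = powers[len(powers) // 2]  # crude noise-floor estimate
    return [(f, p) for f, p in sweep
            if abs(f - GPS_L1_MHZ) <= span_mhz and p - median >= margin_db]


# Made-up sweep: one strong emitter close to L1, one outside the window.
sweep = [(1560.0, -100), (1570.0, -99), (1575.0, -70), (1580.0, -101), (1600.0, -60)]
hits = suspicious_bins(sweep)
```

Repeating the sweep as the payload is moved and reoriented, and comparing the hits between positions, would show whether an interaction only appears at particular distances or orientations.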