3DR X8+ Fly Away due to EKF_CHECK-2, FAILSAFE_EKF-1

A local university has asked for my help in diagnosing a recent fly-away and crash of their stock 3DR X8+ copter. Although I can find several instances of operator error, I need help understanding the root cause. Dataflash logs were provided, but no .tlog.

This was the first test flight since loading the latest stable firmware 3.4.6. They reloaded the .param file, and ran through Mission Planner’s set-up wizard. Here is the brief narrative:
I was testing from midfield of a football field at UH, with a set of metal bleachers about 50 yards west and a set of concrete bleachers about 50 yards east…the drone and I were both facing south at launch time, with wind blowing from the east (blowing in the opposite direction of the flyaway). I was trained to start each flight in ALT HOLD mode, and then switch to LOITER mode after gaining a few feet of altitude…I followed that same procedure this time, and I was attempting to just get the drone a few feet off the ground and have it hover in LOITER mode for the initial test. The drone did not hover much at all and instead took off on a beeline for the bleachers. I first reacted by quickly switching back and forth once between ALT HOLD and LOITER to see if I could get the drone to stabilize. I then tried an RTL as a last resort, and by the time I realized that the RTL was not going to work, the drone was nearly colliding with the bleachers. In hindsight, it probably would have been better to roll to the west instead of performing the final RTL…though I’m not sure it would have mattered either way with all the errors.

Everything definitely went to hell well before the second RTL…I am guessing that line 6500 is right around when the crash happened…I think the log probably ran for a few seconds after the crash, before the drone shut down…after the crash, the motors were kicking for a few seconds before eventually stopping a few moments before I made it to the scene of the crash.

The plot below shows a divergence between the blue GPS position and the red EKF position. The GPS is accurate.
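
One way to put a number on that divergence is to compute the horizontal distance between each GPS fix and the EKF position logged at the same time. The sketch below is a minimal Python example using the haversine formula. The function name and the sample coordinates are made up for illustration and are not taken from the log.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, a common approximation


def horizontal_error_m(gps_lat, gps_lon, ekf_lat, ekf_lon):
    """Great-circle distance in metres between two lat/lon points given in degrees."""
    phi1 = math.radians(gps_lat)
    phi2 = math.radians(ekf_lat)
    dphi = phi2 - phi1
    dlmb = math.radians(ekf_lon - gps_lon)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical pair of positions, NOT values from this flight
print(round(horizontal_error_m(10.0000, 20.0000, 10.0002, 20.0003), 1))
```

Running this over every matched GPS/EKF sample pair would give an error-versus-time curve that can be lined up against the mode changes and error messages.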

GPS HDOP and Sats look fine until after line 6500. If the crash had already occurred, this would be expected.

Below is a plot showing the errors and flight modes. The operator didn’t touch the sticks, other than Throttle after arming the copter. The Yellow line shows his mode changes.

The abrupt attitude changes at line 6500 likely represent the crash.

The red line also shows the vibration spike that likely represents the crash, but the speed (blue) continued to increase from 12m/s to 20m/s, and the altitude shows a smooth descent from ~15m to ground. So perhaps the aircraft was still flying? We can’t trust any of the data after line 6500 because the GPS starts to fail then, and the EKF had already malfunctioned. Either way, the EKF errors started about 4 secs prior to the abrupt changes in attitude and vibration.

This shows more detail of the vibrations.

Here is the Log Analysis. I don’t know whether the Compass failure was caused by the crash.

Size (kb) 726.5810546875
No of lines 9121
Duration 0:00:27
Vehicletype ArduCopter
Firmware Version V3.4.6
Firmware Hash e707341b
Hardware Type
Free Mem 0
Skipped Lines 0
Test: Autotune = NA -
Test: Brownout = GOOD -
Test: Compass = FAIL - Large change in mag_field (1069.24%)
Min mag field length (76.58) < recommended (120.00)
Max mag field length (895.44) > recommended (550.00)
Test: Dupe Log Data = GOOD -
Test: Empty = GOOD -
Test: Event/Failsafe = FAIL - ERRs found: FLT_MODE CRASH
Test: GPS = WARN - Min satellites: 5, Max HDop: 2.56
Test: IMU Mismatch = WARN - Check vibration or accelerometer calibration. (Mismatch: 1.35, WARN: 0.75, FAIL: 1.50)
Test: Motor Balance = FAIL - Motor channel averages = [1447, 1518, 1453, 1363, 1501, 1460, 1348, 1469]
Average motor output = 1444
Difference between min and max motor averages = 170
Test: Parameters = GOOD -
Test: PM = NA -
Test: Pitch/Roll = NA -
Test: Thrust = NA -
Test: VCC = UNKNOWN - No CURR log data
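
The Motor Balance figures in the report can be checked by hand. The short Python sketch below recomputes the average and the min-to-max difference from the eight channel averages quoted above. It assumes the report truncates the average to a whole number, which is only a guess based on the printed value.

```python
# Channel averages copied from the Motor Balance line of the report
motor_averages = [1447, 1518, 1453, 1363, 1501, 1460, 1348, 1469]

mean_output = sum(motor_averages) / len(motor_averages)
spread = max(motor_averages) - min(motor_averages)

# Truncation to an integer is an assumption about how the report rounds
print(int(mean_output))  # 1444
print(spread)            # 170
```

Both values agree with the report, so the FAIL comes from the 170 difference between the highest and lowest motor averages rather than from a reporting error.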

I do not understand what caused the EKF errors, but once EKF_CHECK-2 occurs, the copter will no longer attempt to hold its position and should enter LAND mode. The operator should have been able to fly the copter back into position, but he was too reliant on the automation. I would have expected the copter to drift with the wind, but it appears to have taken off at high speed upwind. The operator reports that the copter started to fly away immediately upon switching to LOITER mode; however, the errors don’t show up until about 7 secs later, by which time the copter was already flying away at 10 m/s. Should the EKF_CHECK_THRESH sensitivity be increased for earlier detection?
I’m requesting some assistance in understanding what may have been the root cause of the EKF errors before I try to put this bird back in the air.
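
To illustrate the trade-off behind that threshold question, here is a minimal Python sketch of a generic threshold-plus-persistence check. It is not the ArduCopter implementation: the function name, the ramp of variance values, the two thresholds, and the 10-sample hold are all assumptions, chosen only to show that a lower threshold trips earlier on the same data.

```python
def first_trigger_index(values, threshold, hold_samples):
    """Return the index at which `values` has stayed above `threshold`
    for `hold_samples` consecutive samples, or None if that never happens."""
    run = 0
    for i, v in enumerate(values):
        run = run + 1 if v > threshold else 0
        if run >= hold_samples:
            return i
    return None


# Hypothetical variance ramp 0.0, 0.1, 0.2, ... (NOT taken from the log)
ramp = [i / 10 for i in range(30)]

print(first_trigger_index(ramp, 0.75, 10))  # higher threshold
print(first_trigger_index(ramp, 0.55, 10))  # lower threshold trips sooner
```

The cost of a lower threshold is that it is also more likely to trip on brief glitches that would otherwise be ignored.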

Thank you for your time.

2017-06-09 11-48-22.bin (342.4 KB)
2017-06-09 11-48-22.log.param (10.1 KB)

I would suspect an issue with multi-path signals causing a GPS variance. Bleachers make a very nice microwave signal reflector. Simply switching to a non-GPS flight mode would’ve likely prevented the crash.

Why then do the GPS HDOP and # of Sats look fine?

Well, GPS is not 100% reliable. The number of satellites “seen” does not tell anything about their position in the sky. If they are all overhead, the geometry is poor and the GDOP will be higher, which means more position error. If some are on the horizon then there is likely signal refraction from the atmosphere. And if some of the signals are reflected off objects (multipath) then the signal takes longer to reach the receiver than it should, and it can cause momentary errors of hundreds of mph in speed (if your consumer level receiver can measure that), and hundreds of meters of position error, depending on the source of the reflection. It won’t show a variance in HDOP until after the fact because the calculation is not instantaneous.
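
The point about satellite geometry can be shown with a small calculation. The Python sketch below uses the textbook dilution-of-precision formula (invert the normal matrix built from the unit vectors to each satellite plus a clock column) to compare two made-up constellations with the same satellite count. The azimuth/elevation values and function names are hypothetical and have nothing to do with the sky view on the day of the crash.

```python
import math


def invert(m):
    """Gauss-Jordan inverse of a square matrix, with partial pivoting."""
    size = len(m)
    a = [row[:] + [1.0 if i == j else 0.0 for j in range(size)]
         for i, row in enumerate(m)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular geometry")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(size):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[size:] for row in a]


def dop(sats):
    """sats: list of (azimuth_deg, elevation_deg). Returns (hdop, vdop)."""
    rows = []
    for az_deg, el_deg in sats:
        az, el = math.radians(az_deg), math.radians(el_deg)
        rows.append([math.cos(el) * math.sin(az),  # east component
                     math.cos(el) * math.cos(az),  # north component
                     math.sin(el),                 # up component
                     1.0])                         # receiver clock term
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(4)]
              for i in range(4)]
    q = invert(normal)
    return math.sqrt(q[0][0] + q[1][1]), math.sqrt(q[2][2])


# Two hypothetical six-satellite constellations (azimuth, elevation in degrees)
spread = [(0, 20), (72, 35), (144, 15), (216, 50), (288, 30), (30, 75)]
clustered = [(0, 80), (72, 75), (144, 85), (216, 70), (288, 78), (30, 88)]

hdop_spread, _ = dop(spread)
hdop_clustered, _ = dop(clustered)
print("spread HDOP:", round(hdop_spread, 2))
print("clustered HDOP:", round(hdop_clustered, 2))
```

Both cases report six satellites, yet the constellation bunched near the zenith gives a noticeably larger HDOP. Note that this formula only captures geometry; it says nothing about multipath on an individual signal.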

I’m a commercial pilot and we use GPS all the time with great reliability. But we don’t depend on it for precision approaches close to the ground where the errors can build. In flight, it’s great. We’re not close to the ground and have little chance of getting multipath signals. But the consumer drone industry has marketed their products using GPS as 100% reliable, and they’re not. Out in the open where there are no buildings etc, it can be pretty reliable at the ground level. But any time you fly in an urban environment with perfectly angled 1.6GHz reflectors like bleachers, or buildings or whatever, it still takes a pilot on the controls because it can fail. And that’s kind of the downfall of the whole industry - these drones have made anybody a pilot, but they don’t know how to fly anything without the automation. It is why one of your major drone manufacturers has products commonly featured in flyaway videos on the internet.

In your case the EKF State Estimator did detect an error, but the aircraft was too close to the ground, too close to objects to crash into with an inexperienced pilot at the controls, and it all happened before the errors were calculated and detected.

The bottom line is that the pilot must always be prepared to take over manually because the automation is not 100% reliable when flying in urban environments, a flight environment where, traditionally, no responsible RC pilot would even consider flying. Consumer-grade RC radios are also subject to interference in that environment from WiFi signals, EMI from thousands of miles of powerlines, and multipath signals bouncing off everything in sight, from everything from GMRS to very powerful ham radio repeaters that can key down at any time. And the consumer-grade stuff, under Part 15, shares the frequency spectrum with the much more powerful licensed services - and by law must accept any interference caused by the licensed services.

Sorry to hear about your crash. But the pilot could’ve totally prevented it by immediately switching back to a non-GPS flight mode as soon as it was evident the aircraft was heading to points unknown under GPS control. Instead, he or she just kept repeatedly trying things that all depended on the GPS. I saw where Alt Hold must’ve been engaged once for a fraction of a second because I saw the glitch in the pitch and roll attitude. But the flight mode switch went right thru and RTL was engaged instead, which still depended on the GPS. And it went downhill from there. The compass outputs looked fine right up to the point of impact. But the position error was huge and the aircraft was simply trying to get to where the GPS calculations said it should be.

I forgot to comment on this. The EKF land failsafe did trigger. But it was defeated by the pilot frantically flipping the flight mode switch to all the wrong flight modes - all while doing absolutely nothing to actually attempt to fly the aircraft. I’ve had plenty of GPS glitches flying in wide open rural areas where I’ve gotten kicked out of an auto flight with helicopters flying at high speed due to EKF Velocity Variance. Altitude is your friend with any aircraft. When it happens, get into Acro or Stabilize and get altitude immediately to recover control. Fly it around in a circle and see if the GPS settles down and starts to agree with the EKF State Estimator again. If it doesn’t, fly it home manually.