"EKF Primary changed" caused copter crash

Hi everyone.
I took off in Stabilize mode and switched to AltHold mode. After a few seconds, the message “EKF3 lane switch 1” appeared, and the quadcopter tumbled in the air and fell.
The flight controller is a cheap Pixhawk 1.
When I analyzed the log, I saw a change of the primary EKF3 lane that made the copter unstable. When the autopilot switched from lane 0 of EKF3 (XKF1[0] in the log) to lane 1 (XKF1[1]) because of bad GPS data, the copter became unstable and crashed.
The figure below shows the pitch angle for the two lanes of EKF3. The second lane read zero from the beginning of the test.

By checking the IMU data, I realized that the 2nd IMU (IMU[1].GyrX, GyrY, GyrZ) read zero from the beginning of the flight, as the figure below shows:

It looks like the second EKF lane’s attitude estimate went bad because the 2nd IMU was unhealthy. However, IMU.GH and IMU.AH (gyro health and accelerometer health) never changed to zero.
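A stuck sensor like this can be spotted without relying on the health flags. The snippet below is a minimal sketch, assuming the gyro samples have already been extracted from the log into plain Python lists; the function name, the thresholds, and the sample data are all hypothetical and not taken from the posted log.

```python
from statistics import pstdev


def is_flatlined(samples, min_samples=50, eps=1e-6):
    """Return True if a sensor trace shows essentially no variation.

    A real gyro always carries some noise, so a standard deviation
    close to zero over many samples suggests a stuck or dead sensor.
    """
    if len(samples) < min_samples:
        return False  # not enough data to judge
    return pstdev(samples) < eps


# Synthetic data for illustration only (hypothetical values).
imu0_gyr_x = [0.002 * ((i % 7) - 3) for i in range(200)]  # small noise
imu1_gyr_x = [0.0] * 200                                  # stuck at zero

print("IMU[0] flatlined:", is_flatlined(imu0_gyr_x))
print("IMU[1] flatlined:", is_flatlined(imu1_gyr_x))
```

Running the same check on GyrX, GyrY and GyrZ of each IMU instance before flying would flag the second IMU here even though GH and AH stayed at 1.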
I would really like to know if this is a bug in EKF3 affinity switching or a consequence of using a cheap flight controller.
Now the question is, why didn’t the autopilot raise a pre-arm warning before the flight that the gyroscope was unhealthy?
Or why did the autopilot not realize that the second lane of EKF3 had a problem?
As far as I know, EKF3 affinity uses lane switching to select the lane with the best-performing combination of sensors. So why was the best lane not chosen?
Your support in this discussion is greatly appreciated.
Thanks!

The dataflash log for this flight is here: https://drive.google.com/file/d/1G4DvydQEjGS96zOHJaS0EiUXv5xB-cx8/view?usp=sharing

@rmackay9
Do you have any idea about this?
Thank you in advance.

Usually it is the other way around: once the copter crashes the EKF lane changes.

You should check your vibration levels!

You’ve got ARMING_CHECK,24 so that might affect reporting of the gyro, unsure, but I’d tend to enable ALL arming checks.
Try setting INS_ENABLE_MASK,3 to see if that has any effect on the calibration and use of the second IMU
I’ll check more later…
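For anyone following along, both parameters are bitmasks, so the values can be decoded by hand or with a few lines of Python. This is a minimal sketch; the bit names are copied from my reading of the Copter parameter documentation, the ARMING_CHECK table is truncated to its first bits, and both tables should be treated as assumptions to verify against your firmware version.

```python
# Bit positions as I understand them from the Copter parameter docs
# (assumption: check the parameter list for your firmware version).
ARMING_CHECK_BITS = {
    0: "All",
    1: "Barometer",
    2: "Compass",
    3: "GPS lock",
    4: "INS",
    5: "Parameters",
    6: "RC channels",
    7: "Board voltage",
    8: "Battery level",
}
INS_ENABLE_MASK_BITS = {
    0: "IMU instance 0",
    1: "IMU instance 1",
    2: "IMU instance 2",
}


def decode_bitmask(value, names):
    """Return the names of all bits that are set in value."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


print("ARMING_CHECK 24   ->", decode_bitmask(24, ARMING_CHECK_BITS))
print("INS_ENABLE_MASK 3 ->", decode_bitmask(3, INS_ENABLE_MASK_BITS))
```

Under these assumptions, ARMING_CHECK,24 enables only the GPS lock and INS checks, and INS_ENABLE_MASK,3 enables the first two IMUs.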

Just generally speaking, you can set
INS_ACCEL_FILTER,10

I would set LOG_DISARMED,1 and move the aircraft by hand and see if any IMU1 gyro data is logged, and compare it to IMU0 gyro.
It looks like position variance caused the lane switch, but selected the bad IMU.
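To make the hand-motion comparison less subjective, the two gyro traces can be reduced to a couple of numbers. The following is a rough sketch, assuming the IMU0 and IMU1 samples have been exported from the log as time-aligned lists of equal rate; the function name and the synthetic data are hypothetical.

```python
import math


def compare_gyros(imu0, imu1):
    """Return (bias, rms) of the difference imu1 - imu0.

    bias: mean difference, which exposes a constant offset.
    rms:  root-mean-square difference, which also exposes a sensor
          that stays at zero while the other one sees motion.
    """
    diffs = [b - a for a, b in zip(imu0, imu1)]  # zip stops at the shorter list
    if not diffs:
        raise ValueError("no samples to compare")
    bias = sum(diffs) / len(diffs)
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rms


# Synthetic hand motion for illustration only (hypothetical values, rad/s).
imu0 = [0.5 * math.sin(0.1 * i) for i in range(300)]
imu1_with_offset = [v + 0.2 for v in imu0]  # same motion plus a fixed offset
imu1_stuck = [0.0] * 300                    # no response at all

print("offset case:", compare_gyros(imu0, imu1_with_offset))
print("stuck case: ", compare_gyros(imu0, imu1_stuck))
```

A healthy pair should give a bias and an RMS difference both close to zero; a large bias points to an offset, and a large RMS with a small bias points to an IMU that is not responding.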

Try changing GPS_GNSS_MODE to 5 or 65, since the GPS is a bit overwhelmed and its update rate is affected.
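GPS_GNSS_MODE is a bitmask of constellations, so 5 and 65 each select just two of them. A small sketch of the decoding, with the bit assignments taken from my reading of the parameter documentation (an assumption to verify for your firmware and GPS module):

```python
# Constellation bits as I understand them from the parameter docs (assumption).
GNSS_BITS = {
    0: "GPS",
    1: "SBAS",
    2: "Galileo",
    3: "BeiDou",
    4: "IMES",
    5: "QZSS",
    6: "GLONASS",
}


def gnss_constellations(mode):
    """Return the constellations enabled by a GPS_GNSS_MODE value."""
    return [name for bit, name in GNSS_BITS.items() if (mode >> bit) & 1]


for mode in (5, 65):
    print(mode, "->", gnss_constellations(mode))
```

Fewer constellations means less work per update for the receiver, which is why the update rate can improve.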

Also update to AC 4.2.3 (at least)

Seriously - I’d consider changing the flight controller unless you can confirm it will operate reliably with both IMUs.

Hi Amilcar Lucas. Thanks for your response.
The vibration levels were low before the “EKF3 lane switch 1” message. The figure below shows this.

But I can’t see the vibration levels of the 2nd IMU (VIBE[1] has no values). How can I enable it?

Can you please help me with this?

Hello Shawn. Thanks for your attention.
I think INS checks (i.e. Accelerometer and Gyro checks) were enabled and this doesn’t cause any problem.
INS_ACCEL_FILTER = 20 is the default value. Is the recommended value for this parameter equal to 10?

Yes, recommended value for INS_ACCEL_FILTER has changed to 10.
The default value needs to be updated. 10 is safe to use and there shouldn’t be any adverse effects, in fact probably a slight benefit.

I tried this and got different results in each test.
Sometimes the 2nd IMU reads zero, like this:
https://drive.google.com/file/d/1TMryu4DMpQNHFTGfnvtsuWpefcgjzh-N/view?usp=sharing
https://drive.google.com/file/d/1bPyEOtPemo7d4ozv2FRkxzA4bWjIc1cR/view?usp=sharing

Sometimes there is a large offset between the two IMUs:
https://drive.google.com/file/d/1TFusc-nEWWmLwXF65vQ-K6DosY4mNK-b/view?usp=sharing

Sometimes they work correctly.
I noticed that when I power on the Pixhawk for the first time and the sensors are cold, the second IMU shows wrong values. After a few reboots the problem is partially resolved, and only the offset remains.

Thanks for clarifying.

I don’t understand why position variance caused the lane switch. Both lanes use the same GPS.
Why isn’t the data of the parallel lane checked when changing lanes to avoid these issues?
Why didn’t IMU.GH and IMU.AH change to zero?

Time to replace the “cheap Pixhawk” (as you called it) with a branded Flight Controller.

This is the way to go!
Even disabling the second IMU would still leave doubts in my mind - which part will fail next?

Thank you for your time and explanation.
But I still have questions.
I read the “EKF3 Affinity and Lane Switching” page in the dev documentation, but I don’t understand why position variance caused the lane switch. Both lanes use the same GPS.
Why isn’t the data of the parallel lane checked when changing lanes to avoid these issues?
Why didn’t IMU.GH and IMU.AH change to zero?

I advise you to send that “creative” flight controller hardware to tridge so that he can change the FW to be more resilient.
And forget about it, just buy a new FC.

I will buy a new FC, but how can I avoid these problems?
The user should have been warned by the GCS if the backup EKF lane is unhealthy.

Objectively, which handphone or consumer electronics manufacturer gives the user a heads-up that it is time to forget the device and move on? (Premium coffee machines and expensive printers do, I think.) The exception is Apple, which “purposely” makes you feel it.

So, in my view, if the feature is not available and someone would like to make money from it, one could explore using an onboard companion board to run hardware diagnostics and light up an LED.

The hardware is defective; it was not premium hardware.
Which non-premium handphone or consumer electronics product is repairable nowadays?

Can you replace a defective chip yourself? Probably not.

It is not about the repair.
The problem is actually to diagnose the fault and prevent unwanted consequences.
The user should have been warned by the GCS if the backup EKF lane is unhealthy. Or, before switching lanes, the autopilot should look for an alternative healthy EKF lane with lower error to switch to. Is this not possible?
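To make the idea concrete, here is a toy sketch of the selection rule I have in mind. It is not how ArduPilot’s lane switching is implemented; the `Lane` class, the `error_score` field and the `margin` value are hypothetical, and the `healthy` flag is assumed to come from some independent check such as detecting a flatlined IMU.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Lane:
    index: int
    healthy: bool        # result of an independent sanity check (assumed)
    error_score: float   # lower is better (hypothetical metric)


def pick_lane(lanes: Sequence[Lane], primary: int, margin: float = 0.1) -> int:
    """Return the lane index to use next.

    Only healthy lanes are candidates. The primary is kept unless it is
    unhealthy or a healthy alternative is better by more than margin.
    """
    current = next(lane for lane in lanes if lane.index == primary)
    candidates = [l for l in lanes if l.healthy and l.index != primary]
    if not candidates:
        return primary  # nothing safe to switch to, so stay and warn the user
    best = min(candidates, key=lambda l: l.error_score)
    if not current.healthy or best.error_score + margin < current.error_score:
        return best.index
    return primary


# Situation similar to this flight: lane 1 looks "perfect" but its IMU is dead.
lanes = [Lane(0, healthy=True, error_score=0.8),
         Lane(1, healthy=False, error_score=0.0)]
print("selected lane:", pick_lane(lanes, primary=0))
```

With a rule like this, the unhealthy backup lane would never be selected, and the case where no candidate exists is the point at which the GCS could warn the user.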