I had an incident in flight and cannot find any structural or mechanical problem.
I’m flying a VTOL Y3 Nimbus 1800 with EKF3. After a waypoint I see the values from one of the gyros ramping up, and the Pixhawk 1 did nothing to regain control.
Did one of the gyros develop an offset that EKF3 did not compensate for? Do you see any other reason why the controller failed to correct the problem in flight?
After Waypoint 9 the IMU2 values increase on the X and Y axes, and the crash happens after Waypoint 11.
Can you please help with this forensic analysis so I can learn what really happened?
You can see that the 2nd IMU suddenly diverges from the first by about 40 degrees/second. That must be a failure in the L3GD20 IMU. You were not using the 2nd EKF lane (which uses the 2nd IMU) in this flight, but it turns out to be an important clue. Note that 40 degrees/second is far too large an error for EKF lane two to cope with by learning a bias.
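As a rough illustration of how a divergence like this could be located in exported log data, here is a minimal sketch. It assumes the two gyro traces have already been extracted into plain lists of degrees/second sampled at the same instants. The function name `first_divergence`, the threshold, and the sample values are all made up for illustration and are not taken from this flight log or from any ArduPilot tool.

```python
from typing import Optional, Sequence


def first_divergence(gyro_a: Sequence[float],
                     gyro_b: Sequence[float],
                     threshold_dps: float,
                     min_samples: int = 3) -> Optional[int]:
    """Return the index where the two traces start to differ by more than
    threshold_dps for at least min_samples consecutive samples, or None."""
    run = 0  # length of the current run of over-threshold samples
    for i, (a, b) in enumerate(zip(gyro_a, gyro_b)):
        if abs(a - b) > threshold_dps:
            run += 1
            if run >= min_samples:
                # Report the start of the run, not the sample that confirmed it.
                return i - min_samples + 1
        else:
            run = 0
    return None


# Hypothetical X-axis gyro samples in deg/s (not real log data).
imu1_x = [0.5, 0.4, 0.6, 0.5, 0.4, 0.5, 0.6]
imu2_x = [0.6, 0.5, 0.5, 40.2, 40.5, 39.8, 40.1]

print(first_divergence(imu1_x, imu2_x, threshold_dps=10.0))
```

Requiring several consecutive samples over the threshold keeps a single noisy sample from being reported as a divergence.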
The next key failure is this one:
This is the health status of the first IMU (accel and gyro). You can see that it suddenly goes unhealthy.
At that point the first EKF lane (which normally uses the first IMU) switched to the 2nd IMU. It does this when the IMU it wants to use is unhealthy. So at this point both of your EKF lanes are using the bad gyro which has a large offset.
It manages to stay in the air for a while as the first EKF lane has learnt the “inactive gyro bias” of the 2nd IMU, but the gyro is not good enough for it to fly for long.
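The lane fallback described above can be sketched roughly as follows. This is only an illustration of the selection rule as stated in this thread (use the preferred IMU if it is healthy, otherwise another healthy one), not the actual EKF3 code. The function `select_imu` and the health lists are hypothetical.

```python
from typing import Sequence


def select_imu(preferred: int, healthy: Sequence[bool]) -> int:
    """Pick the IMU index for an EKF lane.

    Use the preferred IMU when it is healthy, otherwise the first other
    healthy IMU, and only stay on an unhealthy IMU when nothing else is
    healthy.
    """
    if healthy[preferred]:
        return preferred
    for idx, ok in enumerate(healthy):
        if ok:
            return idx
    return preferred  # no healthy IMU at all: keep the preferred one


# Before the failure: each lane uses its own IMU.
before = [select_imu(lane, [True, True]) for lane in (0, 1)]

# After IMU 0 is flagged unhealthy: both lanes end up on IMU 1.
after = [select_imu(lane, [False, True]) for lane in (0, 1)]

print(before, after)
```

Note that the health flag in this sketch says nothing about data quality. An IMU that still reports as healthy but has a large gyro offset is selected just the same, which is the situation described in this flight.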
I am running out of time for this analysis before the weekly dev call but will write more later. Basically you had a double IMU failure, but I think ArduPilot could have handled the failures better.
More later.
Thanks for the analysis. Looking forward to the conclusions.
Do you have enough information to tell whether the double IMU failure was caused by the sensor hardware, the communication line, or software? Can anything be done in software to prevent this failure in the future?
Sorry for not coming back to this earlier.
I’ve looked into the invensense driver which is used for your primary MPU6000 gyro. I’ve concluded that the following test must have fired:
That check is to see if any of the critical configuration registers have changed in flight. The config registers of an IMU control its scaling, filtering, etc. When they change while flying it means something is badly wrong with the IMU, so we mark the IMU as unhealthy. An unhealthy IMU will only be used if there are no other healthy IMUs.
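To make the idea of that check concrete, here is a minimal sketch of a configuration-register check. It is not the invensense driver code: the class `RegisterChecker`, the fake device, and the register addresses and values are placeholders chosen for illustration.

```python
from typing import Callable, Dict


class RegisterChecker:
    """Remember the values written to config registers and verify them later."""

    def __init__(self, read_register: Callable[[int], int]) -> None:
        self._read = read_register
        self._expected: Dict[int, int] = {}
        self.healthy = True

    def track(self, reg: int, value: int) -> None:
        """Record the value a config register is expected to hold."""
        self._expected[reg] = value

    def check_all(self) -> bool:
        """Read back every tracked register; any mismatch marks the
        sensor unhealthy. Returns the resulting health state."""
        for reg, expected in self._expected.items():
            if self._read(reg) != expected:
                self.healthy = False
        return self.healthy


# Fake device: a dict standing in for the sensor's register file.
fake_device = {0x1A: 0x03, 0x1B: 0x18}  # placeholder addresses and values

checker = RegisterChecker(lambda reg: fake_device[reg])
checker.track(0x1A, 0x03)
checker.track(0x1B, 0x18)

ok_before = checker.check_all()  # registers match what was written

fake_device[0x1B] = 0x00         # simulate a register changing in flight
ok_after = checker.check_all()   # mismatch detected, sensor marked unhealthy

print(ok_before, ok_after)
```

In this sketch a single corrupted read-back is indistinguishable from a real register change, so one bad bus transfer is enough to mark the sensor unhealthy.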
So I think you had a lot of noise on the SPI bus used for both gyros. That noise caused a bad transfer from the primary IMU, which led the driver to incorrectly mark the IMU as unhealthy.
I’ve opened a PR to try to address the issue:
Ultimately it was a hardware failure in your flight controller, and it needs to be replaced, but this change will make it a bit more likely that the next person to see this error will avoid a crash.