Altitude estimation failure

Hello everyone,

I had an issue a couple of days ago while flying in altitude hold mode. The mission was very simple: it was performed above a flat surface using a downward-pointing sonar as range finder, a Cube Orange, and version 4.0.5. When I was about to land, lowering CH3 (throttle) did not make the drone descend (around time 100 in the plot below).


Figure 1, altitude by barometer and pilot altitude command

The altitude started to increase slightly (yellow line above) even though CH3 was completely down (red line above), and in a desperate attempt we switched to land mode (104 seconds), but the drone kept on climbing. Even switching back to altitude hold mode at about 110 seconds and commanding it to descend did not help. Eventually, at 25 m and about 125 seconds, we engaged the kill switch.

We noticed that the cause was that the EKF failed to estimate the altitude: the innovation for the altitude was huge and kept on growing (NKF3.IPD). In fact the controller thought that the drone was going down instead of up. NKF1.PD (green plot below) shows that the drone thought it was at about -10 m when we switched to land mode; recall that PD is position down, so a positive value indicates negative altitude.
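For readers less used to the NED convention, the sign relationship can be written as a one-line helper. This is only an illustration; the function name is made up and is not part of any ArduPilot tool.

```python
def altitude_from_pd(pd_m):
    """Convert an NED 'position down' value in metres to altitude above the EKF origin."""
    return -pd_m

# A logged PD of +10 m means the filter believes the vehicle is 10 m below its origin.
print(altitude_from_pd(10.0))
```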


Figure 2, EK2 first instance altitude estimation and Kalman filter innovation

I have seen this problem before, and it usually happens when the accelerometers are affected, since they are fused with the barometer for the altitude estimation; we set EK2_ALT_SOURCE to 0 so that the range finder is not used in the estimation. The barometer looks good, as can be seen in Figure 1. Even the second instance of the EKF is estimating the altitude correctly (NKF6.PD, green line in Figure 3):


Figure 3, Comparison of altitude estimation between the two instances of the EK2

The vibrations do not look bad (Figure 4), so I do not know what else can affect the accelerometers.


Figure 4, vibrations

One theory is temperature, but I am not sure. Any comments?
The log can be downloaded from the link below:

Thank you in advance!

I found something interesting that may be the cause of this issue, but I would like a second opinion. It is clear from the data that the barometers are working, but the prediction part of the Kalman filter (the strapdown navigation) is not. This explains why the innovation in PD is huge, considering that I do not have GPS. After checking the Kalman filter equations from https://github.com/priseborough/InertialNav and the source code, it is clear to me that the position in Z is calculated by trapezoidal integration of the speed in Z. The speed in Z is calculated using the accelerometers, the gravity vector, and the rotation matrix, which is derived from the quaternions:

image
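A minimal Python sketch of the vertical prediction step as described in words above. It is not the ArduPilot or InertialNav code: the function names, the (w, x, y, z) quaternion order, and the single-sample integration are assumptions made for illustration.

```python
G = 9.80665  # standard gravity in m/s^2

def down_row(q):
    """Third (down) row of the body-to-NED rotation matrix for a unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    return (2.0 * (x * z - w * y), 2.0 * (y * z + w * x), 1.0 - 2.0 * (x * x + y * y))

def predict_vertical(pd, vd, q, accel_body, dt):
    """One prediction step for position down (pd) and velocity down (vd)."""
    row = down_row(q)
    # Rotate the measured specific force onto the down axis and add gravity back.
    a_down = sum(r * f for r, f in zip(row, accel_body)) + G
    vd_new = vd + a_down * dt
    # Trapezoidal integration of the vertical velocity.
    pd_new = pd + 0.5 * (vd + vd_new) * dt
    return pd_new, vd_new

# Level hover: the accelerometer reads -g on the body z axis, so pd and vd do not change.
print(predict_vertical(0.0, 0.0, (1.0, 0.0, 0.0, 0.0), (0.0, 0.0, -G), 0.01))
```

Only the down row of the rotation matrix enters the vertical channel, which is why the sketch does not build the full matrix.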

The curious thing is that the accelerometers of the two IMUs used by the EKF2 look the same at the moment everything starts drifting, which makes me think they are not the issue:

However, the quaternions do not look the same, and neither do the Euler angles. This is because I do not have a magnetometer enabled for correcting the yaw. Also, I noticed the drifting starts after a jump in the yaw of the primary EK2, which is the one that failed:

This makes the two EK2 instances behave differently, and it may be introducing some error into the vertical velocity through the rotation matrix, and subsequently into the vertical position, which is calculated by trapezoidal integration.

Does it make sense that the altitude estimation is affected by a bad rotation matrix? I know it is not a common problem, since we have flown these drones for months without seeing this issue.
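One way to probe this question numerically is to look at the idealised strapdown step only. In a ZYX Euler parametrisation the down row of the body-to-NED matrix is (-sin pitch, sin roll cos pitch, cos roll cos pitch), which does not contain yaw, so a pure yaw error leaves the predicted down acceleration unchanged while a tilt error does not. The sketch below checks that; the function names and the 5 degree tilt error are made up for illustration, and the real EKF couples states through its covariance, so this does not by itself confirm or rule out the hypothesis.

```python
import math

G = 9.80665  # standard gravity in m/s^2

def quat_from_euler(roll, pitch, yaw):
    """Unit quaternion (w, x, y, z) for a ZYX Euler rotation, angles in radians."""
    cr, sr = math.cos(roll / 2), math.sin(roll / 2)
    cp, sp = math.cos(pitch / 2), math.sin(pitch / 2)
    cy, sy = math.cos(yaw / 2), math.sin(yaw / 2)
    return (cr * cp * cy + sr * sp * sy,
            sr * cp * cy - cr * sp * sy,
            cr * sp * cy + sr * cp * sy,
            cr * cp * sy - sr * sp * cy)

def down_accel(q, accel_body):
    """Down acceleration predicted from a body-frame accelerometer reading."""
    w, x, y, z = q
    row = (2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y))
    return sum(r * f for r, f in zip(row, accel_body)) + G

hover = (0.0, 0.0, -G)  # accelerometer reading of a level, hovering vehicle

# Same tilt, different yaw: the down acceleration is identical.
yaw_effect = (down_accel(quat_from_euler(0.1, 0.2, 1.0), hover)
              - down_accel(quat_from_euler(0.1, 0.2, 0.0), hover))

# True attitude level, estimated attitude pitched by 5 degrees: a spurious down acceleration.
tilt_effect = down_accel(quat_from_euler(0.0, math.radians(5.0), 0.0), hover)

print(yaw_effect, tilt_effect)
```

With these numbers the tilt case gives g(1 - cos 5°), a small but persistent acceleration error that would then be integrated twice.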

This is indeed really weird. I can’t help you with the rotation matrix.

It seems like your EKF doesn’t care about the baro. Have you tried modifying the EKF parameters? Maybe the weight of the baro (EKF_ALT_NOISE) is too low or that of the accelerometer (EKF_ACC_PNOISE) too high for some reason.

Also, your NKF2.AZbias is increasing only on EKF1 and not on EKF2. I don’t know what that means, because it is also increasing on my copter. But I’ve found several other posts similar to yours while looking for an AZbias problem ( 1, 2, old ).

Sorry, I can’t help you much further than that.

Thanks for the report. I suspect it does have something to do with all the compasses being disabled but I don’t see why this should affect the altitude. I think we need to ask @priseborough if he has any idea.

Thank you @rmackay9 and @hubble14567 for your comments! I agree that the barometer is not being used as a correction; this explains the high innovation in the altitude (NKF3.IPD). The innovation, and correct me if I am wrong, is the difference between the predicted value (code pasted in the previous message) and the sensed value, which must be the barometer. The innovation multiplied by the Kalman gain is subtracted from the predicted altitude, according to Extended Kalman filter theory (A great course on the topic), to get the corrected altitude.
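The update described above can be sketched for a single height state. This is a textbook scalar Kalman update, not the EK2 implementation (which fuses height into a full state vector); the names and numbers are made up, and the innovation is taken as predicted minus measured so that the correction is a subtraction, as in the description.

```python
def height_update(pd_pred, var_pred, pd_baro, var_baro):
    """Scalar Kalman update of a predicted position-down with one barometer sample."""
    innovation = pd_pred - pd_baro      # predicted minus measured
    innov_var = var_pred + var_baro     # innovation variance
    gain = var_pred / innov_var         # Kalman gain for a directly observed state
    pd_corr = pd_pred - gain * innovation
    var_corr = (1.0 - gain) * var_pred
    return pd_corr, var_corr, innovation

# Prediction says 10 m below the origin, barometer says 5 m above it, equal variances.
print(height_update(10.0, 1.0, -5.0, 1.0))
```

With equal variances the gain is 0.5, so the corrected value lands halfway between prediction and measurement.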

The big difference between the EK2 instance that fails and the one that does not is that the quaternions are different, since the magnetometers are not helping to correct them, and these are used for the rotation that goes into the prediction of the velocity. I noticed that the velocity prediction in the first instance is indeed drifting, and at a certain point the drift is so bad that the sign is inverted. As can be seen below, it happens after engaging land mode:

Consequently, if the prediction of the velocity is wrong and there is nothing to correct it, since I do not have GPS, the corrected velocity will be wrong too. In turn, the altitude prediction will be wrong, since it is just the trapezoidal integration of the speed. Apparently the error grows so quickly that the Kalman gain multiplied by the innovation is not able to counteract the mismatch between prediction and correction of the altitude.
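To give a feel for how quickly an uncorrected error accumulates, here is a small sketch that integrates a constant vertical acceleration error twice, using the same trapezoidal rule. The 0.1 m/s² error and the 20 s duration are made-up numbers, not values taken from the log.

```python
def integrate_bias(accel_error, duration_s, dt=0.01):
    """Integrate a constant acceleration error into velocity and position errors."""
    steps = round(duration_s / dt)
    vd = 0.0
    pd = 0.0
    for _ in range(steps):
        vd_new = vd + accel_error * dt
        pd += 0.5 * (vd + vd_new) * dt  # trapezoidal integration of velocity
        vd = vd_new
    return vd, pd

vd_err, pd_err = integrate_bias(0.1, 20.0)
print(f"velocity error {vd_err:.2f} m/s, position error {pd_err:.2f} m")
```

The position error follows 0.5·a·t², so it grows quadratically with time when nothing corrects the velocity.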

This is another log with a similar issue, though on this one both instances of EK2 failed:

Pretty much the same, when the yaw is drifting badly, the innovation starts to accumulate:

The GPS is not needed by the EKF to estimate the speed, it uses both the accelerometer and the baro to calculate the velocity. I flew quite a while with and without GPS and looked pretty closely at VD (speed down) and PD (position down) for a problem I had in AltHold. In the end, PD and VD were working fine without GPS speed.

In my case the GPS readings were wrong, but it only impacted VD and NOT PD. This doesn’t seem to be your case at all because your PD is wrong.

If you think it somehow comes from the GPS that you do not have, try EK2_GPS_TYPE = 1. This will tell the EKF not to use the GPS for vertical speed.

I can’t help you with the quaternion stuff but it could be the issue.

@hubble14567, the parameter EK2_GPS_TYPE was set to 3, which disables GPS use completely. We mainly do this because we fly indoors. Certainly the velocity in Z is predicted, but it is not corrected, since there is no direct measurement of the velocity. With a GPS, we would have a Z speed coming from it directly. What I mean by correction of the velocity comes from the EKF algorithm, which has two parts: prediction and correction. In this case, the prediction of the speed is done using the accelerometers, the gravity vector, the time elapsed since the last iteration, and a rotation matrix. The rotation matrix comes from the orientation, which is represented by the quaternions; these are of course converted to Euler angles, as you can see in the logs.

My point is that something in the prediction of the velocity and the prediction of the altitude is going wrong; the prediction of the altitude is the numerical integration of the speed in Z. Since I do not see any error in the accelerometers when comparing the two EK2 instances, and considering the limitation of not having a compass to correct the orientation, I wonder if this is the problem in this particular case. This is not a common problem, maybe something that we have seen once every 200 flights, but when it occurs, the drone flies away and not even land mode works, since it requires a good altitude and a good Z speed.

Yeah, I understood that, sorry to have made you repeat yourself. I’ve never looked at the EKF closely; I simply had issues with it before. I have no idea what the issue is now. @priseborough seems to be the knowledgeable person on the subject.

@hubble14567 don’t worry, I appreciate your comments. Maybe I am wrong about what I think is happening. I would like to test whether changing the weight of the barometer as you suggested helps, but the difficult thing is knowing whether it solves the issue. Part of my frustration is that I am not able to replicate the exact conditions in which this happens. @priseborough looks like the right person to answer some of these questions.

I’ve also experienced the same issue as @sbaccam. @priseborough I’m very interested to hear what you think and if there are any potential solutions.

Hello everyone,

@rmackay9, @hubble14567, @a_ashur and @priseborough, I may have found something that was part of the issue, though it will not solve it in all scenarios. Checking the source code (the Copter-4.0 branch and the master branch), I found a reason why the core selection between EK2 instances may not have worked in the first log I shared (log_45_UnknownDate.bin), even though the innovation of the altitude was significantly higher in the primary instance.

In the EK2 code a score is calculated for the primary instance on this line. That score should have been huge in my case since it takes into account the hgtTestRatio which is proportional to the innovation of the height estimation:

image

However, the score is never calculated and is always 0.0f, as shown above, because yawAlignComplete was probably never true: I do not have a yaw alignment source, since the compasses are disabled.

Thus, the logic for selecting a different core may never have run. This logic only runs if this condition is met. The EK2 can still be considered healthy, which is the other part of this condition, since the healthy method requires position, velocity, and height all to be unhealthy, as shown below:

The other thing is that the barometer data is not going to be fused, since it is very different from the IMU-based prediction, so hgtHealth will be false and, after a certain point, the barometer is not going to help the altitude estimation:
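The rejection behaviour described here is the usual innovation consistency gate. A generic sketch follows, assuming the gate is expressed as a number of standard deviations; the function name, the gate_sigma parameter, and the numbers are hypothetical, and any timeout or reset logic in the real filter is not modelled.

```python
def height_gate(innovation, innov_var, gate_sigma):
    """Return (test_ratio, accepted) for a height innovation and its variance."""
    test_ratio = innovation ** 2 / (gate_sigma ** 2 * innov_var)
    accepted = test_ratio < 1.0  # the measurement is fused only inside the gate
    return test_ratio, accepted

print(height_gate(1.0, 1.0, 5.0))   # small innovation: inside the gate
print(height_gate(15.0, 1.0, 5.0))  # large innovation: rejected
```

Once the prediction has drifted far enough, every barometer sample falls outside the gate, which matches the runaway behaviour described in the post.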

In conclusion, and please correct me if I am wrong, the EK2 did not switch to a healthier instance in the log_45, because:

  1. The error score is always zero if there is no yaw alignment, and there is none since there are no compasses.
  2. The EK2 needs all three conditions, velTestRatio > 1 && posTestRatio > 1 && hgtTestRatio > 1, to be considered unhealthy, and it looks like at least one ratio was not higher than 1.
  3. The barometer stopped being fused after the difference between the barometer altitude and the altitude from the prediction part of the Kalman filter reached a certain point. This only led to more error accumulation, since in this case the barometer is actually more reliable than the IMU and it would have helped to avoid the bad estimation.
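The first two points can be modelled in a few lines. This sketch follows only the description in this post, not the actual ArduPilot source: the class and function names are hypothetical, the real score combines more terms than the height ratio, and the real health check has additional conditions.

```python
from dataclasses import dataclass

@dataclass
class CoreStatus:
    yaw_align_complete: bool
    vel_test_ratio: float
    pos_test_ratio: float
    hgt_test_ratio: float

def error_score(core):
    """Score used to compare cores; stays at zero until yaw alignment completes."""
    if not core.yaw_align_complete:
        return 0.0
    # Only the height term named in the post is modelled here.
    return core.hgt_test_ratio

def is_healthy(core):
    """Unhealthy only when velocity, position and height all fail their tests."""
    return not (core.vel_test_ratio > 1
                and core.pos_test_ratio > 1
                and core.hgt_test_ratio > 1)

# A core with a huge height test ratio but no yaw alignment (made-up numbers).
primary = CoreStatus(False, 0.2, 0.3, 50.0)
print(error_score(primary), is_healthy(primary))
```

In this model the failing core scores zero and still counts as healthy, so nothing triggers a switch to the other instance.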

It seems to me that the redundancy will only work if we have a yaw correction source, and I wonder if there is a way to keep on fusing the barometer after the innovation has reached a certain point. We have found that, though noisy, the barometer is more reliable than the altitude predicted by integrating the velocity.

@sbaccam,

This seems like really good analysis.

Hello everyone! @rmackay9 @sbaccam @priseborough, I think I have quite a similar issue to the one @sbaccam is facing. I can see above that he is trying to work out what is actually going on inside the EKF, so I think it is better to share my problem here than to start a new thread, because I can see the same patterns in my case.

Build Setup:

  • 5 inch quad, 350 grams weight with 2S Li-ion battery
  • Omnibusf4pro flight controller
  • Running ArduCopter v4.0.4 with slight modifications ( backported 2 PRs from master )
  • Using M8P RTK GPS
  • Using EKF3
  • Using GPS as EK3_ALT_SOURCE

I am actually trying to reproduce this issue. I could also share some crash logs from different quadcopters from a few days ago, but I don’t think crash logs motivate developers, so this time I am trying to reproduce the issue. I have multiple quadcopters with the same build setup as mentioned above, and this issue arises from time to time on different quadcopters.

Testing Setup:
Reproducing this issue is very simple: I kept multiple quadcopters powered on for long periods on the ground in an open space, rebooting them at intervals. It does look like this issue arises with time. It does not arise every time, though; it was seen on 2 of the 6 quadcopters that I was bench testing for longer durations.

I was rebooting the quadcopters every 15-20 minutes using hal.scheduler->reboot(false) triggered by an RC switch, not manually, so these are soft reboots only; the battery was not removed for the whole duration of the bench testing.

Logs:
I am sharing 3 consecutive logs from the same quadcopter on which this issue arose. The issue can be seen only in the first log; everything seems all right in the other two. I can share many more logs if anyone requests them.

00000013.BIN
In the first log, this misbehavior can be observed around 12 minutes after the soft reboot. This quadcopter had already been powered on for 30 minutes before this log started after the soft reboot.

00000014.BIN
00000015.BIN
These next 2 logs are consecutive logs from the same quadcopter after the following soft reboots, and everything related to EKF estimations and innovations seems all right in them. My guess from these 2 logs is that this issue might not be related to time and temperature, since the quadcopter was logging for around 45 minutes in the last one.

Folder with above logs

Observations:
I might be wrong with some of these observations. I can see the EKF velocity and position estimations and innovations getting disturbed when this issue starts to arise, but I cannot find the source of the problem, because the raw baro, IMU and GPS data seem to be all right before and after the issue starts. I don’t understand why the EKF position and altitude estimations have a direct relationship with variations in XKF3.IPN, IPE and IPD in the first place. I also have the harmonic notch filter activated on this particular quadcopter, as suggested by @andyp1per for smaller quadcopters in my older post. My guess right now is that the issue comes only from the EKF estimation side and is not related to any hardware, sensor or temperature, because I have seen it from time to time on different builds and with different flight controllers that run only 1 IMU.

For now, I am simply going to implement a custom pre-arm check based on the values of XKF3.IPN, IPE and IPD to avoid any mishaps in the future, but I really want to solve this issue. I would be glad to work on possible solutions and to run test cases based on any suggestions from developers and other people.
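The logic of such a check could look like the sketch below. It is written in Python only to show the idea, for example for replaying logged values offline; an on-board check would live in the firmware. The function name and the 1 m limit are hypothetical and would need tuning against healthy logs.

```python
def prearm_innovation_check(samples, limit_m=1.0):
    """Check recent (IPN, IPE, IPD) position innovations, in metres, against a limit.

    Returns (ok, worst), where worst is the largest absolute innovation seen.
    An empty sample list fails the check, since no data is not evidence of health.
    """
    worst = max((max(abs(v) for v in s) for s in samples), default=float("inf"))
    return worst <= limit_m, worst

print(prearm_innovation_check([(0.1, -0.2, 0.3), (0.0, 0.1, -0.4)]))
print(prearm_innovation_check([(0.1, 0.2, -4.0)]))
```

Using the worst value over a window of samples, instead of only the latest one, avoids passing the check on a momentary zero crossing of a diverging innovation.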

@sbaccam @a_ashur @hubble14567 check this out, guys, it seems related to our issues as well

I can confirm from my end that this PR has solved my issue.