Drone keeps on climbing in guided mode despite asking for descent

We are operating large outdoor drone swarms with ArduCopter, using guided mode with an onboard computer. Every now and then, we are running into an issue where one of the drones keeps on climbing despite the logs confirming that we are sending downward pointing velocity vectors in guided mode. Switching back from guided mode to landing I have spent quite a bit of time today trying to analyze the logs and figure out what goes wrong, but I’d appreciate some help.

An example log is posted here: https://plot.dron.ee/AOD5

The best way to visualize the problem is to plot NKF1.VD vs AHR2.Alt - note that AHR2.Alt is consistently increasing while the EKF “thinks” that it is descending at 1 m/s (which is what we asked for). Increasing the magnitude of the guided mode velocity vector to 2 m/s makes the drone descend slightly, but nowhere near the desired speed. GUID.vZ and NKF1.VD move together, as if the EKF “thinks” that it is following the guided mode instructions accurately. NKF1.VD is clearly wrong - the Z velocity estimated from the GPS alone (say, GPS.VZ) is much closer to reality.
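For anyone who wants to put a number on this mismatch rather than eyeball the plot, here is a minimal Python sketch. It assumes the AHR2.Alt samples (with timestamps in seconds) and the NKF1.VD samples for the same time window have already been extracted from the log into plain lists; the function names are made up for illustration and are not part of any ArduPilot tool.

```python
def climb_rate(times_s, alts_m):
    """Least-squares slope of altitude over time in m/s (positive = climbing)."""
    n = len(times_s)
    if n < 2 or n != len(alts_m):
        raise ValueError("need at least two matching time/altitude samples")
    t_mean = sum(times_s) / n
    a_mean = sum(alts_m) / n
    num = sum((t - t_mean) * (a - a_mean) for t, a in zip(times_s, alts_m))
    den = sum((t - t_mean) ** 2 for t in times_s)
    if den == 0:
        raise ValueError("all timestamps are identical")
    return num / den


def vertical_velocity_mismatch(times_s, alts_m, ekf_vd_mps):
    """Compare the EKF's down velocity with the rate implied by altitude.

    Returns (observed_vd, mean_ekf_vd, difference), all in m/s with
    positive = down (NED convention, as for NKF1.VD).
    """
    if not ekf_vd_mps:
        raise ValueError("no EKF velocity samples")
    observed_vd = -climb_rate(times_s, alts_m)  # altitude rises => negative VD
    mean_ekf_vd = sum(ekf_vd_mps) / len(ekf_vd_mps)
    return observed_vd, mean_ekf_vd, mean_ekf_vd - observed_vd
```

A large positive difference means the EKF reports a descent that the altitude trace does not show.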

The logs also show quite a bit of vibration on the Z axis (VIBE.VibeZ), but this issue is happening randomly (not consistently), and not always with the same drone, so I’m not sure whether the vibration is to blame here. It happens roughly once every 150-200 flights (but we are flying with multiple drones at the same time).

I have also noticed a consistent offset between the AccZ measurements of IMU and IMU2, but the bias seems to be compensated in the EKF (judging from NKF2.AZBias), so I’m not sure whether that’s something to worry about.

(I’ll try to attach a plot as soon as I’ve figured out how).

Firmware version: Copter 3.6.7

Thanks in advance for all your help!

Update: so here’s the plot:

Guided mode was turned off some time between 19:55 and 19:56. The section where we experimented with 2 m/sec descent instead of 1 m/sec is at 19:53.

It seems that the IMU1 accelerometer either does not initialize correctly or is faulty. Even on the ground it shows -8.5 m/s/s.
With the prearm checks enabled (they are there for a very good reason), you would get an inconsistent accelerometers prearm failure, which prevents flying with a faulty sensor.
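A rough Python sketch of this kind of sanity check on logged data, assuming the IMU.AccZ and IMU2.AccZ samples from a period where the drone sits level on the ground are already available as lists. It only looks at the Z axis, which is a simplification of what the firmware does, and the 0.75 m/s/s threshold is an assumed default for illustration, not a value taken from this thread.

```python
GRAVITY_MSS = 9.80665  # standard gravity; a level IMU at rest should read about -g on Z


def _mean(values):
    if not values:
        raise ValueError("no samples")
    return sum(values) / len(values)


def check_accel_z_at_rest(imu1_accz, imu2_accz, max_diff_mss=0.75):
    """Summarize how far each accelerometer's Z reading is from -g at rest.

    max_diff_mss is a hypothetical tolerance for the IMU1/IMU2 difference.
    """
    m1 = _mean(imu1_accz)
    m2 = _mean(imu2_accz)
    diff = abs(m1 - m2)
    return {
        "imu1_error": m1 + GRAVITY_MSS,  # 0 would be a perfect reading
        "imu2_error": m2 + GRAVITY_MSS,
        "difference": diff,
        "consistent": diff <= max_diff_mss,
    }
```

With IMU1 reading around -8.5 m/s/s and IMU2 close to -9.8 m/s/s, the difference is about 1.3 m/s/s and the check reports the pair as inconsistent.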

I don't think that the EKF compensates for such a high bias. If you check the PD innovations, they are constantly high during the flight.
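One way to make "constantly high" concrete is to summarize the innovation series. The Python sketch below assumes the down-position innovations have been extracted into a list of values in metres (in the NKF logs this is assumed to be the NKF3.IPD field); the 1.0 m threshold and the function name are hypothetical and chosen only for illustration.

```python
def innovation_summary(innov_m, threshold_m=1.0):
    """Return (mean, fraction_above_threshold, one_sided) for an innovation series.

    A healthy filter should show innovations scattered around zero; a mean
    far from zero with every sample on the same side suggests an
    uncompensated bias rather than noise.
    """
    n = len(innov_m)
    if n == 0:
        raise ValueError("no innovation samples")
    mean_innov = sum(innov_m) / n
    above = sum(1 for v in innov_m if abs(v) > threshold_m)
    one_sided = all(v > 0 for v in innov_m) or all(v < 0 for v in innov_m)
    return mean_innov, above / n, one_sided
```

Running the same summary on a normal flight and on a faulty flight of the same drone gives a direct comparison.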

Did you compare this with the flight log of a successful flight from the same copter? Does IMU1.AccZ show this error during that flight as well?

Thanks for the response and good point about the prearm checks - I’ll ask around why we have them disabled. I know that there’s another prearm check layer in our companion computer but it probably doesn’t check everything.

Anyway, I have checked a few other logs from the same drone earlier this afternoon; the bias and the difference were still there consistently but otherwise it flew just fine. (I’ll check the PD innovations in a normal and a faulty flight later tomorrow).

Could it be the case that it’s mostly using the IMU2 Acc in these cases? Is there a parameter that would tell which accelerometer the EKF is using or how it is weighing them? The documentation seems to refer to a parameter named “Ratio” in the EKF2 log message (here: http://ardupilot.org/dev/docs/extended-kalman-filter.html ), but as far as I know the EKF1-4 log messages were superseded by the NKF1-10 messages in some earlier version and I can’t seem to find anything similar in the NKF messages.

By default, the code always starts with the first EKF lane, which is tied to IMU1. It seems that the innovations were not high enough to cause a lane switch (there is no message about one in the log).
The lane in use is logged in NKF4.PI (Primary Index).
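To confirm from a log whether the primary lane ever changed, a small Python sketch along these lines could be used. It assumes the NKF4.PI values and their timestamps (in seconds) have already been read into two lists of equal length; the function name is made up for illustration.

```python
def lane_switches(times_s, primary_index):
    """Return a list of (time, old_lane, new_lane) for every change in the primary index."""
    if len(times_s) != len(primary_index):
        raise ValueError("timestamps and primary index samples must match")
    switches = []
    for i in range(1, len(primary_index)):
        if primary_index[i] != primary_index[i - 1]:
            switches.append((times_s[i], primary_index[i - 1], primary_index[i]))
    return switches
```

An empty result means the EKF stayed on the lane it started with for the whole flight, which would be lane 0 (IMU1) by default.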

I’ll try an accelerometer calibration tomorrow and see if the offset goes away, but I’ll mark this issue solved for the time being as this seems to be a plausible explanation.

Thanks a lot for your help!