Takeoff overshoot issue (3.6-rc10)

moobsen · September 27, 2018, 12:13pm

Hi,
we are running 3.6-rc10 on a PixRacer build (the default NuttX firmware version, nothing custom built), with a TFmini LiDAR. The drone is controlled by an onboard computer running dronekit. The onboard computer arms the drone, does a takeoff, and then flies to some waypoints, basically making the drone do a survey. This works quite well… usually.

Sometimes the drone does not take off normally, but instead spins up the rotors very slowly until it reaches the liftoff point, at which point it shoots up into the air, way above the altitude I specified with dronekit-python. Below are screenshots of the logs for a normal takeoff and an overshoot takeoff.

Overshoot

Normal

What we observed is that the CTUN.Alt starts increasing, without any sensor actually reporting an increase in altitude. I have not looked at the code yet, but expect this to be related to the observed behavior.

Does anyone have a good explanation for this? Is this a bug, or are we doing something wrong?

What I can say for sure is that the altitude I set in dronekit-python seems to be “interpreted freely” at best by the drone. But I think this is an unrelated issue in dronekit. In the long run we will probably have to switch to mavproxy for scripted drone control.

Any help is appreciated.

Full logs:
Overshoot: 18-09-25_17-32-24.bin (248.5 KB)
Normal: 00000090.zip (576.3 KB)

Eosbandi · September 27, 2018, 6:27pm

Those EKF failures right after arming make me cringe…

moobsen · September 27, 2018, 8:10pm

But what could be causing it? As far as I can tell, there was no difference. It was started in the exact same spot, same conditions. 9 out of 10 times it works, sometimes this happens, for no apparent reason.

Anubis · September 27, 2018, 9:02pm

From what I can tell, your accelerometers might need to be recalibrated. The drone thinks it’s ascending while it’s just sitting on the ground, although no altitude sensors indicate that. This leaves the IMUs as the culprit, which are heavily relied on to detect climb rate.

As for the difference between the two flights, I see that there is no difference at all, as far as the sensor and EKF health goes. The difference is how long the drone was sitting on the ground before launching. In both cases, it thinks it is slowly ascending, but in the overshoot case, it was sitting around for much longer - long enough that the EKF altitude consistency checks failed, causing the EKF switching, and long enough for the desired and measured altitude to invert, making the drone want to descend instead of ascend.

Anubis · September 27, 2018, 9:17pm

There’s also a parameter EK2_ALT_M_NSE which lets you change the relative weighting of barometer vs. accelerometers for altitude fusion. You probably shouldn’t need to change this unless you have some special case where your accels are unreliable.

xfacta · September 28, 2018, 8:36pm

Does the flight controller do the accel calibration, cycling red/blue LED, just after power-up?
I was wondering if Boat mode was set, but I couldn’t tell from the parameters. I don’t know enough about it yet; one day soon I’ll go through the source code to educate myself.

moobsen · September 28, 2018, 9:29pm

Hm, I hope I did not mess up with the logs.
But the behavior is definitely not the same. In one case the drone instantly takes off in a controlled manner, as if pulled up by a string. In the other case the motors start slowly and I believe the drone “thinks” it is flying, but it is still on the ground. Then at some point it shoots into the sky, if it doesn’t crash while doing so.

This might very well be related to the IMUs, thank you for the hint, I will look into this. Is there an easy way to tell what triggered the EKF switching?

moobsen · September 28, 2018, 9:32pm

I am very sure that it is in Quad X mode. But I did not know the red/blue blinking at the beginning is the accel calibration.
For the flights described above I actually do not know what the LEDs did; I only have the log files.

xfacta · September 28, 2018, 10:30pm

Boat mode is in Copter, it prevents the accel calibration from running at boot-up for when you’ve got a moving platform.
If you power up and you’re still handling the craft while the accel calibration runs, then it’s best to power off and on again.

Anubis · September 28, 2018, 11:02pm

What I meant by “I see no difference” is that I saw that the same problem exists in both logs, it’s just that it doesn’t always manifest into a symptom. So, in theory, you should be able to reproduce this problem every time if you wait long enough.

moobsen · September 28, 2018, 11:47pm

When the drone is turned on, the companion computer turns on as well. This combined setup will usually take about a minute to “boot”, then it takes off. Therefore I would expect this symptom to either always show, or never show, if it was caused by IMU/accel drift. Or do you think 3 seconds can make a big difference?

As Shawn pointed out, it is very possible that the drone was not completely still in the accel calibration phase. I wasn’t aware of the precise moment this occurred, so I’ll try to reproduce it by moving/shaking the vehicle in the blue/red blinking phase and experimenting with boat mode.

Anubis · September 29, 2018, 12:39am

ArduCopter waits until the accelerometers are “still” before it performs the accel cal, so it’s pretty hard to get it to calibrate when it’s moving (that’s why boat mode was implemented).

You’re right - the time frame is probably not so important, but if you compare CTUN.Alt and SAlt, you will see that Alt is unstable before takeoff in both cases. The bad flight’s altitude was just changing faster, causing both EKFs’ altitude consistency checks (NKF4.SH and NKF9.SH) to exceed 1, making the EKF switch lanes. But this is really just a symptom.
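
To pin down when each lane’s height check tripped, one option is to scan the NKF4.SH and NKF9.SH series for the first value above 1. This is only a minimal sketch: it assumes the samples have already been pulled out of the log as (time, value) pairs, the helper name `first_exceedance` is made up for illustration, and the numbers are invented rather than taken from the attached logs.

```python
def first_exceedance(samples, limit=1.0):
    """Return the (time_s, value) of the first sample above `limit`, or None."""
    for time_s, value in samples:
        if value > limit:
            return time_s, value
    return None


# Hypothetical (time in seconds, SH test ratio) pairs, NOT from the attached logs.
nkf4_sh = [(10.0, 0.2), (20.0, 0.6), (30.0, 0.9), (40.0, 1.3), (50.0, 2.1)]
nkf9_sh = [(10.0, 0.1), (20.0, 0.5), (30.0, 0.8), (40.0, 0.95), (50.0, 1.4)]

# First time each lane's height consistency check went above 1.
crossings = {
    "NKF4.SH": first_exceedance(nkf4_sh),
    "NKF9.SH": first_exceedance(nkf9_sh),
}
print(crossings)
```

Comparing the two crossing times against the moment of the lane switch in the log should show whether the height check is what triggered it.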

The cause is climb rate. ArduCopter relies heavily on the accelerometers to determine climb rate, and in altitude-controlled modes, it’s actually climb rate that is controlled. See this graph of the bad flight:

While the drone was sitting on the ground, it thought it was climbing. When it tried to take off, it set a desired altitude and a desired climb rate. The desired rate was higher than the measured rate, so it spun the motors up a bit to match them. However, as it “climbed,” the desired climb rate slowly decreased as the altitude began to approach the altitude target so that the drone would reach the target without overshooting. It therefore started to spin down the motors to slow its ascent. Unfortunately, it never spun them up enough to actually start flying (it got to ~12% throttle, and it looks like you hover at around 25%). Of course, it continued to “climb” even at 0% throttle, overshooting the altitude target. So the drone just sits there at 0% throttle, thinking that it’s flying off into space and trying to descend.
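
The sequence above can be reproduced with a toy model. This is not ArduCopter’s controller, only a sketch of the reasoning: the estimated altitude drifts upward at a constant phantom rate, a proportional law turns altitude error into a desired climb rate, and throttle integrates the climb-rate error. All gains, the drift rate, and the 5 m target are invented; only the roughly 25% hover throttle comes from the log discussion.

```python
def simulate(drift_rate=0.5, target_alt=5.0, kp=1.0, max_rate=2.5,
             throttle_gain=0.01, dt=0.1, duration=20.0):
    """Toy model of a grounded vehicle whose altitude estimate drifts upward.

    The vehicle is assumed to stay on the ground the whole time, which only
    holds while throttle stays below hover throttle (checked by the caller).
    """
    est_alt = 0.0        # estimated altitude in metres (pure drift here)
    throttle = 0.0       # 0.0 .. 1.0
    peak_throttle = 0.0
    for _ in range(int(round(duration / dt))):
        # Desired climb rate shrinks as the estimate approaches the target.
        desired_rate = max(-max_rate, min(max_rate, kp * (target_alt - est_alt)))
        # The "measured" climb rate is just the phantom drift.
        rate_error = desired_rate - drift_rate
        throttle = max(0.0, min(1.0, throttle + throttle_gain * rate_error * dt))
        peak_throttle = max(peak_throttle, throttle)
        est_alt += drift_rate * dt
    return est_alt, throttle, peak_throttle


HOVER_THROTTLE = 0.25  # approximate hover throttle mentioned above
final_alt, final_throttle, peak = simulate()
print(f"peak throttle {peak:.2f}, final throttle {final_throttle:.2f}, "
      f"estimated altitude {final_alt:.1f} m")
```

With these made-up numbers the throttle peaks well below hover, falls back to zero, and the estimated altitude still ends up above the target, which is the same pattern as in the bad flight.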

Eventually, the EKF realizes that it still has not satisfied its altitude consistency check, so it panics and resets. This sets its altitude to match the barometer/lidar fusion (basically 0), but does not change the altitude target, so it immediately sets a positive climb rate to reach it. That’s why the drone suddenly jumped into the air, although I’ll say it did not handle this transition very gracefully - it set throttle to 100%. Not sure if that’s intended.

xfacta · September 29, 2018, 8:58pm

I looked in the doco. I was saying accel calibration, but really I meant gyro calibration!

http://ardupilot.org/copter/docs/boat-mode.html#boat-mode

Anyway, it doesn’t look like you’ve got boat mode set.
What happens if the onboard computer is taken out of the loop and you use standard RC as a test? Just to make sure the issue is not coming from an external source.

rmackay9 · October 1, 2018, 4:38am

As @xfacta says, there’s some confusion above re accelerometer calibration and gyro calibration. So the blue-red flashing lights soon after startup are the gyro and barometer calibration. Accelerometers are only calibrated when requested by the user. I think the issue is probably not related to the accel calibration.

I think there are a couple of possible causes of the problem:

1. the EKF reset is affecting the altitude estimate and we are not properly handling this (i.e. an AP code bug)
2. a bad sequence of requests from the companion computer.

My guess is that it can at least be avoided by changing the requests sent from the companion computer or GCS. I think the companion computer/GCS is requesting the vehicle fly to particular location before the vehicle has physically gotten off the ground. If the position request is sent a few more seconds after the takeoff command then I think it will work OK.
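
One way to delay the position request is to poll the reported altitude after the takeoff command and only continue once it is close to the target. The helper below is a sketch with a made-up name (`wait_for_altitude`); it takes any callable that returns the current relative altitude so it can be tried without a vehicle. With dronekit-python that callable would typically read `vehicle.location.global_relative_frame.alt`. Note that this gates on the reported altitude, which is the very quantity in doubt in this thread, so an extra fixed delay may still be needed.

```python
import time


def wait_for_altitude(read_altitude, target_alt, fraction=0.95,
                      timeout_s=30.0, poll_s=0.5, sleep=time.sleep):
    """Poll `read_altitude()` until it reaches `fraction` of `target_alt`.

    Returns True once the altitude is reached, False if `timeout_s` elapses.
    """
    waited = 0.0
    while waited <= timeout_s:
        if read_altitude() >= target_alt * fraction:
            return True
        sleep(poll_s)
        waited += poll_s
    return False


# Intended use with dronekit-python (requires a connected vehicle, so it is
# left as comments here; `first_waypoint` is a placeholder):
#
#   vehicle.simple_takeoff(10)
#   if wait_for_altitude(lambda: vehicle.location.global_relative_frame.alt, 10):
#       vehicle.simple_goto(first_waypoint)
```

Returning False on timeout lets the script abort or land instead of sending waypoints to a vehicle that never left the ground.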

If the issue is (1) then I think it could be avoided by turning off EKF lane switching. The EK2_IMU_MASK parameter can be used to disable the 2nd core, details are here on the wiki.
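
For reference, EK2_IMU_MASK is a bitmask with one bit per IMU, and one EKF2 core is started for each selected IMU. The small sketch below builds the value from a list of IMU indices, assuming bit 0 is the first IMU; the helper name `imu_mask` is made up for illustration.

```python
def imu_mask(imu_indices):
    """Build a bitmask with one bit set per zero-based IMU index."""
    mask = 0
    for index in imu_indices:
        mask |= 1 << index
    return mask


single_core = imu_mask([0])    # first IMU only, so a single EKF2 core
dual_core = imu_mask([0, 1])   # first and second IMU, so two cores
print(single_core, dual_core)

# With dronekit-python the value could then be written as a parameter, e.g.
#   vehicle.parameters['EK2_IMU_MASK'] = single_core
# (needs a connected vehicle and usually a reboot to take effect).
```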

moobsen · October 1, 2018, 11:21am

When I fly manually (in Stabilize) everything works as expected.
I haven’t yet seen this behavior when using a GCS, but then again I have not really tested it much, as this is not the intended purpose for the drone.

I am currently working on switching over from dronekit-python to mavproxy on the companion computer, so maybe this will already take care of the issue.

Anubis · October 1, 2018, 6:43pm

Have you tried hand-flying it in Loiter/PosHold or AltHold? Companion computer aside, both EKFs were confused and they both reset, so turning off lane switching wouldn’t have changed the outcome. The incorrect climb rate estimation remains. Flying in an altitude-controlled mode would allow you to reproduce the issue without the companion computer.

But I strongly suspect that it will go away if you recalibrate your accels. It will be easy to test safely by using disarmed logging - you can just have your drone sit on the ground for a bit before and after calibration to compare altitude/climb rate estimates.
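
To put a number on that comparison, one option is to fit a straight line to the logged altitude estimate while the drone sits still and read off the slope as a drift rate. A minimal sketch, assuming the CTUN.Alt samples have already been extracted as time and altitude lists; the data below is invented for illustration, and `drift_rate` is a made-up helper name.

```python
def drift_rate(times_s, altitudes_m):
    """Least-squares slope of altitude vs. time, in metres per second."""
    n = len(times_s)
    if n < 2 or n != len(altitudes_m):
        raise ValueError("need two or more samples, with matching lengths")
    mean_t = sum(times_s) / n
    mean_a = sum(altitudes_m) / n
    num = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, altitudes_m))
    den = sum((t - mean_t) ** 2 for t in times_s)
    if den == 0:
        raise ValueError("all timestamps are identical")
    return num / den


# Hypothetical stationary logs (seconds, metres), NOT from the attached files.
times = [0.0, 10.0, 20.0, 30.0, 40.0]
alt_before_cal = [0.0, 1.0, 2.0, 3.0, 4.0]    # steadily "climbing" on the ground
alt_after_cal = [0.0, 0.1, 0.0, -0.1, 0.0]    # roughly flat

before = drift_rate(times, alt_before_cal)
after = drift_rate(times, alt_after_cal)
print(f"drift before: {before:.3f} m/s, after: {after:.3f} m/s")
```

A drift rate near zero after recalibration would support the accelerometer explanation; a similar drift before and after would point elsewhere.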

goredhawk · December 13, 2018, 5:45am

Hi, the log viewer you used looks nice. What is it called?

moobsen · December 13, 2018, 11:03am

Hi,

If you mean me, I’m using apmplanner2 (on Xubuntu). It’s quite good for log viewing.

http://ardupilot.org/planner2/

Rick is using MissionPlanner, I believe.

Cheers