Copter EKF altitude estimation error due to heavy uart communication load

It’s based on NuttX. Could ChibiOS be a good alternative? I heard that it is not as stable as NuttX.

It’s ArduCopter V3.6.6-rc1. I’ll check my code again, thanks.

If it is NuttX then it is understandable… change to ChibiOS. You will be surprised.
Regarding ChibiOS, it is stable, and NuttX even got removed from 3.7… so change, change, change.

You should not be able to download dataflash logs in-flight. Are you able to?

  1. The dataflash download lag occurs even when the vehicle is not in flight.
  2. Also, while in flight, I used the function gcs().send_text in usercode to check the GPS delta delay.
    My point was that the in-flight GPS delay occurs due to the dataflash log function ‘inside’ the Pixhawk. I did not download dataflash logs in flight…
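
For anyone who wants to run the same kind of check offline, here is a rough Python sketch of a GPS delta-time check. It is not the usercode from this thread: the function name, the 200 ms nominal interval (a 5 Hz GPS) and the 50 ms tolerance are all assumptions for illustration. It just flags every gap between consecutive GPS timestamps that is longer than expected.

```python
def late_gps_samples(timestamps_ms, nominal_ms=200, tolerance_ms=50):
    """Return (index, delta_ms) for each gap that exceeds nominal + tolerance.

    nominal_ms=200 assumes a 5 Hz GPS; both defaults are illustrative only.
    """
    late = []
    for i in range(1, len(timestamps_ms)):
        delta = timestamps_ms[i] - timestamps_ms[i - 1]
        if delta > nominal_ms + tolerance_ms:
            late.append((i, delta))
    return late


# Made-up timestamps: one sample arrives 550 ms after the previous one.
samples_ms = [0, 200, 400, 600, 1150, 1350]
for index, delta in late_gps_samples(samples_ms):
    print(f"sample {index} arrived {delta} ms after the previous one")
```

On the vehicle the equivalent would be comparing the time of the latest GPS message against the previous one and reporting the difference, which is what the gcs().send_text check above was used for.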

It took about two days to build because the wiki was outdated. I’ve posted the issues and my solutions in “Cygwin/Eclipse build issues”.

I’m already surprised by its fast build speed… but a bit disappointed that it does not support the LIDAR-Lite v3 PWM mode, so I have to change the lines to I2C. Anyway, thank you for the suggestion.

Do you have a branch somewhere I could test with?

I only use a local repository, so I do not have any branches. Since I’ve decided to move on to ChibiOS, I have set NuttX aside for a while. Anyway, thank you for your concern :)

Is the multirotor with this problem equipped with a camera gimbal? Maybe a Storm32 control board?

Any additional information here? I have been noticing similar behavior during takeoff. The vertical EKF estimation seems to have larger than normal errors just as the vehicle is armed. The strange thing is that it can not be replicated in SITL simulation so your theory on serial overhead is reasonable. Has anyone else seen this behavior?

So after some additional investigation I have found that on the first arming after a power cycle/Pixhawk reboot, there is a very consistent 0.5 m/s vertical velocity error, which leads to a position error of 1 to 1.5 meters. The position error is not reflected in the barometer altitude but is present in the fused altitude output. This is incredibly repeatable and unavoidable regardless of UART comms load. On all successive arming events, the vertical velocity error is smaller but still present. Interestingly enough, if I turn off all serial interfaces other than the GPS port, these successive vertical velocity errors on arming are smaller or non-existent.
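
As a sanity check on those two numbers, here is a small Python sketch of the arithmetic. It assumes the velocity error is a constant bias that is simply integrated into position, which is a simplification and not how the EKF actually propagates its states. Under that assumption, a 0.5 m/s error would have to persist for roughly 2 to 3 seconds to build up 1 to 1.5 meters. That duration is inferred from the figures above, not measured from any log, and the function names are made up for illustration.

```python
def integrate_position_error(velocity_error_mps, duration_s, dt_s=0.01):
    """Accumulate position error from a constant velocity error (simple Euler sum)."""
    steps = round(duration_s / dt_s)
    position_error_m = 0.0
    for _ in range(steps):
        position_error_m += velocity_error_mps * dt_s
    return position_error_m


def implied_duration_s(position_error_m, velocity_error_mps):
    """How long a constant velocity error must last to produce the position error."""
    return position_error_m / velocity_error_mps


for target_m in (1.0, 1.5):
    duration = implied_duration_s(target_m, 0.5)
    accumulated = integrate_position_error(0.5, duration)
    print(f"{target_m} m needs about {duration} s; integrated: {accumulated:.3f} m")
```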

I have not had a chance to mess with the dataflash logging, since I need it in order to analyze results. That leads me to wonder how you can tell that it is better after turning off dataflash logging if you can’t verify it through log analysis.

I will post some log images when I get a chance, but it is really strange and appears to be independent of NuttX vs ChibiOS. It seems to be present on either firmware build.

I am experimenting with the same firmware version, 3.6.6.

Probably I’m having the same issue.
Have you reported this on GitHub?

No camera gimbals are used. It’s based on a Pixhawk2 Cube and Copter 3.6.6-rc1.

I checked the GPS delta time using the MAVLink console while the dataflash log was turned off. There was no EKF divergence, and five test flights showed no drop-down from sudden EKF altitude divergence while the log was off. Of course this is trial and error, so it may not be enough to prove that the wrong estimation is gone.

Also, as you said, this appears to be independent of NuttX vs ChibiOS. While I was working on ChibiOS, the same EKF altitude divergence problem occurred. This time the overhead came not from UART communication but from SD card reads and writes, even though the usercode is almost the same. I found that whenever the read function (POSIX) failed, the open function (also POSIX) was called again, and too many repeated open calls caused the overhead. Calling open only once and discarding the failed read for that particular sample removed the extra computational cost. The test flights again had no drop-down issues. I’m guessing that ChibiOS somehow messes up SD card file reads and writes alongside dataflash logging, since it does not use a proper POSIX library. (I had no file read failures on NuttX.)
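
To make the “open once, skip the failed sample” idea concrete, here is a minimal Python sketch. It is not the actual usercode (which is C++ on the flight controller); the class name, the fixed sample size and the temporary file are all assumptions for illustration. Python’s os.open and os.read are used because they map closely onto the POSIX calls mentioned above.

```python
import os
import tempfile


class SampleReader:
    """Opens the file once and reuses the same descriptor for every sample."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)  # single open() for the whole run
        self.failed_reads = 0

    def read_sample(self, size):
        """Return exactly `size` bytes, or None if this sample's read failed."""
        try:
            data = os.read(self.fd, size)
        except OSError:
            data = b""
        if len(data) != size:
            self.failed_reads += 1  # drop this sample; do NOT reopen the file
            return None
        return data

    def close(self):
        os.close(self.fd)


# Demo with a 10-byte temporary file: the third 4-byte read comes up short.
tmp_fd, tmp_path = tempfile.mkstemp()
os.write(tmp_fd, b"0123456789")
os.close(tmp_fd)

reader = SampleReader(tmp_path)
for _ in range(3):
    print(reader.read_sample(4))
print("failed reads:", reader.failed_reads)
reader.close()
os.remove(tmp_path)
```

The point is only that a failed read costs one skipped sample instead of another open call, so the cost per loop stays bounded.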

I’m not an RTOS expert, so I can’t do benchmark testing. I can only inspect the results and guess, but it’s still very fishy that overhead caused by either too much communication (NuttX UART communication) or high computational load (ChibiOS file I/O functions) leads to EKF divergence.

If you are also using usercode, why not try to find the main cause of the high computational cost? I’m pretty sure that the stable version with no usercode will perform nicely, if the hardware and parameters are set up correctly.

Nope, but if others keep seeing the same issue, it should be reported.

That makes a lot of sense. Dataflash logging would explain the issues with the first arming event after a power cycle, since that is the point at which the log file is created. Also, since I can’t replicate these vertical velocity and position errors that occur on arming in a SITL simulation, something hardware related seems likely.

Seems like our symptoms are very different since I don’t experience long term divergence, just large instantaneous errors in vertical position and velocity on arming. I will try to get more information in my testing today.

Just as an added note, I did test a theory in which I placed a hal.scheduler->delay_microseconds(n_ms) in the arming code Copter::init_arm_motors(), just before the function returns, to see if it increases the vertical errors, and it does. The longer the delay, the greater the vertical velocity and position errors produced on arming.

I think there is some new delay just after the EKF reset that causes drift in vertical position and velocity. This did not happen when I used ArduCopter 3.5.5.

Also just for info, below I have attached an image of the signature I see on the first arming event of every power cycle

I did not fly, just armed the vehicle. This happens every time I arm after powering off the vehicle. Very repeatable and consistent in magnitudes.

2019-05-06 12-30-26.log (370.2 KB)
Just following up with a log of an arming event that results in vertical pos and vel errors. It is just a single arming with no flight in case anyone wants to investigate

Sorry for the 3 posts in a row, but I just wanted to confirm that the signature I posted in image form above is related to the start of dataflash logging. If I enable logging while disarmed and run the same test, that signature still appears just as the log begins. I will attach the same arming-only log, gathered with log-while-disarmed enabled. You can see that arming still exhibits a much smaller error in vertical velocity and position.


Arming takes place at about 00:30 on the time axis. You can see that there is still a small error in vertical pos and vel, but it is significantly smaller than when the log starts at arming. The signature seen here also gets smaller or disappears if all telem/serial ports are disabled (except the GPS and USB ports). But you can see that the same 1.5 meter vertical pos error exists at the point where the log starts. It also seems to get larger as parameters are downloaded upon connecting to Mission Planner (between 58:00 and 59:00). Once again, this cannot be replicated in SITL. There is definitely some strangeness related to UART load and dataflash logging adversely affecting the vertical EKF estimates.

I have attempted to attach the log but it is too large to include in this post.

Have you tried the stable firmware without any modifications, and did you get the same result? I can’t find the cause from your log. Your accel Z and barometer seem to be fine; only the innovation tells you that the EKF height is wrong. How about changing EK2_GPS_TYPE to 1, which does not use 3D GPS velocity? If it’s not a sensor issue, I suspect the cause is an EKF fusion timing problem triggered by some delay that I can’t identify.

I confirm that I can see this “altitude estimation climb on arming” even in telemetry without enabling logs.

This begins before props start spinning.