EKF2 and EKF3 memory issues

Some further discussion on EKF2 vs EKF3 after today’s dev call. The issue prompting this was the Solo running out of memory on the Pixhawk 2 when both EKF2 and EKF3 are enabled.

The current documentation indicates that EKF3 is there in “ride along” mode. Does this require EK3_ENABLE=1 to be set in order for it to ride along? If so, that would seem to imply that AHRS_EKF_USE would still be set to 2. So AC is using EKF2 for flight, but EKF3 is running in the background for logging and testing. To do this, you would need to leave EK2_ENABLE=1 set as well, since EKF2 needs to be enabled for flight.

This presents a problem if memory is running out. Is my understanding of this correct, or am I all messed up? Does the Pixhawk 2.1 have enough memory for both and it’s just the Solo’s Pixhawk 2 and the Pixhawk 1 that don’t?

You understood it correctly - although I’ll say that I’m not sure memory is the biggest problem (the EKF should give errors if there isn’t enough memory to run it), but CPU will be tasked with more than it can chew. This affects all our current STM32 boards.

In the ride along configuration, what you can do is change the number of cores running in each EKF. The EK2_IMU_MASK and EK3_IMU_MASK parameters define which IMUs get an EKF core created for that version. The default for both is 3, meaning the first and second IMUs each get an EKF core - so when you enable EKF3 as well you are running 4 cores, double the default! What you can do is set EK2_IMU_MASK to 1 and EK3_IMU_MASK to 2, and each version will only have one core running, on different IMUs.
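To make the mask arithmetic concrete, here is a minimal Python sketch. It assumes, as described above, that bit 0 of the mask selects the first IMU and bit 1 the second. The helper names `imus_in_mask` and `total_cores` and the `max_imus=3` limit are made up for illustration and are not part of ArduPilot; the sketch also ignores whether a selected IMU is actually present on the board.

```python
def imus_in_mask(mask, max_imus=3):
    """Return the 1-based IMU numbers whose bit is set in an EKx_IMU_MASK value.

    max_imus is an illustrative assumption, not a value taken from ArduPilot.
    """
    return [i + 1 for i in range(max_imus) if mask & (1 << i)]


def total_cores(ek2_enable, ek2_mask, ek3_enable, ek3_mask):
    """Count EKF cores across both versions; a disabled version contributes none."""
    ek2 = len(imus_in_mask(ek2_mask)) if ek2_enable else 0
    ek3 = len(imus_in_mask(ek3_mask)) if ek3_enable else 0
    return ek2 + ek3


print(imus_in_mask(3))          # [1, 2]  default mask: first and second IMU
print(total_cores(1, 3, 1, 3))  # 4       both versions enabled with default masks
print(total_cores(1, 1, 1, 2))  # 2       EK2 on IMU 1, EK3 on IMU 2
```

With the suggested masks of 1 and 2, the total drops back to the two cores that a default EKF2-only setup runs.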

Is there any disadvantage to having EKF2 running on only one IMU and EKF3 running on only the other IMU? At least I think that’s how I read it. Does that make them unable to compare the two IMUs?

@OXINARF
Hi Francisco,

Since there is no DevCall summary available so far, I am asking here:

I have flown EKF3 without issues. This includes ride along with IMU_MASKS set to 3 as well as EKF3 only. Apart from the high NLon values there were no problems. Since NLon is not a good thing and I guess based on:

what is currently recommended? Not using EKF3 at all until the NLon problem has been solved?

Thanks and cheers,
Thorsten

@Thorsten With which model of Pixhawk? Were you able to do the accel and compass calibrations with everything enabled?

Worth noting, it is blowing up on the Solo’s stock Pixhawk 2. I do not know what the memory and CPU power is on that compared to the current Pixhawk 2.1. So it may be that what works fine on a 2.1 may not work on a 2. Or not. IDK.

@Pedals2Paddles
I was flying with EKF3 ride along on a PixRacer, a DroPix 2.1 and an AUAV-X2. Calibrations went fine on all.
What is the result/consequence of the memory issue?

Well, we’re assuming it’s memory or CPU load. But with EK3_ENABLE=1 and EK2_ENABLE=1, the accel and compass calibrations fail. No messages are given explaining the failure. It just gives up and loses connection to the GS. If I only enable EKF2, everything works flawlessly. This happens regardless of `AHRS_EKF_USE` being set to 2 or 3.

I will repeat the problem this evening and post a dataflash log here and on Github.

Yes, it won’t compare the two IMUs, as that’s done inside each EKF version. So EKF2 only compares its own cores, and EKF3 does the same. Since you would be running only one core in each, there would be nothing to compare.
Another bad point about running the two EKF versions is that each one, separately, tries to run its cores in different loops, but the two versions don’t coordinate with each other, so cores from both versions can end up running at the same time.

Usually there is no problem with the EKF and flights go fine so this shouldn’t be a problem, but I only recommend this configuration if you are going to review the logs and see if both EKFs were logging similar values.

The only recommendation is don’t fly with more than a total of 2 cores - maybe 3 will work too, but that’s not tested. Two cores have been working fine for everyone since 3.4 and, given the CPU capabilities, we should continue that. It also depends on what you have connected to your vehicle; more accessories, more work for the CPU.
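As a rough illustration of that rule of thumb, the Python sketch below counts the configured cores from a parameter dictionary and flags anything above two. The parameter names and the default mask of 3 come from this thread; the function names, the dictionary layout, and treating a missing `*_ENABLE` key as 0 are assumptions made for the example, and it counts mask bits without checking whether the IMU exists.

```python
RECOMMENDED_MAX_CORES = 2  # the "no more than a total of 2 cores" advice above


def count_cores(params):
    """Sum the set bits of each enabled EKF version's IMU mask."""
    total = 0
    for version in ("EK2", "EK3"):
        # Assumption for this sketch: a missing *_ENABLE key means disabled.
        if params.get(f"{version}_ENABLE", 0):
            # Default mask of 3 is the value quoted earlier in this thread.
            total += bin(params.get(f"{version}_IMU_MASK", 3)).count("1")
    return total


def check(params):
    """Return a short message about the configured core count."""
    n = count_cores(params)
    if n > RECOMMENDED_MAX_CORES:
        return f"{n} EKF cores configured; more than {RECOMMENDED_MAX_CORES} is untested"
    return f"{n} EKF cores configured; within the recommended total"


ride_along = {"EK2_ENABLE": 1, "EK2_IMU_MASK": 1, "EK3_ENABLE": 1, "EK3_IMU_MASK": 2}
print(check(ride_along))
print(check({"EK2_ENABLE": 1, "EK3_ENABLE": 1}))  # both versions left on default masks
```

A check like this says nothing about the extra CPU load from accessories, so it is only a first filter.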

@Pedals2Paddles regarding your issues that @Thorsten doesn’t have: let’s not forget you are running this on Solo, where several other things are running that people usually don’t have. Please don’t post issues on GitHub unless they are confirmed bugs or feature requests; we already have an enormous list of open issues.

That’s kind of why I wanted to post it here for discussion first. I’d rather not post anything on Github unless some research has shown it’s a problem to address. The issue may simply be that the Solo’s stock hardware can’t handle that kind of load. In which case, it’s just a matter of documentation so users don’t try to do it. And making sure our parameter set for the Solo is set appropriately. But this situation in general will apply to everyone, not just the brave solos.

When 3.5 is pushed out as a production release, we’re going to have a lot of this if it isn’t documented well for the users. I don’t believe it says anywhere not to just enable both, or that you need to change IMU masks to do it. The logical response users will have is “ohh, this is new and better, so I’m going to turn it on now!” without any other changes or considerations. Whether it’s a Solo, a racer, or a heavy lift octo.

So I think we need to determine what the best practices and scenarios are, document them in the Wiki, and document them briefly in the MP/Tower parameter descriptions. And make sure the release notes for 3.5 tell the user to go read that before pushing buttons.

Now that I’ve said all this work that “we” need to do, I suppose I should volunteer to do it. I’m not an expert on EKF. But I am happy to take on contributing to the Wiki for a lot of this documentation stuff we’ve been talking about lately. The Solo will need a lot, and I’m happy to expand upon that for other aspects like the EKF.

@OXINARF / @Andrew_Tridgell / @rmackay9, How can I get on board for Wiki contributions?

I would say that is illogical. ArduPilot should protect users up to a certain point. We don’t have the manpower to make software that is flexible - not only for users but for developers too - and that also protects the user from doing all kinds of dumb things. There are a lot of ways to break your vehicle by changing parameters without knowing what you’re doing; the EKF is just another.
We have had two EKF versions in 3.4 and I’m not aware of a single person changing it. What we need to do is make sure that in the stable 3.5 version we explain that EKF3 is still in its early days without much field testing, and that only people who know what they are doing should be experimenting with it - I think our release notes in the first release candidate weren’t good enough on that point.

It’s perfectly fine to add more documentation to the wiki, anyone can do it. On every page you have an edit button - you’ll be taken to GitHub to edit it and submit a pull request. Have you looked at the wiki repository? There is also a wiki page explaining the editing process: Wiki Editing Guide — ArduPilot documentation

Agreed. What you’ve described is what I had in mind. I’m probably just over-thinking the explanation. Thank you for the direction on the Wiki. Project for some rainy days coming up!

Hello Mike

It is a Pixhawk flight controller.
Thanks for the documentation.

I still have a few questions: what is the advantage of the EKF running cores on two IMUs? Redundancy? But as you wrote, “Usually there is no problem with the EKF…”