This blog is an introduction to my GSoC Project. I will be working under the mentorship of @tridge and @priseborough. My first Open-Source experience and am looking forward to a great learning and contributing summer.
An Unhealthy EKF standing at the Fork of Lanes in the Jungle of Bad Estimates
What is the EKF
The Extended Kalman Filter (EKF) is ArduPilot’s sophisticated attitude and position estimation system which takes in sensor input from, but not limited to, accelerometer, gyroscope, magnetometer, barometer, GPS, etc. and provides a 24-state estimate of the vehicle. State estimation is how we localize a vehicle in the environment and also keep track of biases of the sensors. The EKF algorithm is an ‘extended’ version of the linear Kalman Filter for non-linear models. Even though being a suboptimal estimator due to the linearization of the state transition and observation model, it is still used as the de-facto standard in Inertial Navigation Systems.
EKF lanes
Modern day vehicles have multiple sensors among which redundant IMUs are extremely common, in some cases built into the autopilot hardware. This practice has several benefits such as, backup in unexpected hardware failure, tolerance for different frequencies of vibration, to name a few. For each available IMU, a unique EKF instance is created by the autopilot. One important thing to note is that not all lanes provide state estimates at the same time step due to CPU job scheduling for load levelling. The amount of CPU time available for EKF updates is controlled so other critical components of the autopilot system do not suffer.
Switching between lanes
Sensors, as empowering as they are in measuring processes and landmarks, are inherently noisy and susceptible to disturbances from the environment. Errors in IMUs can accumulate quickly making the EKF output erroneous beyond safe flight. Now before we trigger a failsafe and decide to land the vehicle, we can look for an alternative healthy EKF lane with lower error to switch to. This is important in mission critical situations where aborting can be detrimental. The key challenge for efficient lane switching is getting the right balance between being too aggressive and switching fast for the slightest of errors and being too slow to not switch even when a better lane is available that could continue the mission. Some situations which would invoke a lane switch decision are -
-
Large change in IMU bias, common in long flights with low-grade sensors
-
Two different barometers, significant drift in one due to high temperature
-
High magnetic interference, different rates of coping in magnetometers
-
Two GPS modules, one drifting badly in height
Current ArduPilot Switching Logic
The lane switching mechanism is a part of the subroutine which updates the EKF filter states and is called based on the SCHED_LOOP_RATE parameter which sets the main vehicle loop rate (there is also an explicit way to invoke a lane switch which we will address later). The core selection mechanism is ready to use after the primary lane has stabilised after initialization. This avoids triggering the switching mechanism due to alignment fluctuations and race conditions as the vehicle boots. The actual lane switching mechanism executes in 2 steps -
Looking for an alternative lane - This is executed when either the current lane’s error score is too high or if its status is unhealthy. Then we check for other lanes which are considered available for selection if and only if their status is healthy and their state estimate has been updated on the current time step.
Switching to a lane - If we have found an eligible alternate core, then we can switch if the current primary is unhealthy or despite being healthy an alternate lane’s error-score is significantly lower (specifically, less than 67%) than the current primary’s error-score.
One clear issue with the above logic is the formation of a blind spot when an alternative lane exists, but has an error score that is just above (or below, as you see it) the required threshold to consider it for switching to, hence the switch not made. As exaggerated it may sound, but this can actually crash a vehicle if the following failsafe does not work correctly, or landing may not be feasible. A lane switch could have saved the day.
Smarter Lane Switching
Well, you know what they say, some man’s problem is another man’s GSoC Project. This summer we aim to robustify the lane switching mechanism by moving beyond fixed thresholds and incorporate more factors like Integrated Errors (accumulated error scores with time, providing more context to switch), state variances (errors in state estimates), sensor innovations (differences between predicted and observed state values), state difference between lanes (to know there is a problem), to name a few.
On a second line of thought, instead of switching the entire lane, it would be more efficient to identify and swap out the particular sensor that is corrupting the active lane. This would be useful when there are sensors of mixed performance classes installed in the vehicle and multiple available EKF lanes (consider a vehicle which has only 1 high-performance GPS and is attached to the primary lane whose barometer has gone bad, hence it would be ideal to switch the barometer rather than the entire EKF lane, so you don’t lose your best GPS). This extends the idea of lanes from IMU to non-IMU sensors and we call it sensor affinity.
During the course of my project, I will also create additional blogs to decode the intricacies of the EKF mechanism built in ArduPilot. This will assist readers in understanding this project better and spike interest in the practical design of an AutoPilot State Estimation system to invite more contributors (cries for help, cryptically).
Stay tuned as I will keep posting logs, results and different simulation setups used in this project and I open this blog to your valuable feedback, suggestions and crash stories.