GSoC 2024 wrapping up: High Altitude Non-GPS Navigation

Hi AP community! With the conclusion of this year’s Google Summer of Code, I’m excited to share my progress on my GSoC 2024 project.

Project Description
The objective of this project was to develop a high-altitude non-GPS navigation technique for UAVs, enabling them to estimate their position and navigate effectively at high altitudes without relying on GPS. The project involved extensive research alongside intensive development: multiple research papers were reviewed, and various computer vision and control filter algorithms were tested. Through these efforts, a semi-stable algorithm was developed that navigates a known environment at high altitude.

Work Progress and Development
Below is a summary of the progress made during the GSoC program:

1. Development of a High-Altitude Non-GPS Navigation Technique Using SIFT Feature Matching and ORB-Based Map Generation
A stable algorithm has been developed that uses SIFT (Scale-Invariant Feature Transform) for real-time feature detection and matching to estimate the UAV’s position. This approach requires a bottom-facing, gimbal-stabilized camera.

The algorithm generates a map by stitching together images from a dataset of terrain images of a known environment. Real-time camera frames are then matched against this map to estimate the current position. A pipeline corrects image distortions and enhances the feature-matching process, ensuring that the UAV’s positional estimates remain accurate and reliable. This includes applying color and geometric corrections and refining image-alignment techniques to compensate for camera lens distortion and varying perspectives. Optimizations are also employed to get the feature-matching output rate required by the EKF (minimum 4 Hz).
The final implementation enables the UAV to hold a stable position in environments where GPS signals are unavailable, significantly improving its capability for high-altitude operations.
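For anyone who wants to experiment, here’s a minimal OpenCV sketch of the matching step. It is not the project’s exact code — file names, thresholds, and the homography-based position readout are illustrative assumptions:

```python
# Minimal sketch: match a live frame against a pre-stitched map with SIFT.
import cv2
import numpy as np

sift = cv2.SIFT_create()
map_img = cv2.imread("map.png", cv2.IMREAD_GRAYSCALE)   # pre-stitched terrain map
map_kp, map_des = sift.detectAndCompute(map_img, None)
matcher = cv2.BFMatcher(cv2.NORM_L2)

def estimate_position(frame_gray):
    """Return the frame centre's pixel position in map coordinates, or None."""
    kp, des = sift.detectAndCompute(frame_gray, None)
    if des is None:
        return None
    # Lowe's ratio test keeps only distinctive correspondences
    matches = matcher.knnMatch(des, map_des, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]
    if len(good) < 10:
        return None
    src = np.float32([kp[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([map_kp[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # Homography with RANSAC rejects outlier matches
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        return None
    h, w = frame_gray.shape
    center = np.float32([[[w / 2, h / 2]]])
    # Project the frame centre into map pixel coordinates
    return cv2.perspectiveTransform(center, H)[0][0]
```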

Here’s a video of the implementation:

2. Optical Flow Method
During the program, efforts were made to implement an optical flow-based algorithm to estimate the UAV’s position relative to its home location. A modified version of the Lucas-Kanade method was implemented to enhance the efficiency of the computed flow vectors. This approach also requires the same camera setup as used in the SIFT-based method.

The modifications include a multi-scale pyramid representation to capture larger movements and outlier-rejection techniques, such as RANSAC, to filter noise from the flow vectors. Upcoming versions of this algorithm will also integrate IMU data to improve position accuracy.
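A rough sketch of this flow computation, assuming OpenCV; window sizes, feature counts, and thresholds are illustrative, and here cv2.estimateAffinePartial2D stands in for the RANSAC outlier-rejection step:

```python
# Sketch: pyramidal Lucas-Kanade flow with RANSAC-based outlier rejection.
import cv2
import numpy as np

lk_params = dict(winSize=(21, 21), maxLevel=3,  # 3 pyramid levels for large motion
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

def average_flow(prev_gray, curr_gray):
    """Return the robust mean flow vector (dx, dy) in pixels, or None."""
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                 qualityLevel=0.01, minDistance=10)
    if p0 is None:
        return None
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None,
                                             **lk_params)
    good_old = p0[status.flatten() == 1].reshape(-1, 2)
    good_new = p1[status.flatten() == 1].reshape(-1, 2)
    if len(good_old) < 8:
        return None
    # estimateAffinePartial2D runs RANSAC internally; its inlier mask
    # discards flow vectors that disagree with the dominant motion
    _, inliers = cv2.estimateAffinePartial2D(good_old, good_new,
                                             method=cv2.RANSAC,
                                             ransacReprojThreshold=3.0)
    if inliers is None:
        return None
    flow = (good_new - good_old)[inliers.flatten() == 1]
    return flow.mean(axis=0)
```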

As soon as the drone takes off, flow is calculated and the data is passed to the firmware using one of two methods (a sketch of both paths follows the list):
a. By feeding the flow values directly to the firmware using the OPTICAL_FLOW MAVLink message.

b. By calculating the position on the offboard computer from heading data and the averaged flow vectors, and feeding it to ArduPilot as vision position estimates.
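Here’s a rough pymavlink sketch of both feed-in paths; the connection string, field scaling, and helper names are assumptions rather than the project’s actual scripts:

```python
# Sketch of the two feed-in paths over MAVLink using pymavlink.
import time
from pymavlink import mavutil

conn = mavutil.mavlink_connection("udpin:127.0.0.1:14550")  # assumed endpoint
conn.wait_heartbeat()

def send_optical_flow(flow_x, flow_y, quality, ground_distance_m):
    # (a) raw flow via the OPTICAL_FLOW message (flow in dezi-pixels)
    conn.mav.optical_flow_send(
        int(time.time() * 1e6),   # time_usec
        0,                        # sensor_id
        int(flow_x), int(flow_y),
        0.0, 0.0,                 # rate-compensated flow (m/s), unused here
        quality,                  # 0-255 confidence
        ground_distance_m)

def send_vision_position(x, y, z, yaw):
    # (b) position computed offboard from heading + averaged flow,
    # fed to ArduPilot as a vision position estimate (NED metres, radians)
    conn.mav.vision_position_estimate_send(
        int(time.time() * 1e6), x, y, z, 0.0, 0.0, yaw)
```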

The above methods have shown promising results when tested on pre-recorded video frames captured by a drone and are currently being implemented in simulation.

Thanks to @timtuxworth for providing these videos

To further improve the algorithm, feature matching and optical flow are fused. This approach maintains the EKF’s accuracy by using optical flow to provide continuous position estimates at a high rate, while feature matching periodically resets the optical-flow position estimate. This prevents the drift that accumulates due to the inherent limitations of optical flow.
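Conceptually, the fusion loop looks something like the sketch below; the class, rates, and reset interval are illustrative, not the implementation:

```python
# Sketch: high-rate flow dead reckoning, periodically re-anchored by
# absolute fixes from feature matching.
import numpy as np

class FusedEstimator:
    def __init__(self, reset_interval_s=5.0):
        self.position = np.zeros(2)      # running (x, y) estimate, metres
        self.last_reset = 0.0
        self.reset_interval = reset_interval_s

    def on_flow(self, velocity_mps, dt):
        # High-rate path: integrate flow-derived velocity
        self.position += velocity_mps * dt

    def on_feature_match(self, abs_position, t):
        # Low-rate path: snap to the feature-matching fix, clearing
        # whatever drift optical flow accumulated since the last reset
        if t - self.last_reset >= self.reset_interval:
            self.position = np.asarray(abs_position, dtype=float)
            self.last_reset = t
```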

c. Location interpolation based on GPS EXIF data
The first implementation (described in point ‘a’) captures the terrain image at the initial takeoff position and uses it as a reference to calculate the local position. This approach leverages the GPS EXIF data from a dataset of known-location images, combined with feature matching, to interpolate the position of an unknown location (the real-time camera frame). When a camera frame is received, the algorithm searches the dataset for images that might contain references matching the unknown frame. The GPS EXIF data from these known images is then used to estimate the position of the unknown frame through triangulation, providing absolute location data (latitude and longitude) for the drone.
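As a rough illustration, the sketch below reads geotags with Pillow and uses a simplified match-count-weighted average as a stand-in for the triangulation step; all helper names and thresholds are assumptions:

```python
# Sketch: EXIF geotag lookup plus a weighted-average position estimate.
import numpy as np
from PIL import Image
from PIL.ExifTags import GPSTAGS

def exif_latlon(path):
    """Read (lat, lon) in decimal degrees from a geotagged JPEG."""
    gps_ifd = Image.open(path)._getexif()[34853]          # 34853 = GPSInfo tag
    gps = {GPSTAGS.get(k, k): v for k, v in gps_ifd.items()}
    def to_deg(vals, ref):
        d, m, s = (float(v) for v in vals)
        deg = d + m / 60 + s / 3600
        return -deg if ref in ("S", "W") else deg
    return (to_deg(gps["GPSLatitude"], gps["GPSLatitudeRef"]),
            to_deg(gps["GPSLongitude"], gps["GPSLongitudeRef"]))

def interpolate_latlon(frame_des, references, matcher):
    """references: list of (descriptors, (lat, lon)) from the geotagged dataset.
    Weights each reference geotag by its match count against the live frame."""
    weights, coords = [], []
    for des, latlon in references:
        matches = matcher.knnMatch(frame_des, des, k=2)
        good = [m for m, n in matches if m.distance < 0.75 * n.distance]
        if len(good) >= 10:
            weights.append(len(good))
            coords.append(latlon)
    if not weights:
        return None
    return tuple(np.average(np.array(coords), axis=0, weights=weights))
```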

While this technique is highly useful for estimating the drone’s absolute coordinates, it is too slow to meet the EKF’s requirements. A more efficient way to use this data is through dead reckoning. ArduPilot provides dead-reckoning capabilities using wind estimates, which allow the copter to maintain stability for up to 30 seconds before becoming highly unstable.

To enhance stability and enable long-duration flight, the calculated absolute location coordinates can be used to reset the EKF position externally via the MAV_CMD_EXTERNAL_POSITION_ESTIMATE command. This maintains the drone’s flight over longer durations and even enables navigation by resetting the EKF periodically, preventing drift over time. As highlighted in this pull request by Tridge, this method can handle much longer delays than normal, whereas alternatives such as vision pose messages would fall beyond the fusion horizon at that lag. Importantly, this reset affects only the position and leaves the velocity untouched.
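A hedged pymavlink sketch of this reset is below; the parameter assignments reflect my reading of the MAVLink spec, so please verify them against the command definition before relying on this:

```python
# Sketch: external EKF position reset via COMMAND_INT.
from pymavlink import mavutil

def reset_ekf_position(conn, lat_deg, lon_deg, time_since_boot_s, delay_s):
    conn.mav.command_int_send(
        conn.target_system, conn.target_component,
        mavutil.mavlink.MAV_FRAME_GLOBAL,
        mavutil.mavlink.MAV_CMD_EXTERNAL_POSITION_ESTIMATE,
        0, 0,                 # current, autocontinue (unused)
        time_since_boot_s,    # param1: when the estimate was taken (assumed)
        delay_s,              # param2: processing/transmission delay (assumed)
        10.0,                 # param3: 1-sigma accuracy in metres (assumed)
        0,                    # param4: unused
        int(lat_deg * 1e7),   # x: latitude (degE7)
        int(lon_deg * 1e7),   # y: longitude (degE7)
        0)                    # z: altitude (unused here)
```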

This method is still undergoing testing, and I will keep this thread updated with further developments.

Findings
Based on the research and development done on this project, some useful findings are listed below:

1. SIFT vs. ORB

  • SIFT proved highly reliable for accurate feature detection and matching, especially in scenarios with significant variation in scale and rotation. However, it comes with a high computational cost and slower performance, which can be a limitation in real-time applications. SIFT is recommended for dynamic environments and setups without a gimbal (see the benchmark sketch after this list).

  • ORB (Oriented FAST and Rotated BRIEF), while faster and less computationally intensive than SIFT, showed lower accuracy in challenging lighting conditions and when dealing with significant distortion. ORB is preferable for static environments with good-resolution, gimbal-stabilized setups.

  • AKAZE offers a balance between speed and accuracy, being more computationally efficient than SIFT and providing better feature matching than ORB in some cases. It could be a good alternative for ORB-based position estimation.

  • Neural-network-based descriptors (e.g., SuperPoint) can offer high accuracy and robustness but require more processing power. These could be explored in future implementations, especially on high-performance UAVs equipped with powerful onboard computers like the NVIDIA Jetson.
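For reference, here is a quick benchmark sketch along these lines, assuming OpenCV and a representative terrain image; the numbers you get will vary with image content and resolution:

```python
# Sketch: compare detector speed and keypoint counts on one test image.
import time
import cv2

img = cv2.imread("terrain_sample.png", cv2.IMREAD_GRAYSCALE)  # assumed sample
for name, det in [("SIFT", cv2.SIFT_create()),
                  ("ORB", cv2.ORB_create(nfeatures=2000)),
                  ("AKAZE", cv2.AKAZE_create())]:
    t0 = time.perf_counter()
    kp, des = det.detectAndCompute(img, None)
    print(f"{name}: {len(kp)} keypoints in {time.perf_counter() - t0:.3f}s")
```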

2. Integration of Optical Flow with Feature Matching

  • Fusing optical flow with feature matching significantly enhances navigation stability. Optical flow alone is effective for high-rate position estimates, but it tends to drift over time. By periodically resetting the position estimate using feature matching, the overall drift can be minimized, making the system more reliable for longer-duration flights.

3. IMU-based Interpolation for position estimation

  • Using IMU-based interpolation helps provide continuous position and orientation estimates between known states by leveraging high-rate IMU data. This approach fills the gaps between more accurate but less frequent position updates from GPS or feature matching, reducing drift and improving overall stability. By integrating IMU measurements, this method ensures smoother navigation and maintains reliable state estimation, especially in environments where direct positioning data is intermittent or unavailable (a simplified sketch follows).
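A heavily simplified sketch of the idea, ignoring bias estimation and attitude handling; all names are illustrative:

```python
# Sketch: dead-reckon between absolute fixes by integrating IMU acceleration.
import numpy as np

class ImuInterpolator:
    def __init__(self):
        self.pos = np.zeros(3)   # NED position (m)
        self.vel = np.zeros(3)   # NED velocity (m/s)

    def on_imu(self, accel_ned, dt):
        # High-rate path: propagate position between fixes
        self.vel += accel_ned * dt
        self.pos += self.vel * dt

    def on_fix(self, pos_ned, vel_ned=None):
        # Low-rate path: re-anchor on a feature-matching (or GPS) fix
        self.pos = np.asarray(pos_ned, dtype=float)
        if vel_ned is not None:
            self.vel = np.asarray(vel_ned, dtype=float)
```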

Future Work

1. Hierarchical Localization Optimization
Incorporating a hierarchical localization approach can further enhance the system’s scalability and efficiency. By using a coarse-to-fine localization strategy, the UAV can quickly narrow down the search area using broader, less detailed information before applying detailed feature matching for precise localization.

It can be implemented as a two-step process: coarse-level localization using a vector of locally aggregated descriptors (VLAD), followed by fine-level localization using SIFT features. This hierarchical approach streamlines the matching process by quickly narrowing down the search area and then applying detailed feature matching, improving both accuracy and computational efficiency.
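A conceptual sketch of the coarse-to-fine flow is below. A real system would use VLAD or a learned global descriptor; here a mean-SIFT descriptor stands in for the coarse stage, and match_sift is a hypothetical fine-stage matcher:

```python
# Sketch: rank map tiles with a cheap global descriptor, then run full
# SIFT matching only against the best few candidates.
import cv2
import numpy as np

sift = cv2.SIFT_create()

def global_descriptor(img):
    _, des = sift.detectAndCompute(img, None)
    return des.mean(axis=0) if des is not None else None

def localize(frame, tiles, top_k=3):
    """tiles: list of (tile_image, tile_descriptor, tile_origin_xy),
    with tile descriptors precomputed offline."""
    q = global_descriptor(frame)
    # Coarse stage: order tiles by global-descriptor distance
    ranked = sorted(tiles, key=lambda t: np.linalg.norm(q - t[1]))
    # Fine stage: detailed matching only on the shortlist
    for tile_img, _, origin in ranked[:top_k]:
        pos = match_sift(frame, tile_img)   # hypothetical fine matcher
        if pos is not None:
            return np.asarray(origin) + pos
    return None
```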

2. Integration of Approximate Nearest Neighbour (ANN) Search Techniques
Implementing ANN search methods such as Product Quantization (PQ) and Inverted File (IVF) indexing can accelerate the matching between real-time camera frames and the pre-computed map. These techniques reduce search complexity and memory requirements, making real-time processing more feasible. Instead of relying on a stitched map, the database stores visual words and descriptors, minimizing memory use.
These ANN methods optimize feature matching by efficiently indexing and retrieving feature vectors through quantization and inverted indexing, enabling much faster searches.
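As an illustration, FAISS exposes exactly this kind of IVF+PQ index; the sizes below are placeholders, and the random arrays stand in for real SIFT descriptors:

```python
# Sketch: approximate nearest-neighbour descriptor search with FAISS IVF+PQ.
import faiss
import numpy as np

d, nlist, m = 128, 256, 16          # dims, Voronoi cells, PQ sub-quantizers
quantizer = faiss.IndexFlatL2(d)    # coarse quantizer for the inverted file
index = faiss.IndexIVFPQ(quantizer, d, nlist, m, 8)  # 8 bits per sub-code

map_descriptors = np.random.rand(100_000, d).astype("float32")  # stand-in data
index.train(map_descriptors)        # learn the cells and codebooks
index.add(map_descriptors)

index.nprobe = 16                   # cells visited per query (speed/recall trade-off)
frame_descriptors = np.random.rand(500, d).astype("float32")
distances, ids = index.search(frame_descriptors, 2)  # 2-NN for a ratio test
```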

Reference to the approach can be found in this thesis. Thanks to @radiant_bee for mentioning this thesis.

I want to extend my thanks to my incredible mentors, @rmackay9 and @tridge, for their invaluable guidance. I also want to thank @rhys for his immense support in the gimbal testing and simulation parts.

Related Pull Requests and Contributions

This project is a work in progress, and I am committed to its successful completion, continuously refining and improving the techniques to achieve reliable high-altitude, non-GPS navigation for UAVs.


Hi @snktshrma, do you know at what altitude the video datasets used for optical flow were captured?
Further, would it be possible to share the dataset?


Hi @radiant_bee! These datasets were captured at an altitude of 75m. I am also testing with a video dataset captured at altitudes above 100m and getting similar results.

Sure! I’ll be making these open-source soon after preliminary testing is completed.

Excellent work @snktshrma, there are tons of wonderful applications for this work. Have you tried running ORB and SIFT in two different processes and combining the results?


Hi @rfriedman! Thanks a lot :)
I have not yet tried running ORB and SIFT simultaneously and combining the results, but I’d love to know more about this approach and will try to implement it on my end.
Is there any benefit to doing this, @rfriedman? What I can think of is that it would give me data at a constant rate from ORB, plus SIFT data I can use to backtrack and correct the errors from ORB, or something like that. Is that so?

Yea exactly. That’s what I had in mind.

Cool! I’ll give it a try and update the results
Thanks for the suggestion @rfriedman

Thanks for sharing! Sorry, I followed the ap_nongps instructions, and in Terminal 2 it shows the message “The parameter file (/home/cjr561535/ardupilot_gazebo_ap/config/gazebo-iris-gimbal-ngps.parm) does not exist”. I can’t find ardupilot_gazebo_ap.

Hi @JR_C! Thanks for pointing that out. Here’s the param file: ardupilot_gazebo_ap/config at gsoc-arena · snktshrma/ardupilot_gazebo_ap · GitHub

I’ll update the same in the repo’s README.

Hi everyone,
Here’s an update to the Optical Flow + Feature Matching approach.

In brief, this method combines an enhanced version of the Lucas-Kanade (LK) optical flow algorithm for high-frequency position estimation with SIFT feature matching. The LK algorithm estimates the movement between consecutive frames, but it drifts over time, especially over long sequences. To correct this, we use SIFT-based feature matching along with a simplified ANN algorithm.

This approach performs very well, even when noise is introduced, as the feature matching corrects the drift over time and keeps it within a manageable range.

However, there are still some limitations to this approach. The current method struggles with handling fast rotations, as the LK algorithm’s performance degrades during fast rotational movements. Also, in cases of significant perspective changes, the drift correction through feature matching is not very effective.

I’ll be releasing the code in the ap_nongps repository as soon as these issues are addressed.

I’d love to hear your feedback on this approach. I’d be happy to discuss any questions or suggestions!

Thank you!

Do you have any idea about this problem? It happens when I follow the instructions for Terminal 1.

I have solved the problem above: I hadn’t installed GStreamer first.
But now, when I run the video_to_feature.py file, it only shows the first row, “Heartbeat from …”.

I don’t know how to solve this.

Hi @JR_C! Sorry that you’re facing problems setting it up!

It seems likely that the problem arises because the script is unable to access the camera frames. Could you please double-check if everything is set up correctly?

Thank you for pointing out these issues! I’ll add these troubleshooting steps to the main repo as well!

I’ll work on creating a bash script to automate the setup process!

MAN! I found it in video_to_feature.py. It gets stuck in this chain:
getComp() → check_takeoff_complete(10) → msg = the_connection.recv_match(type='GLOBAL_POSITION_INT', blocking=True)
Digging into recv_match, I found that the incoming message’s type is LOCAL_POSITION_NED, so it gets stuck in recv_match.
Now I don’t know what to do. Do you have any advice?

Hi @JR_C! Thanks for pointing that out! I am looking into it!

Hi, I wonder why you haven’t started with some VIO framework. Matching against satellite images seems like a problem similar to loop-closure detection. BTW, EKF3 is written in SymForce, which is also capable of serving as a VIO backend. Having the frontend accelerated by the camera (something OAK-like)? That would be fun!


Hi, I solved the problem for now: I changed the position message type from “GLOBAL_POSITION_INT” to “LOCAL_POSITION_NED”, and then everything works out just fine.
Now I want to use my school’s satellite image. I searched the Internet for how to create a ground-plane satellite image with Gazebo and found that the examples’ Gazebo version is too old (I use Harmonic now), and there is no documentation about this. Do you have any advice? How can I create a ground plane model?


Hi @soldierofhell, thanks for the comment!

The VIO framework is a great idea, and I’ve been exploring optical flow approaches, which are somewhat similar. The main difference is that optical flow requires a lot of off-board processing and relies on ArduPilot’s EKF to fuse IMU and optical flow data (as you mentioned with SymForce and the VIO backend). Using cameras like OAK could definitely simplify this process and make real-time odometry estimation more feasible while letting the EKF handle the rest.

I may not be absolutely correct in my understanding of the current approach of optical flow as a VIO, and I’d be happy to discuss further.

This project initially started as an alternative RTL approach for when the drone loses GPS mid-flight and needs to return home. That’s why I began with feature matching for global localization to help guide the UAV back to the home location.

As for loop closure and SLAM on a global scale, I’ve been working on this based on an existing research paper. The challenge is achieving real-time performance. Backtracking satellite images to camera frames for global localization is still feasible with some processing delay, but real-time drift-compensated pose estimation without GPS will take more development and research.

I’d love to hear more of your thoughts and ideas on this.

Thank you.

Hi @JR_C ! Good to know it worked.

As a workaround, I suggest you use USGS EarthExplorer (https://earthexplorer.usgs.gov/) to get a high-quality satellite image of your school area, then modify the arena model I provided, replacing the satellite image there with your school’s.

You’ll find that package here: ardupilot_gazebo_ap/models/arena/meshes/satellite_image.png at gsoc-arena · snktshrma/ardupilot_gazebo_ap · GitHub

Let me know if that works. Either way, let me know the coordinates and I’ll get you the arena model for your school area.

OK, it works, thanks a lot!
