GSoC 2024 wrapping up: High Altitude Non-GPS Navigation

snktshrma · August 25, 2024, 7:20pm

Hi AP community! With the conclusion of this year’s Google Summer of Code, I’m excited to share my progress on my GSoC 2024 project.

Project Description
The objective of this project was to develop a High-Altitude Non-GPS navigation technique for UAVs, enabling them to estimate their positions and navigate effectively at high altitudes without relying on GPS. This project involved extensive research alongside intensive development. Multiple research papers were reviewed, and various computer vision and control filter algorithms were tested. Through these efforts, a semi-stable algorithm was successfully developed to navigate in a known environment at high altitudes.

Work Progress and Development
Below is a summary of the progress made during the GSoC program:

1. Development of a High-Altitude Non-GPS Navigation Technique Using SIFT Feature Matching and ORB-Based Map Generation
A stable algorithm has been developed that uses SIFT (Scale-Invariant Feature Transform) for real-time feature detection and matching to estimate the UAV’s position. This approach requires a downward-facing, gimbal-stabilized camera.

The algorithm generates a map by stitching together images from a dataset of terrain images in a known environment. Real-time camera frames are then matched against this map to estimate the current position. A pipeline corrects image distortions and enhances the feature matching process so that the UAV’s positional estimates remain accurate and reliable. This includes applying color corrections and geometric corrections, and refining image alignment to compensate for camera lens distortion and varying perspectives. Optimizations are also employed so that feature matching delivers output at the rate the EKF requires (a minimum of 4 Hz).
The final implementation enables the UAV to maintain a stable position in environments where GPS signals are unavailable, significantly improving its capability for high-altitude operations.
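To make the link between a matched pixel location and a position estimate concrete, here is a minimal sketch (not the code from the repository) that converts a pixel offset on the map into metres. It assumes an ideal pinhole camera pointing straight down with square pixels; the function names and the example field of view are hypothetical.

```python
import math


def ground_sample_distance(altitude_m, hfov_deg, image_width_px):
    """Metres of ground covered by one pixel for a nadir-pointing pinhole camera."""
    ground_width_m = 2.0 * altitude_m * math.tan(math.radians(hfov_deg) / 2.0)
    return ground_width_m / image_width_px


def pixel_offset_to_metres(dx_px, dy_px, altitude_m, hfov_deg, image_width_px):
    """Convert an image-space offset into metres, assuming square pixels."""
    gsd = ground_sample_distance(altitude_m, hfov_deg, image_width_px)
    return dx_px * gsd, dy_px * gsd


if __name__ == "__main__":
    # Hypothetical numbers: 100 m altitude, 60 degree horizontal FOV, 1280 px wide frame.
    right_m, down_m = pixel_offset_to_metres(120, -40, 100.0, 60.0, 1280)
    print(f"offset: {right_m:.2f} m right, {down_m:.2f} m down in the image")
```

Because the scale depends directly on altitude, any altitude error shows up as a proportional error in the position estimate.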

Here’s a video of the implementation:

2. Optical Flow Method
During the program, efforts were made to implement an optical flow-based algorithm to estimate the UAV’s position relative to its home location. A modified version of the Lucas-Kanade method was implemented to enhance the efficiency of the computed flow vectors. This approach also requires the same camera setup as used in the SIFT-based method.

The modifications include using a multi-scale pyramid representation to capture larger movements and using outlier rejection techniques, such as RANSAC, to filter out noise in the flow vectors. In upcoming versions of this algorithm, IMU data will also be integrated to improve position accuracy.
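The outlier rejection step can be illustrated with a small RANSAC-style sketch. This is a simplified stand-in, assuming the dominant motion between two frames is a pure translation, so a single flow vector is enough to propose a model; the function name, threshold and iteration count are hypothetical and not taken from the project code.

```python
import random


def ransac_mean_flow(flows, threshold=1.0, iterations=50, seed=0):
    """Estimate the dominant (dx, dy) flow while ignoring outlier vectors."""
    if not flows:
        raise ValueError("need at least one flow vector")
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: one vector fully defines a translation hypothesis.
        cx, cy = rng.choice(flows)
        inliers = [(fx, fy) for fx, fy in flows
                   if (fx - cx) ** 2 + (fy - cy) ** 2 <= threshold ** 2]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    count = len(best_inliers)
    # Refine the hypothesis by averaging over the consensus set.
    return (sum(f[0] for f in best_inliers) / count,
            sum(f[1] for f in best_inliers) / count)


if __name__ == "__main__":
    vectors = [(2.0, 1.0)] * 8 + [(2.1, 0.9), (30.0, -20.0), (-15.0, 40.0)]
    print(ransac_mean_flow(vectors))
```

A plain average of the same vectors would be pulled far away from the true motion by the two gross outliers, which is the failure mode this step is meant to avoid.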

As soon as the drone takes off, flow is calculated, and the data is passed to the firmware using one of two methods:
a. By directly feeding the flow values to the firmware using the OPTICAL_FLOW MAVLink message.

b. By calculating the position on the offboard computer from heading data and the averaged flow vectors, and feeding the result to AP as vision position estimates.
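For method b, the core step is rotating a body-frame displacement into north/east using the heading and accumulating it. The sketch below shows only that step, assuming the displacement has already been converted from pixels to metres, that body x points forward and y to the right, and that heading is measured clockwise from north; the function names are hypothetical.

```python
import math


def body_to_north_east(forward_m, right_m, heading_deg):
    """Rotate a body-frame displacement into the north/east frame."""
    yaw = math.radians(heading_deg)
    north = forward_m * math.cos(yaw) - right_m * math.sin(yaw)
    east = forward_m * math.sin(yaw) + right_m * math.cos(yaw)
    return north, east


def integrate_flow(position_ne, forward_m, right_m, heading_deg):
    """Add one flow-derived displacement to the running (north, east) position."""
    d_north, d_east = body_to_north_east(forward_m, right_m, heading_deg)
    return position_ne[0] + d_north, position_ne[1] + d_east


if __name__ == "__main__":
    position = (0.0, 0.0)
    # Fly 1 m forward while facing east, then 1 m forward while facing north.
    position = integrate_flow(position, 1.0, 0.0, 90.0)
    position = integrate_flow(position, 1.0, 0.0, 0.0)
    print(position)
```

The accumulated position relative to home is what would then be sent to the autopilot as a vision position estimate.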

The above methods have shown promising results when tested on pre-recorded video frames captured by a drone and are currently being implemented in simulation.

Thanks to @timtuxworth for providing these videos

To further improve the algorithm, feature matching and optical flow are fused. Optical flow provides continuous position estimates at a high rate, which helps maintain the EKF’s accuracy, while feature matching periodically resets the position estimate derived from optical flow. This prevents the drift that occurs due to the inherent limitations of optical flow.
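The fusion idea can be sketched as a small estimator that dead-reckons from flow and snaps (or blends) to a feature-matching fix whenever one arrives. This is a simplified illustration under the assumption that both sources already report metres in a shared north/east frame; the class and the `weight` parameter are hypothetical.

```python
class FlowWithResets:
    """Dead-reckon from optical flow; correct with feature-matching fixes."""

    def __init__(self, north=0.0, east=0.0):
        self.north = north
        self.east = east

    def add_flow(self, d_north, d_east):
        """High-rate update: accumulate one flow-derived displacement."""
        self.north += d_north
        self.east += d_east

    def apply_fix(self, fix_north, fix_east, weight=1.0):
        """Low-rate update: weight=1.0 replaces the estimate, smaller values blend."""
        self.north += weight * (fix_north - self.north)
        self.east += weight * (fix_east - self.east)


if __name__ == "__main__":
    estimator = FlowWithResets()
    for _ in range(10):
        estimator.add_flow(1.1, 0.0)  # flow over-reports each 1.0 m step by 10%
    print("before fix:", estimator.north)
    estimator.apply_fix(10.0, 0.0)    # feature matching reports the true position
    print("after fix:", estimator.north)
```

Between fixes the error grows with distance travelled, so the fix rate determines how large the drift can get.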

c. Location interpolation based on GPS EXIF data
The first implementation (described in point ‘a’) captures the terrain image at the initial takeoff position and uses that as a reference to calculate the local position. This approach instead leverages the GPS EXIF data from a dataset of images with known locations, combined with feature matching, to interpolate the position of an unknown location (the real-time camera frame). When a camera frame is received, the algorithm searches the dataset for images that might contain references matching the unknown frame. The GPS EXIF data from these known images is then used to estimate the position of the unknown image through triangulation, providing absolute location data (latitude and longitude) for the drone.
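As a rough illustration of how known image locations can yield an absolute position, the sketch below shifts a reference image’s latitude/longitude by a metric offset and then combines several such candidates. Note the assumptions: it uses a flat-earth approximation valid for small offsets, and it replaces the triangulation step with a simple match-count weighted average, which is a stand-in rather than the method used in the project. All names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS-84 equatorial radius


def offset_lat_lon(lat_deg, lon_deg, north_m, east_m):
    """Shift a latitude/longitude by a small metric offset (flat-earth approximation)."""
    d_lat = math.degrees(north_m / EARTH_RADIUS_M)
    d_lon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return lat_deg + d_lat, lon_deg + d_lon


def weighted_position(candidates):
    """Combine (lat_deg, lon_deg, weight) candidates, e.g. weight = number of matches."""
    total = sum(weight for _, _, weight in candidates)
    lat = sum(lat * weight for lat, _, weight in candidates) / total
    lon = sum(lon * weight for _, lon, weight in candidates) / total
    return lat, lon


if __name__ == "__main__":
    # Two reference images each imply a position for the current frame.
    first = offset_lat_lon(-35.3632, 149.1652, 25.0, -10.0) + (40,)
    second = offset_lat_lon(-35.3630, 149.1650, 3.0, 8.0) + (15,)
    print(weighted_position([first, second]))
```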

While this technique is highly beneficial for estimating the absolute coordinates of the drone, it is too slow to meet the requirements of the EKF. A more efficient way to utilize this data is through dead reckoning. ArduPilot provides dead reckoning capabilities using wind estimates, which allows the copter to maintain stability for up to 30 seconds before becoming highly unstable.

To enhance stability and support long-duration flight, the calculated absolute location coordinates can be used to reset the EKF position externally using the MAV_CMD_EXTERNAL_POSITION_ESTIMATE command. This approach helps maintain the drone’s flight over longer durations and even enables navigation by resetting the EKF periodically, preventing drift over time. As highlighted in this pull request by Tridge, this method can handle much longer delays than normal, whereas with alternative methods such as vision pose estimates the lag would fall beyond the fusion horizon. Importantly, this reset method only affects the position and leaves the velocity unaffected.
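The sketch below shows how the fields of such a command might be assembled before being sent as a COMMAND_INT. The parameter layout (param1 transmission time, param2 processing time, param3 accuracy, x/y latitude and longitude in degrees × 1e7, altitude unused and sent as NaN) reflects my reading of the MAVLink definition and should be checked against the dialect in use. The helper name is hypothetical, and the actual send call is left out on purpose.

```python
import math


def external_position_estimate_fields(lat_deg, lon_deg, transmission_time_s,
                                      processing_time_s=0.0,
                                      accuracy_m=float("nan")):
    """Assemble COMMAND_INT fields for MAV_CMD_EXTERNAL_POSITION_ESTIMATE.

    The layout is an assumption based on the MAVLink definition; verify it
    against the dialect used by your firmware before relying on it.
    """
    return {
        "command": "MAV_CMD_EXTERNAL_POSITION_ESTIMATE",
        "param1": float(transmission_time_s),  # sender timestamp, seconds
        "param2": float(processing_time_s),    # image processing delay, seconds
        "param3": float(accuracy_m),           # 1-sigma accuracy, NaN if unknown
        "param4": 0.0,                         # unused
        "x": int(round(lat_deg * 1e7)),        # latitude, degrees * 1e7
        "y": int(round(lon_deg * 1e7)),        # longitude, degrees * 1e7
        "z": float("nan"),                     # altitude is not used
    }


if __name__ == "__main__":
    fields = external_position_estimate_fields(
        -35.3632621, 149.1652374, transmission_time_s=12.5,
        processing_time_s=0.8, accuracy_m=10.0)
    print(fields)
```

Reporting the processing delay matters here because slow feature matching means the fix describes where the vehicle was, not where it is now.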

This method is still undergoing testing, and I will keep this thread updated with further developments.

Findings
Based on the research and development done on this project, some useful findings are listed below:

1. SIFT vs. ORB

SIFT proved to be highly reliable for accurate feature detection and matching, especially in scenarios with significant variation in scale and rotation. However, it comes with a high computational cost and slower performance, which can be a limitation in real-time applications. SIFT is highly recommended in dynamic environments and in setups where a gimbal is not present.
ORB (Oriented FAST and Rotated BRIEF), while faster and less computationally intensive than SIFT, showed lower accuracy in challenging lighting conditions and when dealing with significant distortions. ORB is preferable in static environments with good image resolution and gimbal-based setups.
AKAZE offers a balance between speed and accuracy, being more computationally efficient than SIFT and providing better feature matching than ORB in some cases. It could be a good alternative to the ORB-based position estimation.
Neural Network-based Descriptors (e.g., SuperPoint) can offer high accuracy and robustness but require more processing power. These could be explored for future implementations, especially in high-performance UAVs equipped with powerful onboard computers like Jetson.

2. Integration of Optical Flow with Feature Matching

Fusing optical flow with feature matching significantly enhances navigation stability. Optical flow alone is effective for high-rate position estimates, but it tends to drift over time. By periodically resetting the position estimate using feature matching, the overall drift can be minimized, making the system more reliable for longer-duration flights.

3. IMU-based Interpolation for position estimation

Using IMU-based interpolation helps provide continuous position and orientation estimates between known states by leveraging high-rate IMU data. This approach fills the gaps between more accurate but less frequent position updates from GPS or feature matching, reducing drift and improving overall stability. By integrating IMU measurements, this method ensures smoother navigation and maintains reliable state estimation, especially in environments where direct positioning data is intermittent or unavailable.
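A minimal sketch of the propagation step between two fixes is shown below. It assumes the acceleration has already been rotated into the navigation frame and gravity-compensated, and it treats acceleration as constant over each IMU interval; this is illustrative only, since in practice the EKF performs this propagation. The function name is hypothetical.

```python
def propagate(position, velocity, accel, dt):
    """Advance position and velocity by one IMU interval (constant acceleration)."""
    new_position = tuple(p + v * dt + 0.5 * a * dt * dt
                         for p, v, a in zip(position, velocity, accel))
    new_velocity = tuple(v + a * dt for v, a in zip(velocity, accel))
    return new_position, new_velocity


if __name__ == "__main__":
    position, velocity = (0.0, 0.0), (1.0, 0.0)
    # 100 IMU samples at 100 Hz fill the gap until the next position fix.
    for _ in range(100):
        position, velocity = propagate(position, velocity, (0.0, 0.2), 0.01)
    print(position, velocity)
```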

Future Work

1. Hierarchical Localization Optimization
Incorporating a hierarchical localization approach can further enhance the system’s scalability and efficiency. By using a coarse-to-fine localization strategy, the UAV can quickly narrow down the search area using broader, less detailed information before applying detailed feature matching for precise localization.

It can be implemented as a two-step process: coarse-level localization using a Vector of Locally Aggregated Descriptors (VLAD), followed by fine-level localization using SIFT features. The coarse step quickly narrows down the search area, and detailed feature matching is then applied only within it, improving both accuracy and computational efficiency.
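The two-step idea can be sketched with toy data. Here a short global vector per map tile stands in for a VLAD descriptor, and a set of feature ids stands in for SIFT descriptors, so the "fine" stage is just a set overlap count; both are simplifying assumptions, and every name in the block is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coarse_to_fine(query_global, query_local, tiles, top_k=2):
    """Shortlist tiles by global descriptor, then pick the best local match."""
    shortlist = sorted(
        tiles,
        key=lambda tile: cosine_similarity(query_global, tile["global"]),
        reverse=True,
    )[:top_k]
    # Fine stage runs only on the shortlist, not on the whole database.
    best = max(shortlist, key=lambda tile: len(query_local & tile["local"]))
    return best["name"]


if __name__ == "__main__":
    tiles = [
        {"name": "A", "global": (1.0, 0.0, 0.0), "local": {1, 2, 3}},
        {"name": "B", "global": (0.9, 0.1, 0.0), "local": {3, 4, 5, 6}},
        {"name": "C", "global": (0.0, 0.0, 1.0), "local": {4, 5, 6, 7}},
    ]
    print(coarse_to_fine((1.0, 0.05, 0.0), {4, 5, 6}, tiles))
```

Tile C shares as many local features with the query as tile B, but it never reaches the fine stage because its global descriptor is dissimilar, which is how the coarse step saves computation.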

2. Integration of Approximate Nearest Neighbour (ANN) Search Techniques
Implementing ANN search methods like Product Quantization (PQ) and Inverted File (IVF) indexing can accelerate the matching process between real-time camera frames and the pre-computed map. These techniques reduce the complexity and memory requirements, making real-time processing more feasible. Instead of relying on a stitched map, the database will store visual words and descriptors, minimizing memory requirements.
These ANN methods speed up feature matching by indexing feature vectors with quantization so that they can be retrieved efficiently.
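To show why an inverted file index reduces the search cost, here is a toy sketch: each vector is stored in the cell of its nearest centroid, and a query only scans the cells closest to it. The centroids are hard-coded here as an assumption (they would normally be learned, for example with k-means), PQ compression is omitted, and the class is hypothetical rather than any library’s API.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class TinyIVF:
    """Toy inverted-file index: vectors are bucketed by nearest centroid."""

    def __init__(self, centroids):
        self.centroids = list(centroids)
        self.cells = {i: [] for i in range(len(self.centroids))}

    def _nearest_cells(self, vector, count):
        order = sorted(range(len(self.centroids)),
                       key=lambda i: squared_distance(vector, self.centroids[i]))
        return order[:count]

    def add(self, label, vector):
        cell = self._nearest_cells(vector, 1)[0]
        self.cells[cell].append((label, vector))

    def search(self, query, nprobe=1):
        """Scan only the nprobe closest cells and return the best label."""
        candidates = [item for cell in self._nearest_cells(query, nprobe)
                      for item in self.cells[cell]]
        if not candidates:
            return None
        return min(candidates, key=lambda item: squared_distance(query, item[1]))[0]


if __name__ == "__main__":
    index = TinyIVF([(0.0, 0.0), (10.0, 10.0)])
    for label, vector in [("a", (1.0, 1.0)), ("b", (0.0, 2.0)),
                          ("c", (9.0, 9.0)), ("d", (11.0, 10.0))]:
        index.add(label, vector)
    print(index.search((8.5, 9.0)))
```

Raising `nprobe` trades speed for recall, since a true nearest neighbour can sit just across a cell boundary.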

Reference to the approach can be found in this thesis. Thanks to @radiant_bee for mentioning this thesis.

I want to extend my thanks to my incredible mentors, @rmackay9 and @tridge, for their invaluable guidance. I also want to thank @rhys for his immense support in the gimbal testing and simulation parts.

Related Pull requests and contributions

github.com/ArduPilot/ardupilot_gazebo

Iris: update iris_with_gimbal model to 3d gimbal

ArduPilot:main ← srmainwaring:prs/pr-iris-gimbal-3d

opened 02:51PM - 27 Jun 24 UTC

srmainwaring

+81 -54

Updates https://github.com/ArduPilot/ardupilot_gazebo/pull/86 which was on a `ma…in` branch preventing maintainer edits.

- Update the gimbal model used by the Iris to 3d and add yaw control channel.
- Use the `iris_with_gimbal` model as default in example worlds.
- Reduce position controller PID gains to reduce potential oscillation.
- Update channels to match servo gimbal wiki doc: https://ardupilot.org/copter/docs/common-camera-gimbal.html
- Update servo control direction

| Action | Channel | RC Low | RC High | Min (deg) | Max (deg) |
| --- | --- | --- | --- | --- | --- |
| Roll | RC6 | Roll Left | Roll Right | -30 | +30 |
| Pitch | RC7 | Pitch Dn | Pitch Up | -135 | +45 |
| Yaw | RC8 | Yaw Left | Yaw Right | -160 | 160 |

Copter params

```bash
# Iris is X frame
FRAME_CLASS 1
FRAME_TYPE 1

# Match servo out for motors
MOT_PWM_MIN 1100
MOT_PWM_MAX 1900

# Gimbal on mount 1 has 3 DOF
MNT1_TYPE 1 # Servo
MNT1_PITCH_MAX 45
MNT1_PITCH_MIN -135
MNT1_ROLL_MAX 30
MNT1_ROLL_MIN -30
MNT1_YAW_MAX 160
MNT1_YAW_MIN -160

# Gimbal RC in
RC6_MAX 1900
RC6_MIN 1100
RC6_OPTION 212 # Mount1 Roll
RC7_MAX 1900
RC7_MIN 1100
RC7_OPTION 213 # Mount1 Pitch
RC8_MAX 1900
RC8_MIN 1100
RC8_OPTION 214 # Mount1 Yaw
RC8_REVERSED 0 # Normal
RC9_MAX 1900
RC9_MIN 1100
RC9_OPTION 0 # Do Nothing

# Gimbal servo out
SERVO9_FUNCTION 8 # Mount1Roll
SERVO10_FUNCTION 7 # Mount1Pitch
SERVO11_FUNCTION 6 # Mount1Yaw
```

_Figure: gimbal tracking ROI located at the home position while Iris circles_

![iris-gimbal-3d-roi](https://github.com/ArduPilot/ardupilot_gazebo/assets/24916364/465604fb-a8e1-4db9-aaa2-75d6b4274938)

The `GstCameraPlugin` is also enabled by default. To view the video in QGC edit `Application Settings > Video Settings`. Select `UDP h.264 Video Stream` and use the default port `5600`.

![iris-gimbal-3d-qcg](https://github.com/ArduPilot/ardupilot_gazebo/assets/24916364/3be31d56-460f-4d0a-bc73-e140a92df7fd)

## Testing

Tested on macOS Sonoma 14.5 and Ubuntu 22.04 (VM). Both running ArduPilot `master` with Gazebo Harmonic.

github.com/ArduPilot/companion

non_gps: added feature matching and offset calculation

ArduPilot:master ← snktshrma:master

opened 10:17PM - 12 May 23 UTC

snktshrma

+2234 -0

This is a draft PR, referenced to the [#23471](https://github.com/ArduPilot/ardu…pilot/issues/23471) by @rmackay9. As of now, the PR provides a basic outline for image feature detection and for getting offsets, orientation and zoom. Alongside this, we are able to extract a new lat and long based on the pixel offset. Functions to store the old lat and long, alongside altitude data, are also added. Two test images are added for testing purposes. This is a small part of the whole solution for the issue, and it will be updated frequently under this PR until the feature is complete.

This project is a work in progress, and I am committed to its successful completion, continuously refining and improving the techniques to achieve reliable high-altitude, non-GPS navigation for UAVs.

radiant_bee · August 26, 2024, 5:36am

Hi @snktshrma, do you know at what altitude the video datasets used for Optical flow were captured?
Further, would it be possible to share the dataset?

snktshrma · August 26, 2024, 3:39pm

Hi @radiant_bee! These datasets were captured at an altitude of 75m. I am also testing with a video dataset captured at altitudes above 100m and getting similar results.

Sure! I’ll be making these open-source soon after preliminary testing is completed.

rfriedman · September 11, 2024, 5:28am

Excellent work @snktshrma , there are tons of wonderful applications for this work. Have you tried running ORB and SIFT on two different processes and combining the results?

snktshrma · September 11, 2024, 4:24pm

Hi @rfriedman ! Thanks a lot : )
I have not yet tried running ORB and SIFT simultaneously and combining the results, but I’d love to know more about this approach and will try to implement it on my end.
Is there any benefit to doing this, @rfriedman? What I can think of is that it would give me data at a constant rate from ORB, while the SIFT data could be used to backtrack and correct the errors from ORB, or something like that. Is that so?

rfriedman · September 11, 2024, 8:48pm

Yea exactly. That’s what I had in mind.

snktshrma · September 12, 2024, 3:20am

Cool! I’ll give it a try and update the results
Thanks for the suggestion @rfriedman

JR_C · September 24, 2024, 12:57pm

Thanks for sharing! Sorry, I followed the ap_nongps instructions, and Terminal 2 shows the message “The parameter file (/home/cjr561535/ardupilot_gazebo_ap/config/gazebo-iris-gimbal-ngps.parm) does not exist”. I can’t find ardupilot_gazebo_ap.

snktshrma · September 25, 2024, 2:30am

Hi @JR_C ! Thanks for pointing that out. Here’s the param file: ardupilot_gazebo_ap/config at gsoc-arena · snktshrma/ardupilot_gazebo_ap · GitHub

I’ll update the repo’s README accordingly.

snktshrma · September 25, 2024, 3:55am

Hi everyone,
Here’s an update to the Optical Flow + Feature Matching approach.

In brief, this method combines an enhanced version of the Lucas-Kanade (LK) optical flow algorithm for high-frequency position estimation with SIFT feature matching. The LK algorithm estimates the movement between consecutive frames, but it drifts over time, especially in long sequences. To correct this drift, we use SIFT-based feature matching along with a simplified ANN algorithm.

This approach performs very well, even when noise is introduced, as the feature matching corrects the drift over time and keeps it within a manageable range.

However, there are still some limitations to this approach. The current method struggles with handling fast rotations, as the LK algorithm’s performance degrades during fast rotational movements. Also, in cases of significant perspective changes, the drift correction through feature matching is not very effective.

I’ll be releasing the code in the ap_nongps repository as soon as these issues are addressed.

I’d love to hear feedback on this approach, and I’d be happy to discuss any questions or suggestions!

Thank you!

JR_C · September 25, 2024, 1:01pm

Do you have any idea about this problem? It happens when I follow the instructions for Terminal 1.

JR_C · September 26, 2024, 2:37am

I have solved the problem above: I hadn’t installed GStreamer first.
But now when I run the video_to_feature.py file, it only shows the first row, Heartbeat from …

I don’t know how to solve this.

snktshrma · September 26, 2024, 10:41am

Hi @JR_C ! Sorry that you are facing problems setting it up!

It seems likely that the problem arises because the script is unable to access the camera frames. Could you please double-check if everything is set up correctly?

Thank you for pointing out these issues! I’ll add these troubleshooting steps to the main repo as well!

I’ll work on creating a bash script to automate the setup process!

JR_C · September 27, 2024, 9:20am

Man! I found that video_to_feature.py gets stuck here:
getComp() → check_takeoff_complete(10) → msg = the_connection.recv_match(type='GLOBAL_POSITION_INT', blocking=True)
Digging into recv_match, I found that the incoming msg’s type is LOCAL_POSITION_NED, so the call never returns.
Now I don’t know what to do. Do you have any advice?
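One possible way to avoid blocking forever on a single message type is to accept either position message and to use a timeout. The sketch below is an assumption about how such a check could look, not the script’s actual code: `wait_for_altitude` and the 95% threshold are hypothetical, while `recv_match(type=[...], blocking=True, timeout=...)` and `get_type()` are the pymavlink calls it relies on. The connection is passed in, so the block itself does not import pymavlink.

```python
import time


def wait_for_altitude(connection, target_alt_m, timeout_s=30.0):
    """Return True once the vehicle reports roughly target_alt_m above home."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        msg = connection.recv_match(
            type=["GLOBAL_POSITION_INT", "LOCAL_POSITION_NED"],
            blocking=True,
            timeout=1.0,  # never wait forever on a stream that is not being sent
        )
        if msg is None:
            continue
        if msg.get_type() == "GLOBAL_POSITION_INT":
            altitude_m = msg.relative_alt / 1000.0  # millimetres above home
        else:
            altitude_m = -msg.z  # NED frame: z is positive down, in metres
        if altitude_m >= 0.95 * target_alt_m:  # hypothetical tolerance
            return True
    return False
```

With a real vehicle this would be called as `wait_for_altitude(the_connection, 10)`, where `the_connection` comes from `mavutil.mavlink_connection(...)`.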

snktshrma · September 29, 2024, 4:17pm

Hi @JR_C ! Thanks for pointing out! I am looking into it!

soldierofhell · September 30, 2024, 7:49am

Hi, I wonder why you haven’t started with some VIO framework. Matching with satellite images seems like a problem similar to loop-closure detection. BTW EKF3 is written in SymForce, which is also capable of being a VIO backend. How about having the frontend accelerated by the camera (something OAK-like)? That would be fun!

JR_C · September 30, 2024, 9:05am

Hi, I solved the problem for now: I changed the position message type from “GLOBAL_POSITION_INT” to “LOCAL_POSITION_NED” and then everything worked out just fine.
Now I want to use my school’s satellite image. I searched the Internet for how to create a ground-plane satellite image with Gazebo, but the Gazebo versions used there are too old (I use Harmonic) and there is no documentation for this. Do you have any advice? How can I create a ground plane model?

snktshrma · October 4, 2024, 7:38pm

Hi @soldierofhell, thanks for the comment!

The VIO framework is a great idea, and I’ve been exploring optical flow approaches, which are somewhat similar. The main difference is that optical flow requires a lot of off-board processing and relies on ArduPilot’s EKF to fuse IMU and optical flow data (as you mentioned with SymForce and the VIO backend). Using cameras like OAK could definitely simplify this process and make real-time odometry estimation more feasible while letting the EKF handle the rest.

My understanding of the current optical flow approach as a form of VIO might not be entirely correct, and I’d be happy to discuss it further.

This project initially started as an alternative RTL approach for when the drone loses GPS mid-flight and needs to return home. That’s why I began with feature matching for global localization to help guide the UAV back to the home location.

As for loop closure and SLAM on a global scale, I’ve been working on this based on an existing research paper. The challenge is achieving real-time performance. Backtracking satellite images to camera frames for global localization is still feasible with some processing delay, but real-time drift-compensated pose estimation without GPS will take more development and research.

I’d love to hear more of your thoughts and ideas on this.

Thank you.

snktshrma · October 4, 2024, 7:45pm

Hi @JR_C ! Good to know it worked.

As a workaround for your requirement, I suggest you use USGS EarthExplorer (https://earthexplorer.usgs.gov/) to get a high-quality satellite image of your school area, then modify the arena model I provided by replacing the satellite image there with your school’s.

You’ll find that package here: ardupilot_gazebo_ap/models/arena/meshes/satellite_image.png at gsoc-arena · snktshrma/ardupilot_gazebo_ap · GitHub

Let me know if that works. Either way, let me know the coordinates and I’ll get you the arena model for your school area.

JR_C · October 8, 2024, 2:31am

OK, it works, thanks a lot