Hello ArduPilot family!
I’m absolutely thrilled to announce that my proposal, “Non-GPS Position Estimation Using 3D Camera and Pre-Generated Map,” has been accepted by ArduPilot for Google Summer of Code 2025!
Why this project matters
• Heavy-sensor stacks (LiDAR, RTK) deliver top-tier accuracy but weigh drones down, stealing flight time and agility.
• Lightweight UAVs fly longer yet usually sacrifice precision.
I’m building a middle road: tiny drones with a simple 3D camera will reuse detailed LiDAR maps created by bigger rigs, achieve sub-centimetre localisation, and optionally refine those maps as they fly.
One of many possible applications
Picture warehouse storage and retrieval. GPS is useless, markers get hidden, and lighting is patchy. A fleet of nimble drones will zip through aisles without bulky sensors and never lose their place.
Timeline
- June 2nd – June 30th: Simulation
- Week 1 (June 2–8):
- Environment & toolchain setup
- Add a 3D camera to the existing ArduPilot ROS 2 simulation
- Complete a bare-bones initial simulation setup
- Week 2 (June 9–15):
- Add a few more relevant worlds in the simulation for testing
- Develop a point-cloud preprocessing node
- Finalize a SLAM algorithm to be used for the simulation setup (please share your suggestions for LiDAR-inertial SLAM algorithms in the comments)
- Start building the initial skeleton for the framework that takes a LiDAR point cloud as input and outputs a 6-DOF pose in the desired frames and at the desired rates
- Week 3 (June 16–22):
- Use the SLAM algorithm to generate 3D maps that we can use to test the setup
- Try different scan registration approaches and benchmark them for frequency, accuracy, and resource requirements
- Week 4 (June 23–30):
- Start building the state machine responsible for taking a 6-DOF pose from the registration application and sending it to ArduPilot (a minimal sketch of this bridge follows the timeline)
- Evaluate performance metrics and debug critical issues
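To make the hand-off to ArduPilot concrete, here is a minimal sketch of the bridge the Week 4 state machine could be built around. It assumes the registration node already produces a 4×4 pose matrix in a NED-aligned map frame; the connection string, rate, and fitness threshold are placeholder values, and pymavlink’s VISION_POSITION_ESTIMATE message is used here as one standard way to feed external navigation data to ArduPilot.

```python
# Minimal sketch: forward externally estimated 6-DOF poses to ArduPilot.
# Assumptions: the pose comes from the scan-registration node as a 4x4 matrix
# in a NED-aligned map frame; endpoint, rate, and threshold are placeholders.
import time
import numpy as np
from pymavlink import mavutil
from scipy.spatial.transform import Rotation

master = mavutil.mavlink_connection("udpin:0.0.0.0:14550")  # placeholder endpoint
master.wait_heartbeat()

MIN_FITNESS = 0.6  # hypothetical threshold for trusting a registration result

def send_vision_pose(T, fitness):
    """Send one VISION_POSITION_ESTIMATE if the registration looks trustworthy."""
    if fitness < MIN_FITNESS:
        return False  # stay in a "lost" state; do not feed a bad pose to the EKF
    x, y, z = T[:3, 3]
    roll, pitch, yaw = Rotation.from_matrix(T[:3, :3]).as_euler("xyz")  # radians
    usec = int(time.time() * 1e6)
    master.mav.vision_position_estimate_send(usec, x, y, z, roll, pitch, yaw)
    return True

# Example usage with a dummy identity pose at ~30 Hz:
while True:
    send_vision_pose(np.eye(4), fitness=1.0)
    time.sleep(1.0 / 30.0)
```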
Questions
- What is the advantage of this approach over using approaches like VINS Mono or VINS fusion?
Visual SLAM is undoubtedly a good option when tackling a completely unknown region with no prior map or information. When a prior 3D map is available, however, this approach will generally outperform a visual SLAM framework for a few fundamental reasons (a minimal registration sketch follows this list):
- Map-based localization treats the LiDAR map as ground truth: we only solve for the 6-DOF pose of the drone by registering its 3D-camera point cloud against that fixed map, so there is virtually zero long-term drift.
- Works well in textureless environments (LiDAR doesn’t rely on visual features).
- Scalability: a single map can be reused by any number of drones for localization
- No scale drift issues that plague monocular SLAM
- Almost always, VIO or VSLAM will need more resources than a scan registration approach
- Sub-cm accuracy without reliance on perfect calibration
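To show what “registering the 3D-camera point cloud against that fixed map” can look like in practice, here is a minimal sketch using Open3D’s point-to-plane ICP, one of the scan registration approaches I plan to benchmark. The map file name, voxel size, and correspondence distance are placeholders, and in the real pipeline the initial guess would come from the previous pose rather than the identity matrix.

```python
# Minimal sketch: localize a live 3D-camera scan against a prior LiDAR map.
# Assumptions: "warehouse_map.pcd" is a hypothetical pre-generated map file,
# and the voxel/distance parameters are illustrative, not tuned values.
import numpy as np
import open3d as o3d

prior_map = o3d.io.read_point_cloud("warehouse_map.pcd")
prior_map.estimate_normals(
    o3d.geometry.KDTreeSearchParamHybrid(radius=0.5, max_nn=30))

def localize(scan, init_pose=np.eye(4), voxel=0.05, max_corr_dist=0.5):
    """Return (6-DOF pose as a 4x4 matrix, fitness) for one camera scan."""
    scan_ds = scan.voxel_down_sample(voxel)  # preprocessing step from Week 2
    result = o3d.pipelines.registration.registration_icp(
        scan_ds, prior_map, max_corr_dist, init_pose,
        o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return result.transformation, result.fitness
```

Because every drone solves only this registration problem against the same shared map, adding more drones adds no mapping cost, which is the scalability point above.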
I will keep this post up to date with any changes to the timeline and make a new post with details for part 2. Apart from the above-mentioned tasks, I will focus on rigorous documentation of each step so the results are reproducible.
Enormous thanks to @rmackay9 Ryan Friedman, @snktshrma Tejal Barnwal, and the entire ArduPilot team, along with Google Summer of Code, for believing in this idea and offering constant support. A special shout-out to Prof. Guoquan (Paul) Huang for generously agreeing to mentor me throughout the journey. I’m thrilled to give back to the open-source community that has taught me so much.
Let’s collaborate
If you work on visual SLAM, aerial robotics, warehouse automation, or lightweight autonomy, I’d love to trade ideas, learn from your challenges, and maybe team up.
Here’s to a summer filled with code, maps, and plenty of test flights!