Hi everyone! It’s me, Asif Khan. I’m really happy to share with you the final work I’ve done during Google Summer of Code 2024 for the Visual Follow Me project.
First and foremost, I want to express my gratitude to my mentors, @rmackay9 and @peterbarker. They’ve been incredibly supportive and helpful throughout this project, and without their guidance, my work wouldn’t be where it is today.
Description:
This project aims to add support in ArduPilot for tracking an object in the camera frame and then following it. Making the gimbal camera follow the object is the first step, and making the drone follow it is the second. The ability to accurately follow and capture moving subjects opens up new opportunities for advanced applications in search and rescue missions, wildlife monitoring, and autonomous delivery systems. Enhanced object tracking capabilities can also improve security and traffic management, providing real-time data and insights.
There are three major parts to this project:
- The Algorithm to track things
- A message pipeline to get things working
- Pitch and Yaw Rate Controller
Let us first talk about the algorithm
Algorithms I tried and explored during the GSoC period:
- YOLO based object tracking with minimum difference of image vectors
- KCF (Kernelized Correlation Filters)
- CSRT (Channel and Spatial Reliability Tracking)
YOLO
This was the first approach I came up with. YOLO (You Only Look Once) is a real-time object detection algorithm that detects and classifies multiple objects in an image with a single pass through a neural network. It’s known for its speed and accuracy, making it popular for applications requiring fast object detection, such as autonomous vehicles and video surveillance.
Idea is here:
Flaws:
- It is computationally heavy to take differences of NumPy arrays on every frame.
- It cannot detect special patterns that are not among the object classes the model was trained on.
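As a rough sketch of the "minimum difference of image vectors" step: each YOLO detection is reduced to a feature vector, and the detection whose vector is closest to that of the originally selected target is kept. The function names, the plain Python lists standing in for NumPy arrays, and the sum-of-squared-differences metric are all assumptions made for illustration, not the exact code used in the project.

```python
def squared_difference(a, b):
    """Sum of squared element-wise differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((p - q) ** 2 for p, q in zip(a, b))


def pick_closest_detection(reference, detections):
    """Return the bounding box whose feature vector differs least from `reference`.

    `detections` is a list of (bbox, feature_vector) pairs, where bbox is
    (x, y, width, height) in pixels. Returns None when nothing was detected.
    """
    if not detections:
        return None
    best_bbox, _ = min(detections, key=lambda d: squared_difference(reference, d[1]))
    return best_bbox


if __name__ == "__main__":
    # Hypothetical feature vectors; in practice these would come from the image crops.
    target_vector = [0.9, 0.1, 0.4]
    frame_detections = [
        ((10, 10, 50, 50), [0.1, 0.8, 0.3]),
        ((200, 120, 60, 40), [0.85, 0.15, 0.45]),
    ]
    print(pick_closest_detection(target_vector, frame_detections))
```

Because the comparison runs against every detection on every frame, the cost grows with both the vector size and the number of detections, which is the first flaw listed above.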
KCF
KCF (Kernelized Correlation Filters) is an object tracking algorithm that uses correlation filters with multi-channel features, leveraging properties of the circulant matrix to make the computation very fast in the Fourier domain. It’s efficient for scenarios with minimal changes in scale or rotation and works well in real-time applications due to its speed. However, KCF is sensitive to occlusions, non-rigid deformations, and scale variations, which can cause it to lose the target.
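To give a feel for the Fourier-domain idea, here is a toy 1D sketch. It is not KCF itself, which works on 2D multi-channel features and trains a kernelized regressor. It only shows that scoring a template against every cyclic shift of a signal (the circulant structure) can be done with one element-wise product in the Fourier domain. The naive DFT below is written out to keep the example dependency-free; a real implementation would use an FFT, which is where the speed comes from.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), for illustration only."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def correlation_direct(template, signal):
    """Score every cyclic shift by explicitly summing products."""
    n = len(signal)
    return [sum(template[t] * signal[(t + shift) % n] for t in range(n))
            for shift in range(n)]


def correlation_fourier(template, signal):
    """Same scores via conj(DFT(template)) * DFT(signal), then an inverse DFT."""
    t_hat = dft(template)
    s_hat = dft(signal)
    product = [a.conjugate() * b for a, b in zip(t_hat, s_hat)]
    return [v.real for v in dft(product, inverse=True)]


def best_shift(scores):
    """Index of the highest correlation score."""
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    template = [0, 1, 3, 1, 0, 0, 0, 0]
    signal = template[-3:] + template[:-3]  # the same pattern moved 3 samples right
    print(best_shift(correlation_fourier(template, signal)))
```

Both functions produce the same scores; the peak of the response gives the displacement of the target between frames.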
Pros:
- Speed: Very fast due to efficient computation in the Fourier domain.
- Real-time Performance: Suitable for real-time tracking due to its low computational cost.
- Simple Implementation: Relatively easy to implement and use in practice.
Cons:
- Sensitivity to Scale and Rotation: Struggles with significant scale changes and rotation.
- Handling Occlusions: Less robust to occlusions or when the object is partially hidden.
- Limited Feature Set: Uses basic features which can limit its performance in complex scenarios.
While testing KCF, I saw that it sometimes could not keep up with the tracked object when the drone made slightly faster maneuvers, so I decided to give CSRT a try.
CSRT
CSRT (Discriminative Correlation Filter with Channel and Spatial Reliability Tracking) is an advanced object tracking algorithm that improves upon the basic correlation filter approach by adding spatial reliability, which allows it to better handle scale changes and non-rigid objects. It is more robust to occlusions and deformations than KCF and offers better accuracy in such challenging conditions. However, this comes at the cost of being slower than KCF, making it less suitable for real-time applications where speed is crucial.
Pros:
- Robustness to Occlusions: Better at handling occlusions and partially obscured objects.
- Improved Accuracy: More accurate in tracking objects with scale variations and non-rigid deformations.
- Spatial Reliability: Utilizes spatial reliability to improve tracking performance in challenging conditions.
Cons:
- Speed: Slower compared to KCF due to more complex computations and additional checks.
- Higher Computational Cost: Requires more processing power, which might be a drawback in real-time applications.
- Implementation Complexity: More complex to implement compared to KCF, requiring careful tuning of parameters.
Although CSRT was a little slow in my tests, I saw that it does really well during fast maneuvers of both the drone and the object.
In the end I decided to use CSRT for this purpose, but I am still open to suggestions for other algorithms to try. Thanks in advance.
2. A Message Pipeline to get things working
There are two types of messages: the first is track rectangle and the second is track point. Both need to be conveyed to our supercool external tracking system so that it knows what needs to be tracked. To do this, some new parameters in ArduPilot help forward those messages to the companion computer:
CAMx_TRK_ENABLE - enables or disables the tracking.
CAMx_TRK_SYSID - The MAVLink system ID of the external tracking system.
CAMx_TRK_COMPID - The MAVLink component ID of the external tracking system.
The firmware now compares the IDs defined by these parameters with the IDs it has seen in the heartbeats registered from MAVLink devices. If for some reason they do not match, it discards the requests.
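The matching rule described above can be sketched roughly as follows. This is an illustrative Python model only, not the ArduPilot firmware code (which is C++); the `TrackingParams` class, the `should_forward` function and the example IDs are hypothetical names and values chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingParams:
    """Stand-ins for CAMx_TRK_ENABLE, CAMx_TRK_SYSID and CAMx_TRK_COMPID."""
    enable: bool
    sysid: int
    compid: int


def should_forward(params, seen_heartbeats):
    """Decide whether a tracking request is forwarded to the external tracker.

    `seen_heartbeats` is a set of (system_id, component_id) pairs collected
    from the heartbeats of MAVLink devices. The request is discarded when
    tracking is disabled or the configured IDs were never seen.
    """
    if not params.enable:
        return False
    return (params.sysid, params.compid) in seen_heartbeats


if __name__ == "__main__":
    # Example IDs are made up for illustration.
    heartbeats = {(1, 191), (255, 190)}
    print(should_forward(TrackingParams(True, 1, 191), heartbeats))
```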
The link here is actually a network switch similar to this BotBlox Nano
3. Pitch and Yaw Rate Controller
Now that we have all the messages and our algorithm, we can use that data to command the gimbal to move. The magic here is that we don’t need any fancy gimbal camera for this; even servo gimbals are supported. I used the RATE commands to control the gimbal. The simple workflow is shown in this diagram
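A minimal sketch of how a tracked bounding box could be turned into pitch and yaw rates is given here, assuming a simple proportional controller. The gain, field-of-view values, deadband, rate limit and sign conventions (positive yaw turns right, positive pitch tilts up) are all assumptions for illustration and are not taken from the actual controller in the pull requests.

```python
def clamp(value, limit):
    """Limit value to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def gimbal_rates_from_bbox(bbox, frame_size, gain=1.0, hfov_deg=80.0,
                           vfov_deg=50.0, deadband=0.02, max_rate_dps=30.0):
    """Return (pitch_rate, yaw_rate) in deg/s that move the target toward the image centre.

    bbox is (x, y, width, height) in pixels with the origin at the top-left;
    frame_size is (width, height) in pixels.
    """
    x, y, w, h = bbox
    frame_w, frame_h = frame_size

    # Offset of the box centre from the image centre, as a fraction of the frame.
    err_x = (x + w / 2.0 - frame_w / 2.0) / frame_w
    err_y = (y + h / 2.0 - frame_h / 2.0) / frame_h

    # Ignore tiny errors so the gimbal does not jitter around the centre.
    if abs(err_x) < deadband:
        err_x = 0.0
    if abs(err_y) < deadband:
        err_y = 0.0

    # Scale by the field of view to get an approximate angular error, then
    # apply the proportional gain. Image y grows downward, so a target below
    # the centre needs a negative (downward) pitch rate.
    yaw_rate = clamp(gain * err_x * hfov_deg, max_rate_dps)
    pitch_rate = clamp(-gain * err_y * vfov_deg, max_rate_dps)
    return pitch_rate, yaw_rate


if __name__ == "__main__":
    print(gimbal_rates_from_bbox((300, 300, 40, 40), (640, 480)))
```

The returned values would then be sent to the gimbal as rate commands; that part is omitted here. Working with rates rather than absolute angles means the loop needs no knowledge of the gimbal’s current attitude.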
Pull requests related to this work are here
TESTING
Everything is readily available to test in simulation. Hardware testing will take a little more time, and I will update the progress here.
To test the PR with Gazebo, take a look at the example scripts and the README here
Here is the testing video for the simulation
I am really grateful to the partners who donated hardware for this project. I am currently building it, and it is 70% complete. These are the things I received:
- NVIDIA Jetson baseboard and X650 frame kit
- Herelink v1.1 air unit and ground unit
- Tattu 10000 mAh battery
- SIYI A8
Here is the funding proposal
Thanks for reading this far. I would like to receive more suggestions and insights, so if there is something you would like to add, please let me know by sharing your feedback.
Thank you!