GSoC24: Visual Follow-me using AI

Introduction
Hello, I am Asif Khan from Jaipur, India. I am extremely delighted to share that I have been selected as a contributor for GSoC 2024 with ArduPilot, with @rmackay9 and @peterbarker as my mentors. I will be working on the Visual Follow-me using AI project this year. In this blog I will discuss my project in detail.

Scope of Object Tracking
While drones are increasingly utilized for dynamic tasks such as sports filming, event coverage, and surveillance, there is great scope in the field of object tracking using a camera. The ability to accurately follow and capture moving subjects opens up new opportunities for advanced applications in search and rescue missions, wildlife monitoring, and autonomous delivery systems. Enhanced object tracking capabilities can also improve security and traffic management, providing real-time data and insights.

Problem Statement
This project is divided into two main parts:

1. Adding Visual Follow-me support for all ArduPilot camera gimbals

The gimbal will follow the target. The process is as follows:

  1. The GCS will send a MAV_CMD_CAMERA_TRACK_POINT or MAV_CMD_CAMERA_TRACK_RECTANGLE command to the FC.
  2. The FC will pass the message through a set of libraries and forward the request to the companion computer.
  3. The companion computer is responsible for tracking the point and sending the roll, pitch, and yaw commands. A separate controller script will handle the control algorithm.
  4. The resulting commands will be fed to the FC as MAVLink messages that control pitch and yaw, trying to keep the object at the image center. See the GIMBAL_MANAGER_SET_PITCHYAW message definition; a minimal sketch follows this list.
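As a rough illustration of step 4, here is a minimal pymavlink sketch of a companion computer sending gimbal rate commands. The connection string, flags, and rate values are assumptions for illustration, not final design choices:

```python
# Minimal sketch: companion computer commands gimbal pitch/yaw rates.
# Assumes pymavlink and a MAVLink link to the FC (connection string is a placeholder).
import math
from pymavlink import mavutil

master = mavutil.mavlink_connection('udpin:0.0.0.0:14550')
master.wait_heartbeat()

def send_gimbal_rates(pitch_rate, yaw_rate):
    """Send GIMBAL_MANAGER_SET_PITCHYAW with the angle fields set to NaN
    so the gimbal manager uses only the rate fields (rad/s)."""
    master.mav.gimbal_manager_set_pitchyaw_send(
        master.target_system,
        master.target_component,
        0,           # flags
        0,           # gimbal_device_id: 0 selects the primary gimbal
        math.nan,    # pitch angle: NaN = ignored
        math.nan,    # yaw angle: NaN = ignored
        pitch_rate,  # pitch rate (rad/s)
        yaw_rate)    # yaw rate (rad/s)

send_gimbal_rates(0.0, 0.2)  # e.g. yaw slowly to the right
```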

2. Making the vehicle follow the object
In this part, the drone will follow the target object. ArduPilot already supports calculating the latitude, longitude, and altitude of whatever the camera gimbal is pointing at. Using that, we can easily get the lat, lon, and alt of the target, and feeding that information to Follow mode will make the drone follow the point.
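As a sketch of this idea (assuming the gimbal-pointing calculation has already produced a lat/lon/alt, and that the vehicle's FOLL_SYSID matches this sender's system id), the target location could be forwarded to Follow mode with a FOLLOW_TARGET message:

```python
# Sketch: forward the estimated target location to Follow mode via FOLLOW_TARGET.
# Assumes a pymavlink link to the vehicle; connection string is a placeholder.
import time
from pymavlink import mavutil

master = mavutil.mavlink_connection('udpin:0.0.0.0:14550')
master.wait_heartbeat()

def send_target_position(lat_deg, lon_deg, alt_m):
    master.mav.follow_target_send(
        int(time.time() * 1000),  # timestamp (ms)
        0,                        # est_capabilities: position only
        int(lat_deg * 1e7),       # latitude (degE7)
        int(lon_deg * 1e7),       # longitude (degE7)
        alt_m,                    # altitude (m)
        [0.0, 0.0, 0.0],          # velocity (unused in this sketch)
        [0.0, 0.0, 0.0],          # acceleration (unused)
        [1.0, 0.0, 0.0, 0.0],     # attitude quaternion (unused)
        [0.0, 0.0, 0.0],          # body rates (unused)
        [0.0, 0.0, 0.0],          # position covariance (unused)
        0)                        # custom_state

send_target_position(26.9124, 75.7873, 450.0)  # example coordinates
```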

Here is an ideal message flow diagram which may be helpful for understanding the flow (this may change a bit over time).

On the companion side, the object tracking algorithm will work with a PID controller calculating the pitch and yaw rates. For object tracking, YOLOv8 or OpenCV can be used; both are compatible with development boards that support Linux, like the NVIDIA Jetson series, Raspberry Pi, etc. The basic controller that I proposed will work like this:

dX is the error in the X axis and dY is the error in the Y axis. These will be the inputs to a separate controller dedicated to tracking: dX will be the input for yaw control and dY will be the input for pitch control.
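A minimal sketch of that controller, assuming dX and dY are pixel errors from the image center; the gains, limits, and sign conventions are placeholders that would need tuning on real hardware:

```python
# Sketch: PID controller turning pixel errors (dX, dY) into yaw/pitch rates.
# Gains and limits are illustrative placeholders, not tuned values.
class PID:
    def __init__(self, kp, ki, kd, limit):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.limit = limit          # output clamp (rad/s)
        self.integral = 0.0
        self.prev_err = 0.0

    def update(self, err, dt):
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt if dt > 0 else 0.0
        self.prev_err = err
        out = self.kp * err + self.ki * self.integral + self.kd * deriv
        return max(-self.limit, min(self.limit, out))

yaw_pid = PID(kp=0.005, ki=0.0, kd=0.001, limit=1.0)
pitch_pid = PID(kp=0.005, ki=0.0, kd=0.001, limit=1.0)

def track_step(obj_x, obj_y, img_w, img_h, dt):
    """Return (pitch_rate, yaw_rate) that drive the object toward the image center."""
    dx = obj_x - img_w / 2   # +ve: object right of center -> yaw right
    dy = obj_y - img_h / 2   # +ve: object below center -> pitch down
    return pitch_pid.update(-dy, dt), yaw_pid.update(dx, dt)
```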

What if the object goes outside the image frame?
If that happens, the last known object coordinates will be used to move the gimbal, but only within a fixed time window. If the object is found again, tracking will continue; if it is not found within the search limit, the gimbal will stop and hold at the last location.
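A small sketch of that lost-target behaviour; the timeout value and data shapes here are assumptions:

```python
# Sketch: keep steering toward the last known coordinates for a fixed
# search window, then freeze the gimbal. SEARCH_TIMEOUT is illustrative.
SEARCH_TIMEOUT = 3.0  # seconds to keep searching before stopping

last_seen = None      # (timestamp, dx, dy) of the last detection

def handle_detection(detection, now):
    """detection is (dx, dy) pixel errors, or None if the object was lost.
    Returns the errors to feed the controller, or None to hold position."""
    global last_seen
    if detection is not None:
        last_seen = (now, detection[0], detection[1])
        return detection                   # normal tracking
    if last_seen and now - last_seen[0] < SEARCH_TIMEOUT:
        return last_seen[1], last_seen[2]  # steer toward last known position
    return None                            # stop: hold gimbal at last location
```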

Well, thanks for reading the blog this far. I would love to receive more suggestions and insights; if you want something added, let me know by providing your valuable feedback.

Thank you!


It’s great to read about the GSoC24 Visual Follow-me using AI project.

I have a couple of questions about this (hope to hear from you):

  1. What GCS are you using for part one? Does the latest QGC 4.4.0 support target selection?
  2. How does the drone plan to follow the target: in the X-Y plane, keeping the height at the same level?
  3. Does a Raspberry Pi 3B+ suffice for the job (following the target)?
  4. What hardware configuration are you using for development (I have a Pi 3B+ copter)?
  5. Is there any simulation environment setup guide for testing?

EDIT: Here is the link for the AI module from SIYI: SIYI AI Tracking Module 4T Computing Power Human Vehicle Multi-Target Recognition Anti-Lost


Hello Daniel,
Thanks for reading. This is just an initial post, so I am still determining a lot of the components and approaches; the answers here may change with time. Let me try to answer one by one:

  1. The feature is in master; I will have to check whether it has been released.
  2. Follow mode does have some functionality to control the altitude. In the X-Y plane it will follow a target using just the passed lat-lon, but with an offset defined by FOLL_OFS_X, FOLL_OFS_Y, and FOLL_OFS_Z; you can have a look here. A small parameter sketch follows these answers.
  3. I have to check the performance on the given hardware; however, I have planned to use an NVIDIA Orin Nano. I will definitely check the performance of the model on different boards; I will have to borrow some Pi boards for it.
  4. I am using an Orin Nano.
  5. Yes; as I have discussed with @rhys, we can easily set things up in simulation using Gazebo Harmonic. I will update here once I have an initial patch in place.
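For answer 2, here is a tiny pymavlink sketch of setting a follow offset; the value is a placeholder, and the frame it is interpreted in depends on FOLL_OFS_TYPE:

```python
# Sketch: set a follow offset (e.g. 5 m behind the target) via pymavlink.
from pymavlink import mavutil

master = mavutil.mavlink_connection('udpin:0.0.0.0:14550')
master.wait_heartbeat()
master.param_set_send('FOLL_OFS_X', -5.0,
                      mavutil.mavlink.MAV_PARAM_TYPE_REAL32)
```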

I think SIYI QGC has this function, but I’m not sure if the code is on GitHub.

Please evaluate those Pi boards if you have the time. I hope the Pi 3B can do the job.

Yes, please give an update.

And thanks for your quick response. Hope to see Visual Follow-me in action!


I have been using an AMB82-mini for object tracking on a rover; it has a built-in NPU, so it doesn’t need a companion computer.


Where can I find this?

Hi @geofrancis,

Sorry for the slow reply, I didn’t see your question.

We’ve got the calculation in two places.


I have installed QGC 4.4.0, and I didn’t see any icons related to object selection. Any ideas?

I think AI tracking is somewhat different from FollowMe mode. It might work something like the sequence below (I’m not an expert on this, I’m just learning; if anything is wrong, please let me know). A rough command sketch follows the list:

  1. GCS selects the object detection area // MAV_CMD_CAMERA_TRACK_POINT / MAV_CMD_CAMERA_TRACK_RECTANGLE
  2. GCS sends the selected area through MAVLink to the FC
  3. FC transmits the packet to the AI module
  4. AI module detects the object and gives feedback
    4.1 AI module feeds the result back to the FC
    4.2 FC transmits the packet to the GCS
  5. AI module adds an OSD box indicating the object is detected, which is shown on the GCS
  6. AI module executes the algorithm to pitch/roll/yaw, tracking the moving object
  7. If there is an abnormal situation, such as the object going missing, stop tracking // MAV_CMD_CAMERA_STOP_TRACKING
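For what it’s worth, the GCS-side commands in steps 1-2 and 7 might look like this with pymavlink (point coordinates are normalised 0..1 per the MAV_CMD_CAMERA_TRACK_POINT definition; the connection string and radius are placeholders):

```python
# Sketch: GCS-side tracking commands for the sequence above.
from pymavlink import mavutil

gcs = mavutil.mavlink_connection('udpin:0.0.0.0:14550')
gcs.wait_heartbeat()

def track_point(x, y, radius=0.05):
    """Ask the camera to track the point at normalised image coords (x, y)."""
    gcs.mav.command_long_send(
        gcs.target_system, gcs.target_component,
        mavutil.mavlink.MAV_CMD_CAMERA_TRACK_POINT, 0,
        x, y, radius, 0, 0, 0, 0)

def stop_tracking():
    gcs.mav.command_long_send(
        gcs.target_system, gcs.target_component,
        mavutil.mavlink.MAV_CMD_CAMERA_STOP_TRACKING, 0,
        0, 0, 0, 0, 0, 0, 0)

track_point(0.5, 0.5)  # track whatever is at the image center
```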

OK, the Orin has 40 TOPS, which is better than the SIYI AI module’s 10 TOPS.

Any updates? I hope to set up an environment too, but … … If you have this setup, please let me know, thanks.

Hello! Great article!
I would like to contact you; we have a project for autonomous fire extinguishing from drones.
You could help the project; the problems you describe in your article are exactly the ones I want to solve.
You can write to me by email: kavaneural@gmail.com

This is a public forum; if you want help, then ask publicly. No one is going to work just for you for free.


Of course not for free; I didn’t even think about that.
The task is the same as the one described by the topic starter.
But the drone needs to be automated when extinguishing a fire. For example, it may lose contact when heated and must climb automatically or move away.

@Kot_Lessay I think the autonomous fire extinguishing project you are doing now is more complex than the GSoC24 Visual Follow-me using AI project. Would it be possible to create a new topic for further discussion if you need those experts to help?

There are quite a lot of scenarios which need to be considered:

  1. What type of fire extinguishing equipment should be carried?
  2. In what scenarios (oil, gas, etc.) is it suitable for fire extinguishing?
  3. How efficient is the fire extinguishing?
  4. What is the heat resistance capacity of the entire equipment?
  5. During flight, when there is smoke, what is the effectiveness of visual cameras, infrared cameras, and radar?
  6. How long can the equipment operate while carrying it?
  7. What about fire rescue in environments like narrow spaces in buildings, and issues with dynamic collisions with escaping personnel?
  8. Automatic safe flight route planning?

These are just very preliminary considerations. It is necessary to systematically evaluate and analyze all possible scenarios and parameters before making decisions on product specifications. Extensive evaluation experiments may also be needed (for example, the accuracy of visual, infrared, and radar non-GPS positioning in fire/smoke, and dynamic obstacle avoidance assessment).