GSoC24: Visual Follow-me using AI

Introduction
Hello, I am Asif Khan from Jaipur, India. I am extremely delighted to share that I have been selected as a contributor for GSoC 2024 with ArduPilot, with @rmackay9 and @peterbarker as my mentors. I will be working on the Visual Follow-me using AI project this year. In this blog I will discuss my project in detail.

Scope of Object Tracking
While drones are increasingly utilized for dynamic tasks such as sports filming, event coverage, and surveillance, there is great scope in the field of object tracking using a camera. The ability to accurately follow and capture moving subjects opens up new opportunities for advanced applications in search and rescue missions, wildlife monitoring, and autonomous delivery systems. Enhanced object tracking capabilities can also improve security and traffic management, providing real-time data and insights.

Problem Statement
This project is divided into two main parts:

1. Adding Visual Follow-me support for all ArduPilot camera gimbals

The gimbal will follow the target. The process is as follows:

  1. The GCS will send a MAV_CMD_CAMERA_TRACK_POINT or MAV_CMD_CAMERA_TRACK_RECTANGLE command to the FC.
  2. The FC will pass the message through a set of libraries and forward the request to the companion computer.
  3. The companion computer is responsible for tracking the point and sending the roll, pitch, and yaw commands. A separate controller script will handle the control algorithm.
  4. The resulting commands will be fed to the FC as MAVLink messages that control pitch and yaw, trying to keep the object at the image center. See the GIMBAL_MANAGER_SET_PITCHYAW message definition; a minimal sketch follows this list.
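As a rough illustration of step 4, here is a minimal pymavlink sketch of a companion computer sending gimbal rate commands. The connection string, flags, and rate values are assumptions for illustration, not final design choices:

```python
# Minimal sketch: companion computer commands gimbal pitch/yaw rates.
# Assumes pymavlink and a MAVLink link to the FC (connection string is a placeholder).
import math
from pymavlink import mavutil

master = mavutil.mavlink_connection('udpin:0.0.0.0:14550')
master.wait_heartbeat()

def send_gimbal_rates(pitch_rate, yaw_rate):
    """Send GIMBAL_MANAGER_SET_PITCHYAW with the angle fields set to NaN
    so the gimbal manager uses only the rate fields (rad/s)."""
    master.mav.gimbal_manager_set_pitchyaw_send(
        master.target_system,
        master.target_component,
        0,           # flags
        0,           # gimbal_device_id: 0 selects the primary gimbal
        math.nan,    # pitch angle: NaN = ignored
        math.nan,    # yaw angle: NaN = ignored
        pitch_rate,  # pitch rate (rad/s)
        yaw_rate)    # yaw rate (rad/s)

send_gimbal_rates(0.0, 0.2)  # e.g. yaw slowly to the right
```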

2. Making the vehicle follow the object
In this part, the drone will follow the target object. ArduPilot already supports calculating the latitude, longitude, and altitude of whatever the camera gimbal is pointing at. Using that, we can easily get the lat, lon, and alt of the target, and feeding that information to Follow mode will make the drone follow the point.
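As a sketch of this idea (assuming the gimbal-pointing calculation has already produced a lat/lon/alt, and that the vehicle's FOLL_SYSID matches this sender's system id), the target location could be forwarded to Follow mode with a FOLLOW_TARGET message:

```python
# Sketch: forward the estimated target location to Follow mode via FOLLOW_TARGET.
# Assumes a pymavlink link to the vehicle; connection string is a placeholder.
import time
from pymavlink import mavutil

master = mavutil.mavlink_connection('udpin:0.0.0.0:14550')
master.wait_heartbeat()

def send_target_position(lat_deg, lon_deg, alt_m):
    master.mav.follow_target_send(
        int(time.time() * 1000),  # timestamp (ms)
        0,                        # est_capabilities: position only
        int(lat_deg * 1e7),       # latitude (degE7)
        int(lon_deg * 1e7),       # longitude (degE7)
        alt_m,                    # altitude (m)
        [0.0, 0.0, 0.0],          # velocity (unused in this sketch)
        [0.0, 0.0, 0.0],          # acceleration (unused)
        [1.0, 0.0, 0.0, 0.0],     # attitude quaternion (unused)
        [0.0, 0.0, 0.0],          # body rates (unused)
        [0.0, 0.0, 0.0],          # position covariance (unused)
        0)                        # custom_state

send_target_position(26.9124, 75.7873, 450.0)  # example coordinates
```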

Here is an ideal message flow diagram which may be helpful for understanding the flow (this may change a bit over time).

On the companion side, the object tracking algorithm will work with a PID controller calculating the pitch and yaw rates. For object tracking, YOLOv8 or OpenCV can be used; both are compatible with development boards that support Linux, like the NVIDIA Jetson series, Raspberry Pi, etc. The basic controller that I proposed will work like this:

dX is the error in the X axis and dY is the error in the Y axis. These will be the inputs to a separate controller dedicated to tracking: dX will be the input for yaw control and dY will be the input for pitch control.
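A minimal sketch of that controller, assuming dX and dY are pixel errors from the image center; the gains, limits, and sign conventions are placeholders that would need tuning on real hardware:

```python
# Sketch: PID controller turning pixel errors (dX, dY) into yaw/pitch rates.
# Gains and limits are illustrative placeholders, not tuned values.
class PID:
    def __init__(self, kp, ki, kd, limit):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.limit = limit          # output clamp (rad/s)
        self.integral = 0.0
        self.prev_err = 0.0

    def update(self, err, dt):
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt if dt > 0 else 0.0
        self.prev_err = err
        out = self.kp * err + self.ki * self.integral + self.kd * deriv
        return max(-self.limit, min(self.limit, out))

yaw_pid = PID(kp=0.005, ki=0.0, kd=0.001, limit=1.0)
pitch_pid = PID(kp=0.005, ki=0.0, kd=0.001, limit=1.0)

def track_step(obj_x, obj_y, img_w, img_h, dt):
    """Return (pitch_rate, yaw_rate) that drive the object toward the image center."""
    dx = obj_x - img_w / 2   # +ve: object right of center -> yaw right
    dy = obj_y - img_h / 2   # +ve: object below center -> pitch down
    return pitch_pid.update(-dy, dt), yaw_pid.update(dx, dt)
```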

What if the object goes outside the image frame?
If that happens, the last known object coordinates will be used to move the gimbal, but only within a fixed time window. If the object is found again, tracking will continue; if it is not found within the search limit, the gimbal will stop and hold at the last location.
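A small sketch of that lost-target behaviour; the timeout value and data shapes here are assumptions:

```python
# Sketch: keep steering toward the last known coordinates for a fixed
# search window, then freeze the gimbal. SEARCH_TIMEOUT is illustrative.
SEARCH_TIMEOUT = 3.0  # seconds to keep searching before stopping

last_seen = None      # (timestamp, dx, dy) of the last detection

def handle_detection(detection, now):
    """detection is (dx, dy) pixel errors, or None if the object was lost.
    Returns the errors to feed the controller, or None to hold position."""
    global last_seen
    if detection is not None:
        last_seen = (now, detection[0], detection[1])
        return detection                   # normal tracking
    if last_seen and now - last_seen[0] < SEARCH_TIMEOUT:
        return last_seen[1], last_seen[2]  # steer toward last known position
    return None                            # stop: hold gimbal at last location
```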

Well, thanks for reading the blog this far. I would love to receive more suggestions and insights; if you want something added, let me know by providing your valuable feedback.

Thank you!


It’s great to read about the GSoC24 Visual Follow-me using AI project.

I have a couple of questions about this (hope to hear from you):

  1. What GCS are you using for part one? Does the latest QGC 4.4.0 support target selection?
  2. How does the drone plan to follow the target: in the X-Y plane, keeping the height at the same level?
  3. Does a Raspberry Pi 3B+ suffice for the job (following the target)?
  4. What hardware configuration are you using for development (I have a Pi 3B+ copter)?
  5. Is there any simulation environment setup guide for testing?

EDIT: Here is the link for the AI module from SIYI: SIYI AI Tracking Module 4T Computing Power Human Vehicle Multi-Target Recognition Anti-Lost


Hello Daniel,
Thanks for reading. This is just an initial post, so I am still determining a lot of the components and approaches; the answers here may change with time. Let me try to answer one by one:

  1. The feature is in master; I will have to check whether it has been released.
  2. Follow mode does have some functionality to control the altitude. In the X-Y plane it will follow a target using just the passed lat-lon, but with an offset defined by FOLL_OFS_X, FOLL_OFS_Y, and FOLL_OFS_Z; you can have a look here. A small parameter sketch follows these answers.
  3. I have to check the performance on the given hardware; however, I have planned to use an NVIDIA Orin Nano. I will definitely check the performance of the model on different boards; I will have to borrow some Pi boards for it.
  4. I am using an Orin Nano.
  5. Yes; as I have discussed with @rhys, we can easily set things up in simulation using Gazebo Harmonic. I will update here once I have an initial patch in place.
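For answer 2, here is a tiny pymavlink sketch of setting a follow offset; the value is a placeholder, and the frame it is interpreted in depends on FOLL_OFS_TYPE:

```python
# Sketch: set a follow offset (e.g. 5 m behind the target) via pymavlink.
from pymavlink import mavutil

master = mavutil.mavlink_connection('udpin:0.0.0.0:14550')
master.wait_heartbeat()
master.param_set_send('FOLL_OFS_X', -5.0,
                      mavutil.mavlink.MAV_PARAM_TYPE_REAL32)
```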

I think SIYI QGC has this function, but I’m not sure if the code is on GitHub.

Please evaluate those Pi boards if you have the time. I hope the Pi 3B can do the job.

Yes, please give an update.

And thanks for your quick response. Hope to see Visual Follow-me in action!


I have been using an AMB82-mini for object tracking on a rover; it has a built-in NPU, so it doesn’t need a companion computer.


Where can I find this?

Hi @geofrancis,

Sorry for the slow reply, I didn’t see your question.

We’ve got the calculation in two places.


I have installed QGC 4.4.0, and I didn’t see any icons related to object selection. Any ideas?

I think AI tracking is somewhat different from FollowMe mode. It might work something like the sequence below (I’m not an expert on this, I’m just learning; if anything is wrong, please let me know). A rough command sketch follows the list:

  1. GCS selects the object detection area // MAV_CMD_CAMERA_TRACK_POINT / MAV_CMD_CAMERA_TRACK_RECTANGLE
  2. GCS sends the selected area through MAVLink to the FC
  3. FC transmits the packet to the AI module
  4. AI module detects the object and gives feedback
    4.1 AI module feeds the result back to the FC
    4.2 FC transmits the packet to the GCS
  5. AI module adds an OSD box indicating the object is detected, which is shown on the GCS
  6. AI module executes the algorithm to pitch/roll/yaw, tracking the moving object
  7. If there is an abnormal situation, such as the object going missing, stop tracking // MAV_CMD_CAMERA_STOP_TRACKING
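For what it’s worth, the GCS-side commands in steps 1-2 and 7 might look like this with pymavlink (point coordinates are normalised 0..1 per the MAV_CMD_CAMERA_TRACK_POINT definition; the connection string and radius are placeholders):

```python
# Sketch: GCS-side tracking commands for the sequence above.
from pymavlink import mavutil

gcs = mavutil.mavlink_connection('udpin:0.0.0.0:14550')
gcs.wait_heartbeat()

def track_point(x, y, radius=0.05):
    """Ask the camera to track the point at normalised image coords (x, y)."""
    gcs.mav.command_long_send(
        gcs.target_system, gcs.target_component,
        mavutil.mavlink.MAV_CMD_CAMERA_TRACK_POINT, 0,
        x, y, radius, 0, 0, 0, 0)

def stop_tracking():
    gcs.mav.command_long_send(
        gcs.target_system, gcs.target_component,
        mavutil.mavlink.MAV_CMD_CAMERA_STOP_TRACKING, 0,
        0, 0, 0, 0, 0, 0, 0)

track_point(0.5, 0.5)  # track whatever is at the image center
```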

OK, the Orin has 40 TOPS, which is better than the SIYI AI module’s 10 TOPS.

Any updates? I hope to set up an environment too, but … … If you have this setup, please let me know, thanks.

Hello! Great article!
I would like to contact you; we have a project for autonomous fire extinguishing from drones.
You could help the project; the problems you describe in your article are exactly the ones I want to solve.
You can write to me by email: kavaneural@gmail.com

This is a public forum; if you want help, then ask publicly. No one is going to work just for you for free.


Of course not for free; I didn’t even think about that.
The task is the same as the one described by the topic starter.
But the drone needs to be automated when extinguishing a fire. For example, it may lose contact when heated and must climb automatically or move away.

@Kot_Lessay I think the autonomous fire extinguishing project you are doing now is more complex than the GSoC24 Visual Follow-me using AI project. Would it be possible to create a new topic for further discussion if you need those experts to help?

There are quite a lot of scenarios which need to be considered:

  1. What type of fire extinguishing equipment should be carried?
  2. In what scenarios (oil, gas, etc.) is it suitable for fire extinguishing?
  3. How efficient is the fire extinguishing?
  4. What is the heat resistance capacity of the entire equipment?
  5. During flight, when there is smoke, what is the effectiveness of visual cameras, infrared cameras, and radar?
  6. How long can the equipment operate while carrying it?
  7. What about fire rescue in environments like narrow spaces in buildings, and issues with dynamic collisions with escaping personnel?
  8. Automatic safe flight route planning?

These are just very preliminary considerations. It is necessary to systematically evaluate and analyze all possible scenarios and parameters before making decisions on product specifications. Extensive evaluation experiments may also be needed (for example, the accuracy of visual, infrared, and radar non-GPS positioning in fire/smoke, and dynamic obstacle avoidance assessment).