Easy way to integrate AI with ArduPilot? OAK-D! (Part 1)

When was the last time someone showed you a camera capable of object detection like this, without using a companion computer or any other external hardware? Today I’d like to introduce a camera capable of this, and much more!

For the past few months, I have had the opportunity to work on a fascinating new camera: the OAK-D series from Luxonis. I have been experimenting with the original OAK-D and the OAK-D-IOT-75 for a while now. In this blog (Part 1 of a series I hope to post), I want to give a short description of each:

OAK-D:

This is essentially an RGB-D Camera (a Stereo Camera + RGB Camera system, much like the Intel RealSense D435i). If you are not familiar with Stereo Cameras, they use two or more cameras placed parallel to each other to generate a 3-D point cloud, i.e. real-world-scale 3-D coordinates for the pixels in the image. This can be used for several applications, like Obstacle Avoidance. This technology isn’t new, and we have been working with the RealSense cameras for a long time.
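To make the stereo principle above concrete, here is a minimal sketch of the standard pinhole stereo relation, depth = focal length × baseline / disparity. The function names and the example numbers are my own illustrative assumptions, not measured specifications of the OAK-D.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth in metres of a point seen with the given disparity (pinhole stereo model)."""
    if disparity_px <= 0:
        # Zero disparity means the point is effectively at infinity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m: float, focal_px: float, baseline_m: float,
                disparity_error_px: float = 1.0) -> float:
    """Approximate depth uncertainty caused by a small error in the disparity."""
    # Differentiating depth = f * B / d gives |dz| = z^2 / (f * B) * |dd|.
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px
```

With an assumed focal length of 800 px and a 7.5 cm baseline, a 20 px disparity corresponds to roughly 3 m, and a 1 px disparity error at that range is roughly 0.15 m. The error grows with the square of the distance, which is one reason depth noise gets worse for far-away objects.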

What’s new about this particular camera is its ability to run AI/Computer Vision algorithms onboard the camera. As you may be aware, most applications in Computer Vision and Drones require a heavy (potentially expensive) Companion Computer onboard. This is a considerable barrier for any user who wants to experience this technology. Well, not anymore! Things like detecting humans, everyday objects, and animals can be done at a relatively high refresh rate inside the camera itself. No powerful hardware is required, just any low-powered USB-enabled companion computer, like an RPi Zero. Even training the camera to detect custom objects for your tailor-made applications is super easy!

What makes this sensor even better for ArduPilot is the fact that the detected objects can be tracked in 3D space via the stereo camera. So in essence, the camera outputs the detected object and its 3D coordinates to the Host (the device the camera is connected to). Alternatives like the RealSense series had to transfer entire image frames to the host, which required USB3 (terrible for EM noise!) and a relatively fast Companion Computer. Other neat features include onboard scripting capabilities (much like Lua scripting for ArduPilot), an inbuilt IMU, and object tracker support via an inbuilt EKF and other algorithms, amongst many more.
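As a rough illustration of what the host can do with those 3D coordinates, here is a minimal sketch that converts a camera-frame position into a vehicle body frame (forward, right, down). It assumes the camera reports x to the right, y up and z forward in millimetres, and that the camera faces forward and is aligned with the vehicle. Both are assumptions to check against your own setup, and the names are hypothetical.

```python
from typing import NamedTuple


class BodyFramePosition(NamedTuple):
    forward_m: float
    right_m: float
    down_m: float


def camera_mm_to_body_frd(x_mm: float, y_mm: float, z_mm: float) -> BodyFramePosition:
    """Convert an assumed camera frame (x right, y up, z forward, mm) to forward/right/down metres."""
    return BodyFramePosition(
        forward_m=z_mm / 1000.0,   # camera depth axis points along the vehicle nose
        right_m=x_mm / 1000.0,     # camera x already points to the right
        down_m=-y_mm / 1000.0,     # camera y points up, body z points down
    )
```

For example, an object reported 2 m ahead, 0.5 m to the right and 0.25 m below the camera centre would come out as forward 2.0, right 0.5, down 0.25. A tilted or offset camera mount would need an extra rotation and translation on top of this.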

I would also like to mention some of the issues that I have faced with the camera. The RGB camera has trouble holding focus in high-vibration environments. Luxonis is addressing this by launching a “fixed-focus” version of the camera in the near future. A comparison of fixed focus vs auto focus can be seen here:

The other issue is that the quality of the Depth Frames isn’t as good as the RealSense cameras. They definitely have more noise. I have found workarounds, though. The Luxonis team plans to launch OAK-D-PRO, which should again improve this.

OAK-D-IOT-75:

This is the EXACT same camera as the OAK-D, except that it has an ESP32 (with WiFi and BT) on board, which has a perfect use case for us ArduPilot enthusiasts! The camera can talk directly to the ESP32, and the ESP32 can be used to interface with the Flight Controller (via the serial port). This means we can finally have Computer Vision/AI applications without needing any Companion Computer! Plug the camera directly into your Flight Controller, with no need for any other equipment!

This is the camera that I have spent the last few weeks integrating and making new example scripts for. If you are interested in my development and integration work, please read the second part of this blog for some interesting applications!

This is really cool
I am going to follow this as object tracking has always been something I am curious about.

Great work

Hi, what does “inbuilt” mean? How is it coded? Can it be modified, and can you write your own code?

Brandon from the Luxonis/OAK team here. It is a hardware-accelerated tracker internal to OAK, with the tracker options listed here:
https://docs.luxonis.com/projects/api/en/latest/components/nodes/object_tracker/

I think we just added more options for the hardware acceleration, but I’m not sure so I’m asking the team.

What is the range of the camera? How high can the drone fly for the camera to detect an object’s position correctly before the results become completely unsatisfactory?

I am hoping somehow somewhere…one could actually assign the object to track. Perhaps it does that now. This is all new to me and most goes over my head but it is fascinating.

Yes, it is possible to do so. You can train it to recognize specific objects of your choosing and track them as well.

Thanks,
Brandon

Thanks Brandon.
What I mean is that somehow, while say the drone is in the air, you can assign it an object to track.
No clue how that would be done but that is what I am getting at.

Hi @rickyg32 ,

I would try a dynamic list of classes, because the object tracker needs a list of object classes to know exactly what must be tracked. For instance, you can have 30 objects in your trained network, but at inference time restrict the tracked list from 30 classes down to 1 or 3, reading that list from a MAVLink command, a database, or something similar.
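A minimal host-side sketch of that idea, assuming detections arrive as simple label/confidence records and that the allowed list is updated from whatever source you choose (a MAVLink command handler, for instance). The Detection and ClassFilter names are hypothetical and not part of any library.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Detection:
    label: str
    confidence: float


class ClassFilter:
    """Keeps only detections whose label is in a list that can change at runtime."""

    def __init__(self, allowed: Iterable[str]) -> None:
        self._allowed = set(allowed)

    def set_allowed(self, labels: Iterable[str]) -> None:
        # Call this whenever a new class list arrives, e.g. from a command handler.
        self._allowed = set(labels)

    def apply(self, detections: Iterable[Detection]) -> List[Detection]:
        # The network still detects every class; only the allowed ones are passed on.
        return [d for d in detections if d.label in self._allowed]
```

The network keeps running with all of its trained classes, and only the filtered detections are handed to the tracking or control logic, so switching targets in flight is just a call to set_allowed.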

Since I’ve ordered an OAK Lite, I’ve started getting familiar with OpenVINO, and I’m disappointed with the ONNX support: the docs state that the supported opset is version 8, which is far behind the current one (15!).

What is the power consumption of the camera? It looks like it has a lot of cooling.

@Luxonis-Brandon

I saw a post a few months ago where you and @rmackay9 were discussing the possibility of enabling this camera with the ability to do both VIO and obstacle avoidance. Is that still in the works?

Yeah, that could be something. IMO MSCKF is the option here. Maybe some parts of the existing KF tracker (gvatrack · openvinotoolkit/dlstreamer_gst Wiki · GitHub) can be reused, or at least taken as an example? To start work on it, it would be good to know all the dependencies between depthai and OpenVINO, and where and how to plug it into the existing pipeline.

Not every OAK-D has an IMU.

DepthAI already has a feature tracker engine; beyond that, I think you need either external software or internal firmware. Much of that external software uses Ceres, g2o, GTSAM…, and I think the one you named falls into that category, so I don’t think it is very likely to run on an OAK camera.
The internal option would be something like the feature tracker/SLAM of the T265.

I read something about the internal firmware option possibly coming soon, but Brandon is here, so I guess he will share any news related to that.

MSCKF (Multi-State Constraint Kalman Filter) is not optimization-based, so doesn’t need any optimization framework. See:
https://www-users.cse.umn.edu/~stergios/papers/ICRA07-MSCKF.pdf

Very nice comment. I’m going to try to be a little more precise. The only reliable implementation of it that I know is OpenVINS, which I use and which is quite robust. It uses Eigen3 and OpenCV, which makes it external software. Obviously you could integrate it yourself into firmware, but I still think an MSCKF integration should be done externally. For example S-MSCKF, which is very light and which I guess is closer to something that could run internally (T265-like), performs very poorly compared with ORB-SLAM3, SVO, OpenVINS or VINS-Fusion. I would not consider using it in a drone. Maybe there is a better MSCKF implementation than OpenVINS, or a similar one in a very small footprint, but I don’t know of it. For me OpenVINS meets the minimum acceptable level of reliability, so I would not in any case use something like S-MSCKF even if it can run internally, just as the T265 is not acceptable to me (as the main positioning sensor in a drone).

I’m still not sure what could be done in OpenVINO on MyriadX, but in the docs OpenCV is mentioned as a library for media processing, and I saw OpenCV functions appearing in the codebase. If using OpenCV is possible, then implementing a VIO frontend and backend would be possible.

https://docs.openvino.ai/latest/openvino_docs_gapi_gapi_intro.html

Thanks Brandon.
What I mean is that somehow, while say the drone is in the air, you can assign it an object to track.
No clue how that would be done but that is what I am getting at.

Yes, this is doable. We could make a demo if you’d like on how to do so. It involves making a UI that lets the operator select the object of interest and then the object tracker would follow it.

EDIT: We could help show how to do this on our Discord if that’s of interest: Discord

Thanks,
Brandon
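For anyone curious what the operator-selection step could look like, here is a minimal sketch under my own assumptions: the tracker reports each tracked object with an ID and a bounding box in pixel coordinates, and the UI reports the pixel the operator clicked. The Track type and select_track function are hypothetical, not a Luxonis API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Track:
    track_id: int
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def select_track(tracks: Sequence[Track], click_x: float, click_y: float) -> Optional[int]:
    """Return the ID of the track whose box contains the click, or None if nothing was hit."""
    hits = [t for t in tracks
            if t.xmin <= click_x <= t.xmax and t.ymin <= click_y <= t.ymax]
    if not hits:
        return None
    # When boxes overlap, prefer the smallest one: it is most likely the intended object.
    smallest = min(hits, key=lambda t: (t.xmax - t.xmin) * (t.ymax - t.ymin))
    return smallest.track_id
```

Once an ID is selected, the follow logic would simply ignore every track except that one until the operator picks another object.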