GSoC 2018: Complex Autonomous Tasks Onboard a UAV using a Monocular Camera


(Sepehr MohaimenianPour) #1

Nowadays, using Convolutional Neural Networks (CNNs) to solve very complex computer vision problems is the new hype in the community: researchers are applying CNNs to problems that are too complicated to solve with handcrafted methods. On the other hand, drones are becoming very popular among researchers. This project sits at the intersection of the two.

In this project, we address the challenges of designing and building a small form factor, GPU-enabled drone that is powerful enough to run a light-weight CNN onboard. This drone will then be used to perform autonomous tasks using a monocular camera, such as navigation and human-robot interaction (HRI).

In designing this UAV, we considered the following criteria:

  • Monocular camera: Although using depth information makes controlling a UAV easier, because of the limitations of such sensors (i.e. depth estimation range limits, operating environment limits for structured-light RGBD sensors, huge data streams, etc.) we are more interested in using monocular cameras.
  • Embedded computation: It is possible to use a powerful external computer to do the computation offboard the UAV and send control commands to the UAV for execution. Our research group has previously used such a method for long-range, close-range, and even hands-free HRI using UAVs. We prefer to move the computation onboard the UAV so that it is self-contained. The best currently available companion computer for this task is the NVIDIA Jetson TX2, equipped with 256 CUDA cores and powerful enough to run a very light-weight CNN.
  • Small form factor: The proof-of-concept project in which a drone was equipped with a Jetson module was NVIDIA’s Redtail drone. Since then, many commercial products have tried to provide a UAV equipped with a Jetson module, but most are extremely big and heavy. Such drones cannot be used easily and freely for research and cannot fly near humans under many countries’ flight regulations. We designed ours to be small and light enough to fall under the light-weight hobby drone category, so it can be used anywhere other small form factor camera drones are allowed, yet powerful enough to carry the Jetson TX2, a high-resolution camera, and any other needed equipment.
  • ROS: ROS is becoming the standard framework for controlling robots in academia and industry. We want our drone to be ROS-enabled, with the ability to connect to it over WiFi and control it using ROS from an external computer if more computational power is needed (see the sketch after this list).
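
To make the last point concrete, here is a minimal sketch of what offboard control over WiFi could look like, assuming a standard MAVROS bridge between ROS and the flight controller (the topic name and rate are MAVROS defaults used for illustration, not this project’s actual code):

```python
#!/usr/bin/env python
# Minimal sketch: streaming velocity setpoints from an external ROS
# computer to the UAV over WiFi via MAVROS. Topic and message type are
# the standard MAVROS ones; the 0.5 m/s forward command is illustrative.
import rospy
from geometry_msgs.msg import TwistStamped

def main():
    rospy.init_node('offboard_velocity_sketch')
    pub = rospy.Publisher('/mavros/setpoint_velocity/cmd_vel',
                          TwistStamped, queue_size=1)
    rate = rospy.Rate(20)  # setpoints should be streamed continuously
    while not rospy.is_shutdown():
        cmd = TwistStamped()
        cmd.header.stamp = rospy.Time.now()
        cmd.twist.linear.x = 0.5  # fly forward at 0.5 m/s
        pub.publish(cmd)
        rate.sleep()

if __name__ == '__main__':
    main()
```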

The first phase of this project is designing and building a UAV considering all the aforementioned criteria. We then use the system to implement and perform some autonomous tasks.

In the NVIDIA Redtail project, the monocular-camera trail following was done using the PX4 firmware on the flight controller. The second phase of our project is to port the Redtail project to ArduPilot.

For our final phase, we are planning to use this system for another vision-based complex task. This phase consists of preparing a dataset, designing and training a CNN for the task, and compressing and optimizing the CNN model for the TX2 module.


Hardware design:

As mentioned before, we are trying to keep the design as small and light as possible, while making sure it is powerful enough to easily lift and carry all the required components for this project, with some slack for extra, unpredicted payloads that may need to be added later.

Here’s the list of selected components for this project:

  • Flight controller: Pixhawk 2.1 (The Cube)
  • Companion computer: NVIDIA Jetson TX2, equipped with 256 CUDA cores capable of running small CNNs. The Jetson TX2 development kit comes with a big breakout board that exposes most of the available I/Os and is good for development and testing purposes; however, that board is too heavy to fly, especially on a small UAV. We used the Auvidea J120 breakout board, which is small and light enough to fit onboard a small form factor UAV while still exposing the I/Os we need for this project.
  • Optical flow sensor: As in the original Redtail project, we use a PX4Flow for optical flow to control the drone in GPS-denied environments. The original 16mm lens of this sensor is replaced with a wider lens for a more stable optical flow; the new lens also has a built-in IR filter to filter out the laser rangefinder’s IR.
  • Lidar: To compensate for the shortcomings of the PX4Flow’s ultrasound sensor in estimating the drone’s altitude, we use a PulsedLight single-point laser rangefinder.
  • Motors: As we want to use small propellers (5"-6"), we need to compensate for the smaller propulsion by increasing the motors’ RPM. Most high-KV motors used in FPV flying are designed for extremely light drones; however, the Tiger Motor F80 2500KV is designed to generate high thrust (up to 1640g on 6045 propellers) on a 4S LiPo while remaining lightweight.
  • ESCs: From the same company, the F45A ESCs are designed specifically for the aforementioned motors; they are very lightweight, have low resistance, and provide 45A continuous (55A burst).
  • Camera: A Logitech C922x Pro USB camera is used for a high-quality, high-FPS video stream, with a built-in H.264 encoder (a minimal capture sketch follows this list).
  • Landing gear: We used DJI Matrice 100 landing gears, which are robust and durable and are equipped with a suspension system to avoid damaging the drone on rough landings.
  • Power distribution board: The battery’s output is monitored and limited by a 200A current sensor, and a 12V/3A regulator is used to isolate the J120 board’s input (although the J120 datasheet states that anything between 7V and 17V is fine).
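
As a quick illustration of the camera bullet above, this is a hypothetical sketch of grabbing frames from the C922x on the Jetson with OpenCV (the device index and 720p/60 settings are assumptions for illustration, not the project’s actual capture pipeline):

```python
# Hypothetical capture sketch for the USB camera using OpenCV's V4L2
# backend. Device index and resolution/FPS are assumptions; the C922x
# advertises 720p at 60 FPS over USB.
import cv2

cap = cv2.VideoCapture(0)  # first V4L2 device on the Jetson
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
cap.set(cv2.CAP_PROP_FPS, 60)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # each frame would be handed to the onboard CNN here

cap.release()
```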

Mechanical design: As mentioned before, I tried to keep the design as small as possible; however, the frame is designed so that the propeller size can be increased up to 8" (with a small modification, such as replacing the GPS) if for any reason we decide to use larger propellers and lower motor speeds.

I have tried to keep the area directly underneath the propulsion disk as clear as possible so that the motors can produce maximum thrust. Although the drone is a bit tall (I couldn’t fit all the components in a single level given the constraints I had in mind, such as the size limit and having the Pixhawk and the PX4Flow’s lens exactly at the centre of the drone), the centre of mass is kept as low as possible.

The 3D-printed vibration damper is unnecessary for the Pixhawk 2.1 and can easily be removed.

P.S.: This is my first mechanical design ever; your comments can help me learn a lot.

I have to thank my mentors for this project, @jmachuca77, @khancyr, and @rmackay9, for their support, patience, and great feedback.


Update log:

  • 06/14/2018: Initial post and idea
  • 06/18/2018: Hardware design

(ppoirier) #2

@Voidminded, a pretty impressive challenge for a summer project; I wish you great success.

As you probably know, Redtail uses a modified optical flow sensor (PX4FLOW) with a wide-angle lens as a velocity estimator, and a rangefinder (LidarLite V3) to hold a fixed altitude that corresponds to the height at which the training images were captured. Are you planning on using the same setup?


(Sepehr MohaimenianPour) #3

Thanks @ppoirier.

For the Redtail port, we are planning to make minimal changes, so the setup is quite the same: I got the PX4Flow module with two wider lenses (3.8mm and 6mm) to experiment with, and I’m using a laser rangefinder (PulsedLight instead of LidarLite, because I already had one) for the altitude measurement.

I’ll update the post soon with design and components details, need to get some renders from the final CAD design first.


(Fnoop) #4

Hi, this is a really interesting project, best of luck! Getting Redtail working on ArduPilot would be fantastic, and having an adaptable system for running CNNs on a UAV would be really great.

One question I would have is: are you planning to use CUDA or other proprietary NVIDIA software to accomplish this? Will it be possible to accomplish your goals using open software/protocols/libraries? Most hobbyists/prosumers can’t afford, or don’t want to be locked into, NVIDIA hardware. There are plenty of capable SBCs out there; even the humble Raspberry Pi is capable of running simple CNNs :slight_smile:


(Andrea Belloni) #5

Maybe a solution could be a Raspberry Pi with an NN accelerator such as the Movidius: https://developer.movidius.com/


(Sepehr MohaimenianPour) #6

Hi @fnoop,

For now, I’m going to stick with the TX2 and NVIDIA JetPack for the CNN computations, though we may later move to smaller platforms and more open-source-friendly/affordable solutions.

In this project, we want to be able to control a drone using only a monocular camera stream as the input. To do so, the network has to run in near real-time. A quick back-of-the-envelope calculation shows that this is not yet possible on a small SBC such as the Raspberry Pi. Let’s consider object detection, for instance:

One of the smallest and fastest object detectors in the literature is the tiny version of YOLO, which needs ~5-6 GFLOP (double precision) per image. The TX2 module can perform ~1.5 TFLOP/s (half precision) or ~48 GFLOP/s (double precision). Theoretically, in the best case, it can run these models at ~8 FPS (not considering the memory and CPU speed, which might become the bottleneck). Although that is not real-time for a 30/60 FPS camera, we can still control a drone at this commands-per-second rate. On the RPi’s GPU, however, we can only perform ~40 MFLOP/s (double precision), which works out to around 150s for a single image.
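
For transparency, here is the back-of-the-envelope arithmetic behind those numbers (a sketch using the rough figures quoted above, not measured benchmarks):

```python
# Back-of-the-envelope throughput estimate from the figures above.
flops_per_image = 6e9   # ~6 GFLOP per frame for a tiny YOLO variant

tx2_dp_flops = 48e9     # ~48 GFLOP/s double precision on the TX2
rpi_dp_flops = 40e6     # ~40 MFLOP/s double precision on the RPi GPU

print(tx2_dp_flops / flops_per_image)  # ~8 frames per second on the TX2
print(flops_per_image / rpi_dp_flops)  # ~150 seconds per frame on the RPi
```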

So, for now, it is not possible to control a drone in real-time using CNNs on monocular images with smaller SBCs. Of course, one could design very light-weight CNN models that run fast enough on a Raspberry Pi, say with an array of laser scanner ranges as the input to the network, but definitely not with images.

But it is definitely an interesting idea to work on in the future and try to push the boundaries.

@anbello I have been following the Movidius for a while now; all I could find about its performance is that it can do up to 100 GFLOP/s (16-bit), but I have no idea about its double precision (64-bit) performance. I’d really like to get my hands on one of those and test it out, though!


(GuyMcCaldin) #7

This looks like it might have been based on my original design, and it is almost certainly unnecessary with the internal damping of the Cube.

It’s actually more likely to create issues than to help. A problem with the older designs, when used with the Cube, is that they create a relatively large moment arm between the IMU at the top of the Cube and the fixtures, which translates rotational acceleration into lateral acceleration. For this reason, I don’t recommend using the design.

It sounds like a very cool project, please keep us updated!


(Sepehr MohaimenianPour) #8

Yes, the design was directly inspired by your previous design. Based on @rmackay9 and @jmachuca’s suggestions, I have removed it from the assembly.

The next update, on assembly and tuning, is coming up.