Project: Rough terrain navigation using Deep Reinforcement Learning Part 1

Hi everyone,

I am an Australian mechatronics engineering student kicking off the project for my undergraduate thesis. I’ll be using ArduPilot Rover to control an eight-wheeled rough-terrain robot and developing a Deep Reinforcement Learning (DRL) algorithm to train the robot to choose the optimal path when navigating challenging environments like rubble.

The plan is to use a standard GPS module to control the robot during normal operation, and a RealSense depth + tracking camera combo to detect obstacles and switch to the machine-learned close-in navigation policy. In this mode the depth camera generates a terrain height map and a roughness estimate, which are fed into the DRL policy along with the robot’s current and goal positions to determine intermediate positions to reach along the path. These positions can then be sent to the mission planner as new waypoints for the flight controller to aim for (using the tracking camera for localisation).
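To make the policy input a bit more concrete, here is a minimal sketch of how the terrain map, roughness factor and positions could be packed into one observation vector. It assumes the depth data has already been converted into a small grid of cell heights in metres, and it uses the standard deviation of those heights as the roughness factor. The function names and the choice of metric are illustrative assumptions only, not the final design.

```python
from statistics import pstdev


def roughness(height_grid):
    """Population standard deviation of cell heights (metres).

    This is only one possible roughness factor, assumed here for illustration.
    """
    cells = [h for row in height_grid for h in row]
    return pstdev(cells)


def build_observation(height_grid, robot_xy, goal_xy):
    """Flatten the height grid and append roughness plus the goal offset.

    The goal is expressed relative to the robot so the policy does not
    depend on absolute coordinates.
    """
    flat = [h for row in height_grid for h in row]
    dx = goal_xy[0] - robot_xy[0]
    dy = goal_xy[1] - robot_xy[1]
    return flat + [roughness(height_grid), dx, dy]


if __name__ == "__main__":
    grid = [[0.0, 0.0], [1.0, 1.0]]  # hypothetical 2x2 height map in metres
    print(build_observation(grid, robot_xy=(1, 2), goal_xy=(4, 6)))
```

A real height map would be much larger and would probably go through a convolutional layer rather than being flattened, but the same three ingredients (terrain, roughness, relative goal) would still make up the observation.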

I am still fairly new to ArduPilot, so any comments or tips on my setup are extremely welcome. The main equipment being used in the project is:

  • 8-wheel robot with motor driver board compatible with ArduPilot Rover signals
  • Pixhawk flight controller
  • GPS module
  • Intel RealSense D435i depth camera
  • Intel RealSense T265 tracking camera
  • NUC companion computer
  • Telemetry radio
  • PC ground controller with MAVLink protocol

This is a general software overview to show how the DRL path planner will fit in with the robot’s operation:

Below is a full wiring diagram of the motor controller board, computer vision system and the autopilot equipment.

Hopefully everything will work as intended. I aim to have the system functioning properly in normal mode (just GPS and Pixhawk) before I start designing the DRL algorithm. I’m sure there will be lots of changes as I begin to piece everything together and test, but I’m looking forward to seeing the performance of the final system.

Up next I will be documenting the assembly process of the robot and the first round of operation tests. Once again, any feedback or questions are welcome, so let me know what you think.



Hi Tom, great stuff. I hope it’s OK, I moved one of the images to the top to make it more attractive when viewed on

Thanks @rmackay9, that’s all good!

Really interesting project and very current.

Would you not be better off connecting both the GPS and the tracking camera to the companion computer and then just sending a single serial stream to the Pixhawk as a GPS signal, so switching between tracking cam and GPS is invisible?
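As a rough sketch of that idea, the companion computer could hold the latest sample from each source and pick one before forwarding a single position stream to the Pixhawk. Everything below is hypothetical: the `PositionSample` type, the normalised `quality` score and the `close_in` flag are assumptions made for illustration, and the code says nothing about how the chosen position would actually be encoded for the flight controller.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PositionSample:
    north_m: float
    east_m: float
    quality: float  # hypothetical score: 0.0 (unusable) to 1.0 (best)


def select_position(
    gps: Optional[PositionSample],
    vio: Optional[PositionSample],
    close_in: bool,
    min_quality: float = 0.5,
) -> Optional[Tuple[str, PositionSample]]:
    """Pick which source feeds the single outgoing position stream.

    GPS is preferred in normal operation and the tracking camera (VIO)
    in close-in navigation; the other source is used as a fallback.
    Returns None when neither source is good enough.
    """
    if close_in:
        order = [("vio", vio), ("gps", gps)]
    else:
        order = [("gps", gps), ("vio", vio)]
    for name, sample in order:
        if sample is not None and sample.quality >= min_quality:
            return name, sample
    return None


if __name__ == "__main__":
    gps = PositionSample(10.0, 5.0, quality=0.9)
    vio = PositionSample(10.2, 5.1, quality=0.8)
    print(select_position(gps, vio, close_in=True))
```

In practice a hard switch like this would probably cause a position jump at the handover, so some blending or an offset correction between the two frames would be needed.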

I think you will have to drop out of autonomous mode and into manual via MAVLink to get close-in navigation. All in good time I guess.

Good luck. Keep us informed on progress.

Perhaps instead of switching between manual and auto mode for close-in navigation while heading to a waypoint, or creating and writing entirely new waypoints mid-mission, you could take advantage of guided mode or the SET_ATTITUDE_TARGET command.
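For the guided-mode route, one piece the companion computer would need is a conversion from the planner’s local output (metres north/east of the robot) into a global target. A minimal sketch is below, assuming a flat-earth approximation that is only reasonable for short close-in hops. The function name is made up for illustration; the scaled-integer output matches the `lat_int`/`lon_int` fields (degrees × 1e7) of the MAVLink SET_POSITION_TARGET_GLOBAL_INT message, but actually sending the message (for example with pymavlink) is not shown.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS-84 equatorial radius


def offset_to_global_int(lat_deg, lon_deg, north_m, east_m):
    """Convert a local north/east offset in metres to (lat_int, lon_int).

    Uses a small-offset flat-earth approximation, so it is only suitable
    for targets a short distance from the current position.
    Returns latitude and longitude in degrees * 1e7 as integers.
    """
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(
        east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg)))
    )
    lat_int = round((lat_deg + dlat) * 1e7)
    lon_int = round((lon_deg + dlon) * 1e7)
    return lat_int, lon_int


if __name__ == "__main__":
    # Hypothetical current position and a target 2 m north, 1 m east.
    print(offset_to_global_int(-35.0, 149.0, north_m=2.0, east_m=1.0))
```

Each intermediate position from the DRL policy could be pushed this way as a single guided-mode target, rather than rewriting the mission.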


This sounds like a really cool project. I’ve used the T265 on both rovers and multirotors for localisation in environments with and without GPS, with quite stunning results. Check this out: Integration of ArduPilot and VIO tracking camera (Part 4): non-ROS bridge to MAVLink in Python - #22 by hugh.

I’ve got plenty of experience with both cameras and companion computers in rovers so don’t hesitate to reach out if you need some help!

Why are you trying to use RL for collision avoidance, a problem that already has plenty of robust methods? DRL will surely be suboptimal here. RL (model-free in your case, I guess) is for use cases where the vehicle dynamics, environment etc. are unknown or so complicated that the only solution is to learn by trying. You’ll have to go through simulation, deal with sim2real issues, and basically recreate the whole environment, the agent behaviour and their interaction (= reinvent the wheel).

Thanks @hugh, I’ll definitely be using that guide to get the tracking camera up and running, and I’ll come to you when I get stuck. Have you used the depth cameras much in your projects?

@soldierofhell I agree there are definitely better and simpler ways to tackle collision avoidance, but there’s also the matter of autonomous path planning to consider, which can be a bit more complex. I think it would be interesting to see how a machine-learned path planner performs in particularly challenging environments (such as a massive pile of rubble where there is no clear optimal path), and it could make for some great discussion in my thesis.

I am also quite interested in the reinforcement learning topic itself and this project is a way to get some practice and set up the framework for a future project I am planning, which is to make a footstep/path planner for a legged robot like a hexapod, and make it navigate through a similar terrain environment.

Hopefully the sim2real issues aren’t too severe. The rover is quite solid and deterministic, so it should be possible to model it realistically in simulation. At this point I’m not even sure how much simulation training will be done, as it might be easy enough to develop a good policy (or improve an initial policy from simulation) with some short real-world training time.

@ausdroid @hugh thanks for the suggestions. I’ll have more of an idea of how communication with the flight controller works once I set up and begin testing the basic autopilot system, and I can then set up the companion computer to handle more of the position and navigation target messages.