Integration of ArduPilot and VIO tracking camera (Part 4): non-ROS bridge to MAVLink in Python

Nilantha_Saluwadana · November 13, 2019, 10:50am

Yes everything was fine and checked it was worked, but after set new parameters RC not working

LuckyBird · November 14, 2019, 9:05am

Hi @Nilantha_Saluwadana, if the problem is somehow related to the parameters I would suggest remove then add the parameters in one by one and see when will the problem happen. Maybe that would shed some light on the issue.

Hi @111182, as far as I know you would have to implement your own solution to switch between GPS-vision for navigation. One suggestion would be to first figure out which data from the GPS would indicate a need to change to vision positioning, then make a software switch in the data flow for the EKF.

Nilantha_Saluwadana · November 14, 2019, 12:19pm

Hi @LuckyBird,

Problem resolves It happened when SERIAL5_PROTOCOL set as 1 because I already used wifi telemetry, instead of that I used USB for companion computer problem solved, I did not change that value.

decades · November 24, 2019, 10:58pm

Hi @LuckyBird,
thank you very much for this extremely informative tutorial. May I ask you a question?

I honestly don’t completely understand your math in the script. I know, that it works completely fine, but I would like to understand, why

I’m especially referring to this part:

            # In transformations, Quaternions w+ix+jy+kz are represented as [w, x, y, z]!
            H_T265Ref_T265body = tf.quaternion_matrix([data.rotation.w, data.rotation.x, data.rotation.y, data.rotation.z]) 
            H_T265Ref_T265body[0][3] = data.translation.x * scale_factor
            H_T265Ref_T265body[1][3] = data.translation.y * scale_factor
            H_T265Ref_T265body[2][3] = data.translation.z * scale_factor

            # Transform to aeronautic coordinates (body AND reference frame!)
            H_aeroRef_aeroBody = H_aeroRef_T265Ref.dot( H_T265Ref_T265body.dot( H_T265body_aeroBody))

especially the last line

            # Transform to aeronautic coordinates (body AND reference frame!)
                        H_aeroRef_aeroBody = H_aeroRef_T265Ref.dot( H_T265Ref_T265body.dot( H_T265body_aeroBody))

Could you please elaborate? Especially the reason for the outer H_aeroRef_T265Ref.dot()?

Thanks in advance

LuckyBird · November 26, 2019, 12:18pm

Hi @decades, thank you for your interest in the code.

First, let’s discuss what we want to achieve before delving into the code. This is a snippet from the blog post, so you can refer to the main discussion in the blog as to why and how the calculations come about.

Our goal is to find the homogenous transformation matrix (4) in the following picture:

A homogenous transformation matrix (referred to as matrix from here on) combines both the rotation and position into one 4x4 matrix:

In short, what we do in the code is first setting up the matrices (1), (2), (3) then multiplying these matrices (dot product of arrays in numpy) in the right order.

Now for the code:

H_T265Ref_T265body corresponds to the matrix (2), which is:
- the homogeneous transformation matrix (H)
- in the reference frame of the camera (T265Ref)
- of the T265’s tracking pose (T265body).
The matrices (1) and (3) are H_aeroRef_T265Ref and H_T265body_aeroBody, respectively, and are defined in the earlier part of the code. You can figure out the meaning of their names in a similar way to (2). Their values depend on the initial orientation of the camera and do not change throughout the operation.
The data variable is the 6-DoF pose data that we obtained from the T265 camera. Once the latest pose is available, we extract the relevant information (position + rotation) and stuff them into the matrix (2), or H_T265Ref_T265body:

# In transformations, Quaternions w+ix+jy+kz are represented as [w, x, y, z]!
H_T265Ref_T265body = tf.quaternion_matrix([data.rotation.w, data.rotation.x, data.rotation.y, data.rotation.z]) 
H_T265Ref_T265body[0][3] = data.translation.x * scale_factor
H_T265Ref_T265body[1][3] = data.translation.y * scale_factor
H_T265Ref_T265body[2][3] = data.translation.z * scale_factor

The quaternion_matrix function creates a 4x4 matrix with the rotation part, then the translation part is added in the last column.
The scale_factor is introduced in part 5 of the series. By default its value is 1 and you can ignore it here.
Finally, we can realize the equation (4)=(1)x(2)x(3) with a concatenated matrix multiplication (dot product). Read from left to right, it’s equivalent to (1)x(2)x(3):

H_aeroRef_aeroBody = H_aeroRef_T265Ref.dot( H_T265Ref_T265body.dot( H_T265body_aeroBody))

decades · November 26, 2019, 12:49pm

Hi @LuckyBird,

Forgive me if I seem like the last fool here. I think I understood what the code does. In this respect, your description coincides with my vague understanding. What I don’t understand is where these initial values come from, how they came about and what they actually mean

Let’s talk about ‘H_aeroRef_T265Ref’ and ‘H_T265body_aeroBody’: Sure, the latter is the inversion of the former. So far so clear and understandable in the code. But where does H_aeroRef_T265Ref come from and what does it mean? I think, if I have understood that, the rest becomes clear to me Or I have new questions

Thanks so far

decades · November 26, 2019, 12:50pm

BTW: You seem to be someone who is very precise in his work and follows it with enthusiasm. I like that very much. Thanks for that. Not too self-evident these days

ppoirier · November 26, 2019, 1:00pm

If I may suggest you to look at the source of this code, this is from Intel RealSense repo. and this is a transformation matrix from the T265 camera frame that is not a standard camera frame (it is a VR frame):

github.com

IntelRealSense/librealsense/blob/master/wrappers/python/examples/t265_rpy.py

#!/usr/bin/python
# -*- coding: utf-8 -*-
## License: Apache 2.0. See LICENSE file in root directory.
## Copyright(c) 2019 Intel Corporation. All Rights Reserved.

#####################################################
##           librealsense T265 rpy example         ##
#####################################################

# First import the library
import pyrealsense2 as rs
import numpy as np
import transformations as tf
import math as m

H_aeroRef_T265Ref = np.array([[0,0,-1,0],[1,0,0,0],[0,-1,0,0],[0,0,0,1]])
H_T265body_aeroBody = np.linalg.inv(H_aeroRef_T265Ref)

# Declare RealSense pipeline, encapsulating the actual device and sensors
pipe = rs.pipeline()

This file has been truncated. show original

decades · November 26, 2019, 1:22pm

@ppoirier Hi Patrick, nice to meet you again. Thanks for the suggestion. My particular problem is to understand what it means I see it, but I don’t currently get it…

If I put the rotation matrix contained in

H_aeroRef_T265Ref = np.array([[0,0,-1,0],[1,0,0,0],[0,-1,0,0],[0,0,0,1]])

into this calculator (in order to understand, what it means)

https://www.andre-gaschler.com/rotationconverter/

then it says, there is a rotation around x by -90 degrees and a rotation around y by -90 degrees too. Is that correct? The rotation of the aeronautic reference frame into a T265 reference frame? Is it that?

If so, then I would understand, how one can come from the aeronautic reference frame (x forward, y right, z down) to the T265 reference frame (x right, y up, z backward) by rotating first x RIGHT (if -90 is a right rotation) and then the new y RIGHT by 90 degrees too. This would make sense, if it would be so. Is it so? And the inverse would surely be the other way around, the rotation of the T265 reference frame into an aeronautic frame.

Please confirm if my understanding is correct.

LuckyBird · November 26, 2019, 1:42pm

@decades so I misinterpreted your question a bit.

Your interpretation of H_aeroRef_T265Ref is correct. We need a transformation such that aeroRef will coincide with T265Ref. Then H_T265body_aeroBody is the inverse.

Do note that the x and z axis of T265Ref frame is defined at the moment the system starts, at stated here.

Also you should definitely check out the link from @ppoirier 's comment above as it is the original code that I based my code upon.

ppoirier · November 26, 2019, 1:45pm

And be careful with the online tools as the references may vary, in this script we are working with https://pypi.org/project/transformations/ and you have to use the references and methods.

Fun Fact; during development I was using an outdated version of this transformation library and I lost 3 weeks and part of my sanity …

decades · November 26, 2019, 2:11pm

@ppoirier Thanks. I can empathize with that. I’ve never felt as stupid as the last few days when I tried to understand that. There are so many explanations in the net, until you have the eye, which ones are right and which ones aren’t, you can be ready for the loony bin

decades · November 26, 2019, 2:20pm

@LuckyBird Thank you so much. You make my day. Yes, I know this example. In fact, I wrote a PR for librealsense that does the whole thing with a simplified approach. On the one hand, because I had no idea what was actually happening in the original example, on the other hand, because I needed it for C++ and there is a lack of a number of the libs you used. My approach was a purely deliberational one using three glued together colored crayons

Here is my solution.

decades · November 26, 2019, 2:46pm

One additional question: You mean this part with your comment regarding the x axis, right?

When T265 tracking starts, an origin coordinate system is created and RealSense SDK provides T265 poses relative to it. World’s Y axis is always aligned with gravity and points to the sky. World’s X and Z axes are not globally set, but determined when tracking starts based on the initial orientation of the T265 device with the initial -Z world axis as the projection of the camera axis to the ground plane. This does mean that the initial yaw can seem random when the device is started in a downward facing configuration, say on the drone. All T265 (and librealsense) coordinate systems are right-handed.

How would that T265 reference frame look like, if the cam starts pointing downwards? Y is always towards the sky, but where would T265 x and z pointing too? I’m currently not having the imagination power to see this TIA

LuckyBird · November 26, 2019, 3:28pm

For downward-facing case, the yaw of the T265 reference coordinate will be random (see here).

The current solution to obtain a somewhat consistent yaw is to tilt the nose of the vehicle up a little bit (so that the camera is not completely flat) when the software starts running. After that, the T265 reference frame will be similar to the forward-facing case.

The H_T265body_aeroBody is, however, not simply the inverse of H_aeroRef_T265Ref anymore, but can be calculated by adding one 90 deg rotation around the x axis to the one in forward-facing case.

decades · November 26, 2019, 4:50pm

Yes, I have read about the tilting trick already in the librealsense forum. Thanks for the explanations.

Thanks again for your patience and your work. Hope to see you again somewhen somewhere.

decades · November 26, 2019, 10:52pm

OT: I’m having a little other problem in the same context. Would you allow me to ask you for a pointer?

LuckyBird · November 27, 2019, 1:14am

@decades Sure. Ask away!

decades · November 27, 2019, 9:01am

Thank you for your interest. Well, the following situation: Imagine, there would be a T265. This starts at P(x=0,y=0,z=0) in the T265 reference frame. At a certain point in space and time it has reached the coordinate M(x=mx, y=my, z=mz). What I want to achieve now is that for all following coordinates provided by T265 the point M is the new coordinate origin. From the consideration I imagine that I would only have to subtract the fixed translation tM and rotation rM of M related to the T265 reference frame from each new point Q in the T265 reference frame. At least I’m sure about the translation component. So tQ’ = tQ - tM should bring me the new translation tQ’ for Q’, now related to point M. I’m not so sure about the rotation or I’m making the same mistake again and again. I think I would have to remember the rotation rM of M over all axes related to the T265 reference frame and then subtract it from the rotation angles rQ to reach rQ’ as well. rQ’ should then represent the rotation of Q related to M. Is this correct? If so, would there be an elegant quaternion operation that determines this delta and applies it subtractively?

I guess, the transformation of rQ’ into NED/Euler aeronautic could NOT use the known transformation matrices and multiplications anymore in the end? So it appears to me better to first transform all coordinates into the NED frame to make the math there?

I hope I was able to present the problem.

LuckyBird · November 27, 2019, 1:27pm

Here are my 2 cents, hope this time I understood your question right:

To find the coordinate of a new point in the new frame, I believe what you described, i.e. subtract the translation and angles, should help you achieve the desired result.

The representation of the data plays a big role in this. Treating the translation and rotation separately would not be the least troublesome solution in my opinion.
What I would recommend is to use the 4x4 homogenous transformation matrix, so both the translation and rotation are dealt with at the same time.

https://i.imgur.com/oirL2NE.png

Following your notation, we will need to multiply the inverse of the matrix at M with the matrix at Q to find Q’ - the relative translation and rotation of Q in the frame defined at M. The order of multiplication is important.

    matrix_of_Q_in_M = matrix_of_Q_in_P * inverse(matrix_of_M_in_P)

Finally, the translation and rotation can be extracted from the output matrix matrix_of_Q_in_M .
Now you can implement it with your favorite language. First stuff the data in a 4x4 array/matrix, then perform multiplication in the above order. In C++ this can be done quite easily with Eigen library. In Python the linalg.inv() in numpy is what I am using.

If all of the transformation between the frames are known, I think performing the calculation before/after converting to aeronautic frame can be handled so long as everything is kept tidy and every line of code is correct. This might require a level of sanity that I don’t possess, though.