Pixhawk Flyaway, multiple times in a row

Flying a TBS Discovery Pro with AC 3.2, GPS/compass: uBlox NEO-M8N

Today I experienced, for the first time, multiple repeated flyaways on my otherwise very stable Pixhawk.

-54 flights, with this combination of components so far
-7 of those yesterday without issues

The only changes since yesterday: I reduced the Stabilize Roll/Pitch values from 13.0 to 7.0, as I briefly noticed some large oscillations at high speed in Loiter yesterday. I also reduced RC feel from 40 to 25 for smoother video.

Today I went into loiter for around 2 minutes to confirm control responses and stable GPS lock, then started a mission. The quad immediately and unexpectedly flew off in the wrong direction.

After it flew 100m away, low and fast and showing no signs of recovering, I flicked into stabilize, regained control and flew it back without any issues.

I recalibrated the compass on the spot, also did Compass/Motor calibration and took off again to retry.

I put it in Loiter. It was in a nice, stable hover at first, then started randomly jerking/tilting a few times per minute. It would also drift, usually towards the east, regardless of its own orientation.

Then I ended the flight by testing RTH. Strangely, it returned and landed exactly at its home location without drifting or other issues.

It doesn’t seem to be a GPS, vibration or mechanical issue. Can a bad compass cause this behaviour?
I can’t seem to get EKF3 IMX/IMY/IMZ to align as recommended, even when performing calibration outdoors…

Very thankful for any advice that anyone has… log of the first flyaway and the second retry is attached.

Just noticed that EKF4 SP (GPS position inconsistency) is totally out of bounds, supposed to be below 1 but in my case it jumps between 300 and -300.

Also, whenever in loiter, NTUN Desired PosX/Y spikes at a very regular rate, looks like every 10 seconds…?

I have no idea what caused this - as mentioned above, RTL worked, and the KMZ file is showing a clean, smooth track, which includes the actual flyaway, so the GPS seems to be working fine.

In this case it should be the IMU that is providing false data, but both IMUs are logging similar values, so that can’t be it either…?

Here is a video of the incident:

0:09 Activate Auto flight mode
0:19 Flyaway occurs during right yaw
0:26 Recovery with switch into Stabilize

[youtube]https://www.youtube.com/watch?v=rqN2j-gERXQ[/youtube]

Ok, so you’re using the new M8 GPS, I noticed the excellent HDOP and #Sats. This may be relevant, we’ll see.

Vibrations look good. Dynamic control looks good.

There’s some definite odd-ness in the NTUN messaging on the first log. Desired velocity of 40,000,000 m/s, but that might just be an error from the .log conversion process (native .bin logs from the card are best). DPosX and DPosY also have some weirdness. Unfortunately, there is no Mag data.

Second log: again, a lot of strangeness in the NTUN messages, though these are .bin, so it wouldn’t appear to be log corruption. There are lots of crazy spikes visible in DPosX and DPosY, and in PosX and PosY. This is what’s driving the crazy desired accelerations and resulting in the jerking you’re seeing.

I’ll elevate this to Randy.

Unfortunately my curiosity got the best of me and the quad crashed while I was attempting to collect more data with the “Nearly All” log bitmask.

It seems the issue is a bug caused by my flight plan.
It is triggered when switching to Auto.
It affects all flight modes with GPS thereafter until a restart of the board.

Mission Planner version: 1.3.15.3 build 1.1.5453.27745

Flight 1:
Tested Loiter & RTL for a few minutes - stable hover & normal flight, no issues. After switching to Auto, quad flies away in the same direction as yesterday. Recovery with Stabilize. Switched to Loiter, quad starts drifting and jerking regularly. PosHold and RTL are also affected.

Flight 2:
After landing, replacing battery and taking off again, quad is stable in Loiter, no issues.
Switch to Auto -> Flyaway. Attempted recovery with stabilize, quad started tumbling and falling. It sounded and looked like it was trying to stabilize itself, but failed. It crashed upside down.

Files:
Flight 1: dl.dropboxusercontent.com/u/895 … -12-59.bin
Flight 2: dl.dropboxusercontent.com/u/895 … -27-56.bin
Flight plan: dl.dropboxusercontent.com/u/895 … n.txt?dl=1

Thanks for the report. You’ve certainly stumbled upon a bug although it’s a bit of an edge case.

The short answer is that the navigation controller doesn’t handle the do-set-home mission command properly. What happens is that it starts trying to fly to a location which is the target location (i.e. waypoint) plus the position difference between its arming location and the location specified by the do-set-home command.

Now in the mission extracted from your logs it looks like the do-set-home command has all zeros for the lat, lon and alt. You might think that this would initialise the home location to the current location but according to the spec, that is accomplished by setting the first parameter to “1”.
So in fact, this command is setting the home location to be off the west coast of Africa.
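A mission file can be checked for this before upload. The following is only a rough sketch: it assumes the plan is saved in the plain-text "QGC WPL 110" waypoint format (twelve whitespace-separated fields per item) and that do-set-home is MAVLink command 179. The function name and the sample mission are made up for illustration and are not taken from the attached flight plan.

```python
MAV_CMD_DO_SET_HOME = 179  # MAVLink command id for do-set-home


def find_zero_set_home(mission_text):
    """Return indices of do-set-home items that specify a (0, 0) location.

    Assumes one item per line with the fields:
    index, current, frame, command, param1..param4, lat, lon, alt, autocontinue
    """
    suspicious = []
    for line in mission_text.splitlines():
        fields = line.split()
        if len(fields) != 12:
            continue  # skips the header line and anything malformed
        index, command = int(fields[0]), int(fields[3])
        use_current = float(fields[4])  # param1: 1 means "use current location"
        lat, lon = float(fields[8]), float(fields[9])
        if (command == MAV_CMD_DO_SET_HOME and use_current != 1
                and lat == 0 and lon == 0):
            suspicious.append(index)
    return suspicious


# Hypothetical mission, for illustration only.
SAMPLE_MISSION = """QGC WPL 110
0 1 0 16 0 0 0 0 47.000000 8.000000 500.000000 1
1 0 3 179 0 0 0 0 0.000000 0.000000 0.000000 1
2 0 3 16 0 0 0 0 47.001000 8.001000 30.000000 1
"""

print(find_zero_set_home(SAMPLE_MISSION))
```

Any index it reports is a do-set-home item that would move home to lat 0, lon 0 rather than to the current position.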

These two things together meant that the copter was trying to fly off many thousands of kilometers from your current location.
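To get a feel for the size of that offset, here is a small sketch that computes the great-circle distance between an arming location and lat 0, lon 0 using the haversine formula. The arming coordinates are a made-up placeholder (the real location is not stated in this thread), and the Earth radius is the usual mean-radius approximation.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical arming location, NOT taken from the logs.
arming = (47.0, 8.0)
zero_home = (0.0, 0.0)  # what an all-zero do-set-home ends up specifying

offset_km = haversine_m(*arming, *zero_home) / 1000.0
print(f"home offset: {offset_km:.0f} km")
```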

I’m going to update the “Mission Command” wiki page and we will fix the underlying bug for AC3.3.

Thanks again for the report!

Thank you so much for the quick analysis, bug find, and rapid GitHub report, this is very much appreciated!

There are three unanswered questions though:

  1. Yes, the flyaways for RTL were towards SW, which is in the general direction of Africa from here. But the initial flyaway caused by Auto was towards the opposite direction NE?

  2. How is it possible that this issue can afterwards affect unrelated flight modes Loiter and PosHold (drifting and jerking)?

  3. How did the last switch into Stabilize to recover cause instability and a crash? (see end of log of Flight 2)

It seems as if the wrong home location confuses the controller to the point where it doesn’t function properly until restarted.

See this video for examples (from Flight 1 a few hours ago):
youtube.com/watch?v=YeQhQZOXBTU

04:22 Activated Auto - Flyaway (heading NE)
05:02 Activated RTL - Flyaway (heading SW)
07:14 Loiter - jerking & slow drift towards NE

Mark,

I can’t definitely answer all three questions but here are my guesses:

  1. it’s possible for the RTL and waypoints to fly off in opposite directions. The reason is that RTL uses a (0,0) offset-from-home as its target while the waypoint uses a lat+lon location that’s converted into an offset-from-home. The bug could easily be in the conversion of the lat/lon to the offset-from-home.

  2. moving the home position very far from the vehicle’s current location could cause inaccuracy in the position control because we use four byte floats. That means we have 7 or 8 digits to record the position as an offset-from-home. If the vehicle is very far from home, most of those 7 or 8 digits are consumed.
    I.e. if the vehicle is about 1m from home, the position controller could conceivably measure that it’s 1.000001m from home (i.e. accurate to 0.000001m). But if it’s 32,000km from home then it can only measure its position as 32000.01, meaning that it’s only accurate to 0.01m. In fact, we measure the position from home in cm so it’s 100x worse than my example above.

  3. the failure in stabilize looks unrelated to me. From the logs we can see the desired roll and actual roll diverge, which is just what we see in cases of an ESC or motor failure. I.e. the flight controller wasn’t commanding the vehicle to roll over.
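The precision point in answer 2 can be made concrete with a short sketch that measures the gap between neighbouring four-byte (IEEE 754 single precision) floats at a given magnitude. It is not the autopilot's code; the helper names are made up, and the only assumption carried over from above is that position is stored as a float offset from home in cm.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest four-byte float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_step(x):
    """Gap between x (as a float32) and the next larger float32.

    Valid for positive, finite x only.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    next_value = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_value - to_float32(x)


near_home_cm = 100.0   # 1 m from home, expressed in cm
far_home_cm = 3.2e9    # 32,000 km from home, expressed in cm

print(float32_step(near_home_cm))  # 2**-17 cm, well below a micron
print(float32_step(far_home_cm))   # 256.0 cm
```

At 32,000km from home the smallest representable position change is about 2.5m, which would be consistent with the jerky position targets seen in Loiter.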

Thanks for the answers, very interesting. I’m starting to realize the complexity of the software and that there’s a lot to learn.

I still suspect the crash was due to software - I just ran all motors for a minute, also at full throttle - no vibrations/strange sounds, no loose connectors, no visual damage to coils, ESCs are at a normal temperature, nothing to indicate a mechanical failure. I’ll actually continue using them.

Only the gimbal roll motor is not functioning anymore :slight_smile:

In the log of Flight 2 RC Out channels 1-4 all converge to an identical path exactly at the moment where the quad started falling to the ground:

Aren’t they supposed to be wildly active, indicating attempts to stabilize itself?
Also, all channels seem to remain at 1130, as if there are no attempts to stabilize at all right before impact…

Is this because the flight controller possibly went “bonkers” due to the bug?

Thanks again to everyone for your time. Despite the complexity and the multiple crashes, I’m very happy I switched from NAZA and am enjoying the freedom of being able to configure the quad exactly how I want it.

Mark,
That’s a nice catch, I hadn’t noticed the motors all went to the same value. That’s very strange and it very likely means the software was the problem. Perhaps it was an arithmetic exception propagating through the system. We will put some more error checking in there to catch this.
Sorry for the mis-diagnosis.

Mark, I’ve seen that before, where the motor outputs all converge. Actually, it happened to me on a helicopter. It was the exact same thing actually. If you plot the RC Outputs vs. the CTUN/ThrOut, you’ll see that the 4 RC Outputs basically track the ThrOut. So what has happened is the stabilization routine has failed, and only the throttle controller is still running.
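Besides plotting, the same symptom can be looked for numerically. This is just a sketch: it flags log rows where all four motor outputs sit within a few PWM counts of each other, which is what you would expect if only the throttle controller were still contributing. The function name, the tolerance, and the sample values are all made up for illustration and are not taken from the attached logs.

```python
def outputs_converged(rc_out_rows, tolerance_pwm=5):
    """For each row of motor outputs, report whether max - min <= tolerance_pwm."""
    return [max(row) - min(row) <= tolerance_pwm for row in rc_out_rows]


# Hypothetical PWM samples for RC Out channels 1-4.
samples = [
    (1480, 1525, 1460, 1540),  # normal: outputs differ as the copter stabilizes
    (1390, 1610, 1420, 1585),
    (1130, 1130, 1130, 1130),  # all outputs identical: no attitude correction
    (1130, 1130, 1130, 1130),
]

flags = outputs_converged(samples)
first_flat = flags.index(True) if True in flags else None
print(first_flat)  # index of the first row where the outputs converge
```

A brief convergence can also happen in normal flight (for example at minimum throttle on the ground), so a sustained run of flagged rows while airborne is the thing to look for.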

In my case, it was a bug in the Acro mode way back in the 3.2 dev-cycle, which was fixed. I imagine your case here had a different root cause, but the same outcome.

And actually, it was a pretty similar situation. I was trying to figure out another problem with Acro, and I just kept testing over and over, until eventually the process got broken like this and it crashed.

In that case, we never could figure out exactly how the problem propagated to the stabilization controller, but the fix of the original Acro problem resulted in the crash not reoccurring.

So, thanks for the bug find, sorry about the crash. But you have helped make the program better.