RTL freakout failure and crash aka Flyaway

So after beating up the Naza guys over my flyaway with that unit, now I have a Pixhawk flyaway story too. Difference is I have log files this time. I am really hoping the outcome with support and customer service will be different this time around. We shall see.

Attached is my logfile.
[attachment=0]33.BIN[/attachment]
[attachment=1]33.log[/attachment]

I also have video but have not uploaded it yet. I was able to extract some exact times that events occurred though. Here is some explanation of what was going on at the time.

Airframe = TBS Discovery Pro quad, motors and ESCs, Pixhawk Arducopter V3.1.1.
The purpose of the flight was to test RTL mode with radio failure, so I planned to turn off the Tx (FrSky Taranis).

Powered up and waited for all green flash on Pixhawk.
11:37:46 Armed Pixhawk.
11:37:53 Motor start.
Took off in STAB mode, flew for about a minute.
Switched to LOITER mode, flew for about a minute.
11:39:44 Moved to far side of our runway, about 5 feet above the ground, and turned off the Tx.
Aircraft began to climb to about 40 ft (I forgot what this is programmed to go to).
11:39:52 Aircraft stopped climbing, and began moving towards launch point.
11:40:05 Aircraft arrived over launch point, stopped, and began descent.
11:40:10 Aircraft began to depart rapidly to the southwest, and slightly descending.
(see #6 below for additional notes on what I tried on the RC radio)
11:40:17 Aircraft impacted terrain. All passengers were killed. Broken gimbal frame, gopro 3 black, motor arm. Estimate so far is about $600 in damage.

I’ve read the log files as much as I understand. I’ve looked for the common issues, but I see none. When filtering on “NTUN” I see some suspicious data:
1] The names for the values do not match the documentation
2] At 11:40:00, the DposX and DposY, which I am assuming are DesiredPositionX and DesiredPositionY (these are not documented) go to 0 and stay there for about 10 seconds. The moment the data returns to non 0 values is the same moment the flyaway started. From there you can see it accelerating towards some unknown point.
3] GPS HDOP was not perfect, but was improving and had not changed during the RTL. Also the map plot shows the actual flight path that I observed, so the Pixhawk knew where it was.

Some other facts that I don’t think mattered, but you never know:
4] I had a new 5.8GHz video TX on board, although I was not using it, it was powered. Since GPS was reasonably OK and the 2.4GHz RC was close by, I don’t see any issue here.
5] GoPro was not in Wifi mode
6] I did turn the RC TX back on around 11:40:00, not exactly sure but it was when the aircraft was in RTL and headed back to home. I was attempting to call off the RTL and return to LOITER, but honestly I do not know what the behavior should have been after the radio loss event. The radio booted and I flipped the switch from LOITER to RTL and back to LOITER, but it had no effect. So I decided to let it fully land.
During the flyaway event, the Tx was still on, and I attempted to select STAB, LOITER, and RTL, with no effect.

So that’s it. Not sure a flyaway can be better documented than that, but feel free to ask any questions. I can UL the video later if anyone thinks it will help. Thanks in advance for any help.

Wow, you have some real weirdness going on with this one. In general, it was behaving the way it was supposed to. When you switched off your radio it initiated an RTL. It climbed to 15 meters (the RTL altitude you had set), then it started off to where it THOUGHT home was, and it looks like it was starting its landing descent when it hit the hill, which was higher than its launch elevation. So it looks like it was a GPS issue. Sort of. (Vibration wasn’t an issue; the levels all look great.)

Here’s the deal…When you lifted off you had an HDOP of 3.5. That really sucks. But from what I can see of your parameters, your HDOP_Good is set to 2.00, AND you have your arming check enabled. So as far as I can tell, your Pixhawk should have never permitted you to arm with such a lousy GPS lock (you had just barely acquired 6 satellites). Why it did I don’t know.
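To make the expectation above concrete, here is a minimal sketch of the kind of gate an arming check would apply. It is not ArduCopter's actual pre-arm code: the function name `gps_ok_to_arm` and its parameters are hypothetical, with `hdop_good` standing in for the HDOP_Good setting and `min_sats` for the minimum satellite count.

```cpp
#include <cassert>

// Hypothetical, simplified pre-arm GPS gate (illustration only, not firmware code).
// hdop:      reported horizontal dilution of precision (lower is better)
// num_sats:  satellites currently used in the fix
// hdop_good: assumed stand-in for the HDOP_Good threshold
// min_sats:  assumed stand-in for the minimum satellite count
bool gps_ok_to_arm(float hdop, int num_sats, float hdop_good, int min_sats)
{
    assert(hdop_good > 0.0f && min_sats >= 0);  // sanity-check the thresholds

    if (num_sats < min_sats) {
        return false;  // too few satellites
    }
    if (hdop > hdop_good) {
        return false;  // position quality worse than the threshold
    }
    return true;
}
```

With the numbers quoted in this post (HDOP 3.5, 6 satellites, threshold 2.00, minimum 6 satellites) a gate like this would refuse to arm, which is why the successful arming looks odd.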

Curiously, the first recorded GPS location was your precise liftoff point. That GPS point seems accurate. I would have thought that if you were getting bad GPS data, it would have been at the hill where it was headed after the RTL, where it thought its launch point was.

Perhaps a Dev could have a look at this, and why it allowed arming and flight with such marginal GPS quality. That’s a bit out of my somewhat limited knowledge pool.

Your VTx shouldn’t be an issue.

If you call the RTL event a three legged sequence of up, over, then down, it did the first two legs perfectly and on the expected course. Then right after it started the third leg, it took off towards the hill. By the video it looks like it was descending at the proper rate, but it was also speeding towards the SW at 15mph.

The altitude drop is easier to explain: you had a speed of 15 m/s (54 km/h). That is very fast for a multirotor, and probably more forward tilt than it can hold altitude at. Also, if you had any headwind, that would make it even more difficult.
Your WPNAV_SPEED is 5 m/s, so I do not know why it got to 15 m/s.

AFAIK current ArduCopter does not prioritize maintaining altitude over reaching the desired speed (needs a fix).

I would say very bad compass, but it travels in a straight line, which is strange.

Hope a developer can take a look at this.

Hi guys, I’ll try to dig into this one early next week if nobody else has yet. Trying to get some flying in while I can! :slight_smile:

Yes, can’t understand either why it let you arm and takeoff.

Just a note in that respect. I find the 3D Fix indication on Mission Planner a little misleading in the sense that it gives one the impression that the GPS is ready to go.

You have 6 set as minimum number of sats, but as mentioned already, you should require at least an HDOP of 2 or less to arm. That usually means waiting a good 3 minutes, maybe even more if the GPS does a cold start. One of the first things I do when I arrive at the field is put the juice to the quad, then putz around with all the other stuff in the meantime while I wait.

Manual flight, IF GPS pre-arm checking is disabled, should be possible, but it doesn’t mean you are ready for any GPS assisted modes.

I see you have the options of Stab, Loiter, RTL and Circle(?) available on your switches.

However, once you turned off the radio, you never did get it back. So any attempts to take over would not have been seen.

My Fly-Sky FS-TH9X, aside from being several levels of functionality below the Taranis, behaves quite similarly to it in a number of ways. When I turn my TH9X back on, it usually presents me with a number of alarms, and it doesn’t actually transmit until the alarms are bypassed.

With the radio back on, I’m sure you could have gone to stab and saved it.

I have a Taranis, but haven’t gotten around to using it yet so can’t speak for how it reinitializes. Give that a check. I have never done an RTL test by switching off my TH9X, because it takes too long to get the Tx back in commission after turning it off.

I find this one fascinating so I’ve been looking at it some more. I’m thinking probably not a compass issue as the declination value looks appropriate for the flight location and compass offsets look reasonable.

From the attached plot you can see the problem appears at line 23410. In the NTUN parameters the DPosX and DPosY (desired X and Y positions) suddenly start going negative. They are the red and green lines. The quad is thinking the distance between where it is and where it needs to be is increasing. This is reflected in the increasing speed, shown by the blue line.

This is extremely odd because the GPS position at the time is almost over the launch point. There are 8 satellites with a decent HDOP of 2.3. And up until the point the quad changes its mind (at line 23410), it was thinking it was exactly where it needed to be.
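The check described here (find the first log row where desired and actual position stop agreeing) can be sketched roughly as below. This is only an illustration: the `NavSample` struct and its field names are hypothetical, and it assumes the desired and actual positions are logged in the same frame and the same units (centimetres here), which the log documentation does not confirm.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-row sample; field names are assumptions, not the real log schema.
struct NavSample {
    float dpos_x;  // desired position X (cm, assumed)
    float dpos_y;  // desired position Y (cm, assumed)
    float pos_x;   // actual position X (cm, assumed)
    float pos_y;   // actual position Y (cm, assumed)
};

// Horizontal distance between where the controller wants to be and where it is.
float horizontal_error_cm(const NavSample& s)
{
    return std::hypot(s.dpos_x - s.pos_x, s.dpos_y - s.pos_y);
}

// Returns the index of the first sample whose error exceeds limit_cm, or -1 if none does.
int first_divergence(const std::vector<NavSample>& log, float limit_cm)
{
    assert(limit_cm >= 0.0f);
    for (std::size_t i = 0; i < log.size(); ++i) {
        if (horizontal_error_cm(log[i]) > limit_cm) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Run over the NTUN rows, a scan like this would point at the row where the target starts running away from the vehicle, which is the moment worth comparing against the other log messages.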

If this were a Naza you’d have no clue why the flyaway happened and would be going “WTF??” Instead you can see exactly what happened…and still end up going “WTF??” At least we know it wasn’t headed to China. The bearing has it headed to Australia. Or possibly the 3DR offices in San Diego.

@Andre-K: The 15m/s comes from WPNAV_LOIT_SPEED, which I have set up to 15m/s, so it basically performed as expected in that regard.
Also from the CTUN line 23397, the DAlt (Desired altitude) went to 0, which is consistent with the RTL sequence of events.
To me it looks like it thought it was doing a normal RTL landing, however instead of holding its position, it seems like it suddenly was trying to correct its position to a quickly moving target.

@SkyHawkDP: I think you are correct that the Taranis may have not been happy with switch positions, and therefore not started transmitting. The switch position for LOITER is middle position of a 3 position switch, so I may have had it there during boot, and honestly I wasn’t listening to the TX voice much, I was focused on the aircraft. I know I did cycle the mode switch as well as the other switch I have for RTL override, however maybe I never satisfied the switch position warnings.
Either way, I see no change in RCIN values to show that it got anything after I initially turned it off. I kind of concluded something like that too, so since the radio seems to have not given it any commands, I assume it should have completed the RTL.

Looking forward to your analysis Rob_Lefebvre. Enjoy your flying today!

@OtherHand: I think you are on the right track, and yea, same WTF feeling since I’ve had flyaways on both brands, but there is a glimmer of hope here to get to the root cause.

Video is here:
vimeo.com/93915783
Not sure this is anything that the log file doesn’t already show. I did my best to sync the timer with the log file.

Very nice gophercam!

Actually that was a very interesting video. It looked like your RTL was doing exactly as it should, and looked like it had reached a point immediately over your launch point. I’m guessing it was just starting its descent when it glitched and took off. You can clearly see the lurch.

So I thought, “Ah hah, mechanical failure!”. But looking at your roll versus desired roll and pitch versus desired pitch showed your quad was doing exactly what your Pixhawk was asking it to do. The mystery remains (to me, anyway) as to why the Pixhawk was requesting the actions it did. And then there’s still the letting you arm with crappy GPS thing.

As stated by others, there was no problem with your setup - your position and velocity tracked the GPS reference quite closely, and the GPS was fine. The problem was that the navigation demand caused the target point to move. The radio failsafe was never resolved - all RC channel inputs stayed at 900 throughout the crash.

Still looking at the cause of the crash, but my current theory is that the extreme RC inputs caused the landing repositioning to move the target position. I still need to check on that.
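As a rough illustration of why an RC value of 900 counts as an "extreme" input, here is a sketch of a typical PWM-to-stick mapping. The calibration endpoints (1000, 1500, 2000) and the function name `pwm_to_stick` are assumptions for the example, not values taken from the log or from the firmware. Whether the firmware actually fed such values into the landing repositioning is the theory being checked, not an established fact.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical mapping from raw PWM (microseconds) to a normalized stick value in [-1, 1].
// The default endpoints are assumed calibration values, not read from this aircraft.
float pwm_to_stick(int pwm, int pwm_min = 1000, int pwm_trim = 1500, int pwm_max = 2000)
{
    assert(pwm_min < pwm_trim && pwm_trim < pwm_max);

    float v;
    if (pwm >= pwm_trim) {
        v = static_cast<float>(pwm - pwm_trim) / static_cast<float>(pwm_max - pwm_trim);
    } else {
        v = static_cast<float>(pwm - pwm_trim) / static_cast<float>(pwm_trim - pwm_min);
    }
    // Anything beyond the calibrated range saturates at full deflection.
    return std::max(-1.0f, std::min(1.0f, v));
}
```

Under these assumed endpoints, a channel stuck at 900 sits below the calibrated minimum and reads as full negative deflection, the same as a stick held hard against its stop.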

I had the same thing happen to me. Thankfully a better outcome (I regained control and brought it back) but in general the exact same thing, which I documented here:

http://diydrones.com/forum/topics/pixhawk-bug-rtl-flyaway-problem-v3-1-2?xg_source=activity

The TL;DR version is that I did a number of RTLs, Loiters, Auto and Guided modes - flawless! Everything worked perfectly until I did a radio failsafe. The craft returned to my RTL spot, as it had previously a few times - but then picked a “random” direction and took off in a straight line:


Started RTL at the top right. “Home” is the point at the corner, regained control at the top left.

I had done several RTLs that day via my mode switch and they worked exactly as expected, including at least one in that log file, I believe. I have done many RTLs since with the mode switch, and they also have performed exactly as expected.

“Dom”, on the DIYDrones report page, had the same problem. He added the following:

Someone else reported the same problem here:

http://ardupilot.com/forum/viewtopic.php?f=25&t=7484

My logs are on the DIYdrones page, or here:

https://dl.dropboxusercontent.com/u/50675/pixhawk_flyaway/pixhawk_failsafe_flyaway.BIN
https://dl.dropboxusercontent.com/u/50675/pixhawk_flyaway/pixhawk_failsafe_flyaway.tlog
https://dl.dropboxusercontent.com/u/50675/pixhawk_flyaway/pixhawk_failsafe_flyaway.kmz

Any insight would be very helpful.

Looking at my KML, this video is eerily similar to what happened to me: It came back from the upper right, took off to the upper left. There was a sudden jerk, then it took off. Not as fast, but there could be any number of reasons for that (speed settings, most likely).

vimeo.com/93915783

Yes I had my nav speed set up to 15m/s.

I’m reading in Github:
ardupilot/ArduCopter/control_rtl.pde @ line 258:

[code]
static void rtl_descent_run()
{
    // if not auto armed set throttle to zero and exit immediately
    if(!ap.auto_armed || !inertial_nav.position_ok()) {
        attitude_control.init_targets();
        attitude_control.set_throttle_out(0, false);
        // set target to current position
        wp_nav.init_loiter_target();
        return;
    }

    // process pilot's input
    float target_yaw_rate = 0;
    if (!failsafe.radio) {
        // apply SIMPLE mode transform to pilot inputs
        update_simple_mode();

        // process pilot's roll and pitch input
        wp_nav.set_pilot_desired_acceleration(g.rc_1.control_in, g.rc_2.control_in);

        // get pilot's desired yaw rate
        target_yaw_rate = get_pilot_desired_yaw_rate(g.rc_4.control_in);
    } else {
        // clear out pilot desired acceleration in case radio failsafe event occurs while descending
        wp_nav.clear_pilot_desired_acceleration();
    }

    // run loiter controller
    wp_nav.update_loiter();

    // call z-axis position controller
    pos_control.set_alt_target_with_slew(g.rtl_alt_final, G_Dt);
    pos_control.update_z_controller();

    // roll & pitch from waypoint controller, yaw rate from pilot
    attitude_control.angle_ef_roll_pitch_rate_ef_yaw(wp_nav.get_roll(), wp_nav.get_pitch(), target_yaw_rate);

    // check if we've reached within 20cm of final altitude
    rtl_state_complete = fabs(g.rtl_alt_final - inertial_nav.get_altitude()) < 20.0f;
}
[/code]

So we see that if we are NOT in radio failsafe, we read the RC inputs; otherwise we clear the pilot/RC inputs. This is too big of a project for me to wrap my head around right away, and my C++ understanding is limited. However I’m wondering if the state of failsafe.radio could be changing after the initial failsafe occurred. I assume it really did occur because the mode change in the log tells me that, but where else might it be changed back, and would the logs of the mode reflect that?
Also, am I correct in assuming that we are executing this code when RTL from a failsafe?
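To show why that gate matters, here is a toy model of the idea, not the real controller. The names `LoiterTarget`, `step_target` and `simulate_target` are hypothetical, and the model simply moves the target at stick times maximum speed, whereas the quoted code works with a desired acceleration. It only illustrates that with the failsafe flag set the target holds still, while the same saturated stick values with the flag cleared drag the target away in a straight line.

```cpp
#include <cassert>

// Toy loiter target, in metres relative to the point where the descent started.
struct LoiterTarget {
    float x_m = 0.0f;
    float y_m = 0.0f;
};

// One time step: pilot sticks (each in [-1, 1]) push the target unless radio failsafe is active.
// Simplified assumption: target velocity = stick * max_speed_ms.
void step_target(LoiterTarget& t, bool radio_failsafe,
                 float roll_stick, float pitch_stick,
                 float max_speed_ms, float dt_s)
{
    assert(dt_s > 0.0f);
    if (radio_failsafe) {
        // mirror the "clear pilot desired acceleration" branch: ignore the sticks
        roll_stick = 0.0f;
        pitch_stick = 0.0f;
    }
    t.x_m += pitch_stick * max_speed_ms * dt_s;
    t.y_m += roll_stick * max_speed_ms * dt_s;
}

// Run the toy model for a number of steps and return the final target position.
LoiterTarget simulate_target(bool radio_failsafe, float roll_stick, float pitch_stick,
                             float max_speed_ms, int steps, float dt_s)
{
    LoiterTarget t;
    for (int i = 0; i < steps; ++i) {
        step_target(t, radio_failsafe, roll_stick, pitch_stick, max_speed_ms, dt_s);
    }
    return t;
}
```

For example, `simulate_target(false, -1.0f, -1.0f, 15.0f, 70, 0.1f)` moves the target roughly 105 m along each axis in 7 seconds, while the same call with the failsafe flag set to `true` leaves it at the origin. That is only what this toy model does. Whether the flag was actually clear during the descent is exactly the open question.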

troystrum, if you look at your log you’ll see that around the time you had your “flyaway” in your RTL mode, you had a series of GPS glitches (In fact, they started just before you shut off your radio). Your HDOP spiked repeatedly to 3.5 for very brief periods. See the attached plot, where the red line is your HDOP and the green line is the number of sats. Now an HDOP of 3.5 isn’t horrendous, but it’s not good. Normally I wouldn’t think it would cause a flyaway event, but it’s possible. So in your case there’s a possible reason for the weirdness. Kf6bbl’s situation appeared completely normal until it took off.

Hi, this is ‘Dom’ from the other thread (it was early morning and I was half asleep when I chose fluffyflyer, now I somewhat regret it, like getting a drunk tattoo).

My log is attached to the diydrones comment:
diydrones.com/xn/detail/705844:Comment:1633689

It appears to be completely reproducible and I’m happy to reproduce if you want for testing or further diagnosis - I have large fields with soft long grass and I’m now a dab hand at crashing without doing too much damage - the long grass is amazing at protecting the frame and props from damage even at fairly high speed crashes.

Given that this issue has been reproduced, I’m glad to give it another shot and confirm that I’m able to reproduce it again with my hardware. (Perhaps I’ll lower my nav speed to lessen the panic!)

I’ll stress again that this setup has given me exactly ZERO problems flying autonomously, aside from this one unexpected behavior during a radio failsafe RTL. I’ve likely had about 4 hours “in the air” using combinations of Loiter, Guided and Auto, and never has there been any odd behaviors. It’s performed as expected even under less than ideal wind conditions.