Crash following 'RCInput: decoding SBUS'

We just had a copter fall out of the sky… It was flying a waypoint mission just fine, controlled by our custom GCS, then we had this:

13 Feb 2020;15:36:44.503 Statustext{severity=EnumValue{value=7, entry=MAV_SEVERITY_DEBUG}, text=RCInput: decoding SBUS}
...
13 Feb 2020;15:36:44.512 ParamValue{paramId=MOT_THST_HOVER, paramValue=0.31309304, paramType=EnumValue{value=9, entry=MAV_PARAM_TYPE_REAL32}, paramCount=1033, paramIndex=65535}
...
13 Feb 2020;15:36:44.599 ExtendedSysState{vtolState=EnumValue{value=3, entry=MAV_VTOL_STATE_MC}, landedState=EnumValue{value=1, entry=MAV_LANDED_STATE_ON_GROUND}}

After which it turned off motors and crashed.


Please post some logs somewhere.

Obvious question first, though: did someone turn a transmitter on just before the thing came down?

Hi Peter,

You can find the crash flight log here:

And, for reference, a log of a successful flight (a few minutes earlier) can be found here:

Both flights were automatic. We have an on-board mission computer that controls the flight logic. However, we also have a receiver on board that can be used to take manual control during flight if needed (although it wasn’t actively used in either of these flights).

Looking at the crash log it seems as though the receiver only got loaded/recognized in the middle of the flight, which led to some init sequence by the flight controller, which eventually shut down the motors in mid-air. This can either be a physical problem (the connector might have been partially pushed?), or possibly a software bug that only loaded the receiver mid-flight?

And if this is the case, is there any pre-arm check we can/should use to verify the receiver is loaded prior to takeoff? Or is there another way to prevent a receiver loading mid-flight from shutting down the motors?

** I’ll just clarify that the controller was sitting on a table and was not touched by anyone during the entire flight. But we also didn’t verify that it got connected prior to takeoff…

Many thanks!

Looking at the crash log it seems as though the receiver only got loaded/recognized in the middle of the flight, which led to some init sequence by the

This is, indeed, the cause of the crash. You have arming-on-a-switch set
on channel 7. That switch was in the low position. When the autopilot
started to process the SBUS input it acted on the switch.

flight controller, which eventually shut down the motors in mid-air. This can either be a physical problem (the connector might have been partially
pushed?), or possibly a software bug that only loaded the receiver mid-flight?

Yes, that’s the interesting question.

I think it is quite safe to rule out anything being “loaded” - nothing
like that happens in this bit of our code.

It is possible that there’s a software bug which didn’t allow valid
input to be decoded correctly. I would rate the chances of this
extremely unlikely.

An electrical problem is possible. So is an airside protocol incompatibility.

Lastly - to try to understand the late connection of the receiver - some
radios are smart enough to keep their own logs of what’s been going on -
you might check those logs if you are lucky enough to have such a
receiver.

Check to see if you can get data into the RSSI logs, too. I never have,
but…

And if this is the case, is there any pre-arm check we can/should use to verify the receiver is loaded prior to takeoff? Or, other way to prevent
receiver-loading in mid flight from shutting down the motors?

Do you have a log of the mavlink messages from your companion computer?
I have to assume that’s what armed the aircraft (given the log I’m looking
at).

What you should have received was an arming failure like this:

APM: PreArm: Throttle below Failsafe

Did you ensure that RC failsafe was working on this vehicle?

FS_THR_ENABLE is not set, so we rely on the RC receiver setting a bit in
the protocol to indicate that it isn’t getting data. Sadly, several
receivers do not set that bit.
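
For reference, the bit in question lives in the flags byte of each SBUS frame. Below is a minimal sketch of reading it, assuming the commonly documented 25-byte frame layout (start byte 0x0F, 22 bytes of packed channel data, one flags byte, one end byte). The function names and the dummy frames are made up for illustration, and this is not ArduPilot's decoder.

```python
SBUS_FRAME_LEN = 25
SBUS_START_BYTE = 0x0F
FLAG_FRAME_LOST = 0x04  # bit 2 of the flags byte (assumed layout)
FLAG_FAILSAFE = 0x08    # bit 3 of the flags byte (assumed layout)


def sbus_flags(frame: bytes) -> dict:
    """Return the frame-lost and failsafe flags of one SBUS frame."""
    if len(frame) != SBUS_FRAME_LEN or frame[0] != SBUS_START_BYTE:
        raise ValueError("not a 25-byte SBUS frame")
    flags = frame[23]  # second-to-last byte carries the status flags
    return {
        "frame_lost": bool(flags & FLAG_FRAME_LOST),
        "failsafe": bool(flags & FLAG_FAILSAFE),
    }


def make_frame(flags: int) -> bytes:
    """Build a dummy frame with all-zero channel data (illustration only)."""
    return bytes([SBUS_START_BYTE]) + bytes(22) + bytes([flags, 0x00])


healthy = sbus_flags(make_frame(0x00))
lost = sbus_flags(make_frame(FLAG_FAILSAFE | FLAG_FRAME_LOST))
print(healthy)
print(lost)
```

A receiver that never sets the failsafe bit cannot be told apart from a healthy link at this level, which is why relying on that bit alone is fragile.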

** I’ll just clarify that the controller was sitting on a table and was not touched by anyone during the entire flight. But we also didn’t verify that it
got connected prior to takeoff…

The logs appear to indicate there was no connectivity until the fateful
event. No RSSI message present in log, so that doesn’t tell us anything.

The RC unit was present for the rather-more-successful previous flight.

Note: the firmware you are flying on that aircraft is known-flawed.
Please consider moving to a more recent point release (or check the
release notes and assure yourself you are not affected).

Peter

Peter,

The companion computer did arm the aircraft:

13 Feb 2020;15:35:35.382	Switching to GUIDED mode to arm
13 Feb 2020;15:35:35.384	W: SetMode{targetSystem=0, baseMode=EnumValue{value=1, entry=null}, customMode=4}
13 Feb 2020;15:35:35.397	R: CommandAck{command=EnumValue{value=11, entry=null}, result=EnumValue{value=0, entry=MAV_RESULT_ACCEPTED}, progress=0, resultParam2=0, targetSystem=0, targetComponent=0}
13 Feb 2020;15:35:35.401	W: CommandLong{targetSystem=0, targetComponent=0, command=EnumValue{value=400, entry=MAV_CMD_COMPONENT_ARM_DISARM}, confirmation=0, param1=1.0, param2=0.0, param3=0.0, param4=0.0, param5=0.0, param6=0.0, param7=0.0}
13 Feb 2020;15:35:35.403	R: Heartbeat{type=EnumValue{value=13, entry=MAV_TYPE_HEXAROTOR}, autopilot=EnumValue{value=3, entry=MAV_AUTOPILOT_ARDUPILOTMEGA}, baseMode=EnumValue{value=89, entry=null}, customMode=4, systemStatus=EnumValue{value=3, entry=MAV_STATE_STANDBY}, mavlinkVersion=3}
13 Feb 2020;15:35:35.406	W: SetMode{targetSystem=0, baseMode=EnumValue{value=1, entry=null}, customMode=3}
13 Feb 2020;15:35:35.418	R: CommandAck{command=EnumValue{value=400, entry=MAV_CMD_COMPONENT_ARM_DISARM}, result=EnumValue{value=0, entry=MAV_RESULT_ACCEPTED}, progress=0, resultParam2=0, targetSystem=0, targetComponent=0}
13 Feb 2020;15:35:35.421	R: GpsGlobalOrigin{latitude=321324684, longitude=348184534, altitude=30150, timeUsec=0}

But there was no PreArm: Throttle below Failsafe message.

Note that there was no message about disarming, but landedState was MAV_LANDED_STATE_ON_GROUND immediately following the RCInput. Is that consistent with your theory?

We will try and investigate why the RC wasn’t connected in the beginning and why it did connect when it did.

However, given that it did what it did, would you say that ArduPilot behaved correctly, or is it a bug? It seems to me that disarming during a flight like that is, at the very least, not desirable.

The companion computer did arm the aircraft:

Thanks for that. The relevant message in there isn’t a force-arm, so
that’s good. It means that the existing arming checks aren’t being
bypassed.

Any idea why you have excluded bits in your ARMING_CHECK? Those checks
are there for a reason - if there’s a use-case for legitimately disabling
them it would be nice to see if we can find an alternate solution.
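
As a rough illustration of what "excluded bits" means: ARMING_CHECK is a bitmask, and a check runs if either the "All" bit or its own bit is set. The sketch below decodes such a value. The bit values are as I recall them from the parameter documentation, so treat them as assumptions and verify them against your firmware version. The example values 1 and 30 are made up and are not taken from the log.

```python
# Bit values are assumptions recalled from the ARMING_CHECK parameter docs.
ARMING_CHECK_BITS = {
    1: "All",
    2: "Barometer",
    4: "Compass",
    8: "GPS lock",
    16: "INS",
    32: "Parameters",
    64: "RC channels",
    128: "Board voltage",
    256: "Battery level",
}
ALL_BIT = 1
RC_BIT = 64


def enabled_checks(arming_check: int) -> list:
    """List the checks selected by the bitmask; the All bit selects everything."""
    if arming_check & ALL_BIT:
        return ["All"]
    return [
        name
        for bit, name in ARMING_CHECK_BITS.items()
        if bit != ALL_BIT and arming_check & bit
    ]


def rc_check_enabled(arming_check: int) -> bool:
    """True if the RC channels check would run for this bitmask."""
    return bool(arming_check & (ALL_BIT | RC_BIT))


print(enabled_checks(1), rc_check_enabled(1))
print(enabled_checks(30), rc_check_enabled(30))
```

With the hypothetical value 30, the barometer, compass, GPS and INS checks run but the RC channels check does not.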

But there was no PreArm: Throttle below Failsafe message.

Right. That’s consistent with the throttle failsafe being disabled.

Note that there was no message about disarming, but landedState was MAV_LANDED_STATE_ON_GROUND immediately following the RCInput. Is that consistent with
your theory?

Yes, if you’re disarmed you’re considered landed. In a multicopter, if
you’re not on the ground when disarming you shortly will be…

We will try and investigate why the RC wasn’t connected in the beginning and why it did connect when it did.

However, given that it did what it did, would you say that ArduPilot behaved correctly, or is it a bug? It seems to me that disarming during a flight like
that is, at the very least, not desirable.

I really can’t say until you answer my question: “did you ensure that the
vehicle responded when you turned your RC transmitter off?” If you can,
with what remains, try it now. If you turn the RC
transmitter off - does the vehicle acknowledge that via its telemetry
link?

I think there is a case here for seeing if we could do better. Even
assuming my theory is correct - that the vehicle wasn’t configured for
detecting loss-of-RC - we should see if we can do better. This could have
been the other way around; it could have been on the ground, disarmed, and
the transmitter finally latches on with the arm/disarm switch in the “arm”
state - that might lead to inadvertent arming, which would be bad.

It is a bit hard to see how we could do better, however. If you flick an
“RTL” switch on the transmitter because the vehicle is way too far away,
and your RC is marginal - we want that vehicle to RTL as soon as you do
get any RC through to it.

Requiring a transition for a switch to take effect has the initialisation
problem; we don’t want people to have to flick everything on their
transmitter before flying their aircraft!

For us, that’s a world of difference.

We have a separate parachute system which the companion computer arms on take off and disarms on land. When the companion computer was told it was on the ground, it disarmed the parachute.

In this particular test flight we didn’t have the physical parachute installed, so it wouldn’t have helped anyway, but it’s worrying to know that we can receive on_ground when still in the air.

A point worth discussing. I’ll bring it up at the devcall.

Peter

I think there is a case here for seeing if we could do better. Even
assuming my theory is correct - that the vehicle wasn’t configured for
detecting loss-of-RC - we should see if we can do better. This could have
been the other way around; it could have been on the ground, disarmed, and
the transmitter finally latches on with the arm/disarm switch in the “arm”
state - that might lead to inadvertent arming, which would be bad.

There seems to be a mix of “edge-triggered” and “level-triggered” concepts here, which leads to confusing dependency on the order of events.
The “arm by companion computer” is edge-triggered, which overrides the position of the “level-triggered” RC switch, but only until the signal is lost and then regained, at which point this signal is what matters, without any active action being taken by the pilot. Surprising.
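
To make the ordering dependency concrete, here is a toy model of the two behaviours. It is a sketch under my own assumptions (event names, a single two-position switch, no debounce) and does not reproduce ArduPilot's actual switch handling.

```python
def run(events, require_transition):
    """Replay events and return the armed state after each one."""
    armed = False
    last_switch = None  # last switch position seen in a valid RC frame
    history = []
    for kind, value in events:
        if kind == "companion_arm":
            armed = True
        elif kind == "rc_switch":
            if require_transition:
                # Edge-triggered: act only when the position changes
                # between two valid frames; the first frame is recorded only.
                if last_switch is not None and value != last_switch:
                    armed = value == "high"
            else:
                # Level-triggered: the current switch position always wins.
                armed = value == "high"
            last_switch = value
        history.append(armed)
    return history


events = [
    ("companion_arm", None),  # companion computer arms over MAVLink
    ("rc_switch", "low"),     # receiver appears mid-flight, switch is low
    ("rc_switch", "low"),
]
level = run(events, require_transition=False)
edge = run(events, require_transition=True)
print(level)  # [True, False, False]
print(edge)   # [True, True, True]
```

The edge-triggered variant survives the late receiver, but it has the initialisation problem mentioned earlier in the thread: a switch that is already in the wanted position at first contact does nothing until it is flicked.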

One way to improve this is to move to tri-state logic. Each command source returns one of (ON, DONT_CARE_OR_NO_INFO, OFF). The sources are then combined in priority order: the highest priority source is checked first. If this source is DONT_CARE, move on to the next in the priority list.
This would provide determinism and clarity that would allow configuring the system in a way that makes sense for specific requirements.
Of course, it would be possible to misconfigure the system even with this approach - such as have the intermittently connected RC transmitting “OFF” state for the arming bit.
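
A minimal sketch of that priority resolution follows. The source names are hypothetical, and returning "no opinion" when every source abstains (so the caller keeps its current state) is my own assumption rather than part of the proposal.

```python
from enum import Enum


class Vote(Enum):
    ON = "on"
    OFF = "off"
    DONT_CARE = "dont_care"  # also used when the source has no information


def resolve(votes_by_priority):
    """Return (source, vote) of the highest-priority source with an opinion.

    votes_by_priority is a list of (source_name, Vote), highest priority first.
    If every source abstains, (None, Vote.DONT_CARE) is returned and the
    caller is expected to keep its current state (assumption).
    """
    for source, vote in votes_by_priority:
        if vote is not Vote.DONT_CARE:
            return source, vote
    return None, Vote.DONT_CARE


# RC link lost: the RC source abstains, so the companion's vote decides.
print(resolve([("rc", Vote.DONT_CARE), ("companion", Vote.ON)]))

# Misconfiguration case: an intermittent RC with higher priority sends OFF.
print(resolve([("rc", Vote.OFF), ("companion", Vote.ON)]))
```

The second call shows the misconfiguration case: the outcome is deterministic, but it is still a disarm.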

Not quite correct: when FS_THR_ENABLE=0, all RC failsafe is ignored, whether the input is SBUS or anything else.

I’ll be resurrecting https://github.com/ArduPilot/ardupilot/pull/10166

Note the last comment on that PR re: activating parachutes - it’s unrelated to the parachute issue mentioned in this PR.

We did discuss the “landed” state today. It’s hard to change in ArduPilot - and if we were to change it, it would start to be even more heuristic than it currently is!

Perhaps you could arm your parachute based on height-above-terrain or just relative-alt instead?
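
A sketch of what the relative-alt variant could look like on the companion computer. Relative altitude here means height above home (for example the relative_alt field of GLOBAL_POSITION_INT, converted from millimetres to metres), so no terrain data is needed. The class name and both thresholds are hypothetical and would need tuning for a real vehicle.

```python
from dataclasses import dataclass


@dataclass
class ParachuteGate:
    """Arm/disarm a parachute from relative altitude, with hysteresis."""

    arm_above_m: float = 10.0    # hypothetical threshold
    disarm_below_m: float = 2.0  # hypothetical threshold
    armed: bool = False

    def update(self, relative_alt_m: float) -> bool:
        """Feed one altitude sample and return the new armed state."""
        if not self.armed and relative_alt_m >= self.arm_above_m:
            self.armed = True
        elif self.armed and relative_alt_m <= self.disarm_below_m:
            self.armed = False
        return self.armed


gate = ParachuteGate()
states = [gate.update(alt) for alt in (0.0, 5.0, 12.0, 30.0, 8.0, 1.5)]
print(states)  # [False, False, True, True, True, False]
```

Because the gate ignores the reported landed state entirely, an in-air disarm would leave the parachute armed until the altitude itself drops below the lower threshold.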

We do not intend to have an RC in production, so this particular case isn’t critical. Are there other cases where we could encounter the landed_state_on_ground state when not actually on the ground?

That would require us to know the terrain, which we don’t.

I’ve found the bug and posted a quick fix here; I didn’t know about this issue. I think it’s a very old bug that affects all stable and development branches.
Bug report:


QuickFix:

@Jaaaky,

Thanks for that fix, hopefully it will go out today in a new beta, Copter-4.0.3-rc1.

From discussion with @peterbarker, we think that this user’s crash is not caused by the bug you’ve found. The “keywords” are certainly similar, in that both happen when “RCInput: decoding…” is printed, but we think this crash was caused by the RC receiver being turned on and the arming-on-a-switch being triggered (i.e. the transmitter switch was in the “disarm” position, so it took effect when the transmitter was turned on).

The evidence for this is that this user’s log shows the altitude falling to the ground :frowning: but if it was a watchdog reset the logs would have ended suddenly.