Radio failsafe with CRSF with Yaapu Telemetry with FrSky Taranis; log analysis help requested

CBwintertime · April 27, 2021, 8:09pm

My flight this morning had multiple problems. My ability to analyze logs is “novice” at best so I’m looking for some help. I’m experiencing radio failsafes at a distance of about 200m, line of sight, with a TBS crossfire setup (not supposed to be happening). The thing is I can’t tell if I have an actual radio problem or whether AP thinks there’s a radio problem.

Here’s my log from the FC flashcard

I’m using a TBS crossfire (standard) transmitter on a FrSky taranis, and I’ve got the nano receiver (not diversity). I’ve got the transmitter power set to 500mW max, dynamic. I’ve got the antennas connected properly (so far as I can tell–suffice to say I’ve paid attention to them). I’ve got the “immortal T” mounted on one of the quad arms, perpendicular to the arm.

First, ignore the crash at the end caused by me stupidly ignoring the battery dying.

Take a gander at the first RTL at 2:39. I see a radio RSSI (RAD.RSSI) between 85 and 95, which shouldn’t be a problem, right? Nevertheless I’m getting a radio failsafe that is triggering an RTL. At first I thought that perhaps the TBS system is losing connection suddenly and outputting zeros and that’s triggering the AP to call failsafe, but looking at RCIN.C1-4, they’re all continuous without dropping out. Any ideas? thanks in advance

Allister · April 27, 2021, 9:02pm

Oh man, now you have me curious. I’ve had one of those happen to me too recently. I figured it was just “one of those things” and didn’t think any more of it. But now… Similar setup. CRSF Micro TXv2 on a TX16s. Nano RX. AC 4.1.dev.

Are you running the Yaapu telemetry? Are you running CRSF and MAVLink?

I had a late frame rate error followed by a radio fail safe. (RTL was triggered). There was no error or Telemetry warning on the radio. I was in an open field maybe 50m away, but I’d been flying much further just before that without issue.

Sorry I can’t offer any solution but if this turns out to be an issue I’ll post my log to help troubleshoot.

CBwintertime · April 27, 2021, 9:28pm

yup, running yaapu telemetry, CRSF and MAVLink, AC4.1.0-dev

xfacta · April 27, 2021, 9:31pm

You’ve almost constantly got the radio late frame and failsafe messages - I’m not sure how you’d fix that but obviously it’s the main thing to work on.

Your Y axis vibrations are not ideal, so something is rubbing or pulling on the flight controller in the Y axis.
Motor PWM outputs are very low, indicating the craft is overpowered. Maybe add a bit of dummy weight to ensure stability. Other than that attitude control looks pretty good, just needs an Autotune at some point.

I would definitely set up all these parameters:
BATT_FS_CRT_ACT,1
BATT_FS_LOW_ACT,2
BATT_ARM_VOLT,22.10
BATT_CRT_VOLT,21.00
BATT_LOW_VOLT,21.60
MOT_BAT_VOLT_MAX,25.20
MOT_BAT_VOLT_MIN,19.80
FENCE_ACTION,3
FENCE_ALT_MAX,50 - Adjust for safety or local laws
FENCE_ENABLE,1 - Wait for valid GPS 3D fix
FENCE_RADIUS,100 - Adjust for safety or local laws
FENCE_TYPE,3 - Radius + Altitude

I think those battery settings will suit you, use the MissionPlanner Alt A keypress/plugin to check.
It’s a mistake to think “I can set the battery failsafe stuff later… I’ll just watch the voltage myself”.
That Fence will make you wait for a suitable GPS fix before being able to arm in ANY flight mode. It means RTL will have a good home position to return to when it needs it most.

Edit: you can set up the BATT2 ESC telem settings too if you like, but I find the normal powerbrick/voltage divider is better because you can calibrate it more accurately at the low voltages you’d expect to see for your particular battery

CBwintertime · April 28, 2021, 12:00am

thanks for the advice. I hadn’t noticed the y-vibrations, now assessing those and tuning properly is tomorrow’s task. Still don’t understand the radio failsafe issues, or know how to proceed with that.

This is unrelated to the radio failsafe, but on that flight I actually screwed up and used a 6S that looks identical to my 4S. I was going by the power consumption rather than voltage, and the power consumption was way off (by my calculations I was only 70% consumed). But I’m going to incorporate your fence and battery FS options for my 4S batteries. Is there some failsafe that will prevent arming if I put in a 6S instead of a 4S? I modified your param suggestions for 4S, this is what I’ve got now:
BATT_ARM_VOLT, 14.7
BATT_CRT_VOLT, 14
BATT_LOW_VOLT, 14.4
MOT_BAT_VOLT_MAX, 16.8
MOT_BAT_VOLT_MIN, 13.2
BATT_FS_CRT_ACT, 1
BATT_FS_LOW_ACT, 2
BATT_FS_VOLTSRC, 1
FENCE_ACTION, 3
FENCE_ALT_MAX, 120
FENCE_ENABLE, 1
FENCE_RADIUS, 400
FENCE_TYPE,3
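
As a sanity check on the 4S voltages above, here is a minimal sketch that rescales the 6S suggestions by cell count. It is not ArduPilot code: `BatteryThresholds` and `scale_thresholds` are made-up names for illustration, and it assumes the thresholds scale linearly with the number of cells.

```cpp
#include <cassert>
#include <cstdio>

// Voltage thresholds for one pack, in volts. The field comments name the
// parameter each value would be typed into.
struct BatteryThresholds {
    float arm_volt;      // BATT_ARM_VOLT
    float crt_volt;      // BATT_CRT_VOLT
    float low_volt;      // BATT_LOW_VOLT
    float mot_volt_max;  // MOT_BAT_VOLT_MAX
    float mot_volt_min;  // MOT_BAT_VOLT_MIN
};

// Hypothetical helper: rescale every threshold by dst_cells / src_cells,
// assuming the per-cell voltage targets stay the same.
BatteryThresholds scale_thresholds(const BatteryThresholds &src,
                                   int src_cells, int dst_cells) {
    assert(src_cells > 0 && dst_cells > 0);
    const float k = static_cast<float>(dst_cells) / static_cast<float>(src_cells);
    return BatteryThresholds{
        src.arm_volt * k,
        src.crt_volt * k,
        src.low_volt * k,
        src.mot_volt_max * k,
        src.mot_volt_min * k,
    };
}

// Print the thresholds in the NAME,value style used in this thread.
void print_thresholds(const BatteryThresholds &t) {
    std::printf("BATT_ARM_VOLT,%.2f\n", t.arm_volt);
    std::printf("BATT_CRT_VOLT,%.2f\n", t.crt_volt);
    std::printf("BATT_LOW_VOLT,%.2f\n", t.low_volt);
    std::printf("MOT_BAT_VOLT_MAX,%.2f\n", t.mot_volt_max);
    std::printf("MOT_BAT_VOLT_MIN,%.2f\n", t.mot_volt_min);
}
```

Feeding in the 6S values (22.10, 21.00, 21.60, 25.20, 19.80) with `src_cells = 6` and `dst_cells = 4` gives about 14.73, 14.0, 14.4, 16.8 and 13.2, which matches the 4S list above apart from rounding the arming voltage down to 14.7.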

Allister · April 28, 2021, 1:29am

I’ve made mention of this thread on the CRSF Passthrough thread. See if anything comes up there.

UnknownPilot · April 28, 2021, 5:09am

I had close range failsafes last autumn on a quad with a similar setup (Crossfire Lite Module, QX7, Micro, then Nano Diversity Receiver, CRSF/Yaapu), but I thought I had those tracked down to first a broken receiver (Micro) and then to interference from a RunCam 4k mounted on top. The weird thing in the latter case though was that RSSI (LQ) never went down before the FS. Either way, I’m very curious to see what comes out of this thread.

andyp1per · April 28, 2021, 6:30am

FWIW this should be in the 4.1 topic as CRSF is only available in 4.1. It’s also a beta issue if there is an issue.

CBwintertime · April 29, 2021, 3:36pm

thanks, I fixed the topic to 4.1

CBwintertime · April 29, 2021, 5:26pm

okay, I have more data and I’m even more confused. I did a nice clean flight this morning with one radio failsafe for easier analysis (if only).

The onboard flash log RSSI information seems disconnected from anything sensible. I am pasting below both the onboard flash log and my taranis telemetry log (which has more sensible RSSI quantities).

The link quality logged by the Taranis never goes lower than 80, on either the TQly or RQly (note it’s only logged at 1Hz, so a spike would be missed). The RSSI.RXRSSI value logged to FC flash goes from 99 to 0.34 and the RAD.RSSI goes between 170 and 35. What do those numbers mean? Not only do the units seem incorrect, but the values also don’t seem to correlate with any of the curves recorded by the Taranis. I’m very curious where the FC is getting those numbers. I have my parameter set to RSSI_TYPE, 3 which is what I thought would successfully interpret the CRSF RSSI data from the telemetry stream to the FC (without needing it explicitly sent over a separate RC channel).

Aside: for correlating the two different logs I used the yaw data, as it’s clear where the failsafe happens (the yaw goes linear during the RTL). Is there a time-of-day stamp recorded anywhere on the onboard log that could be better used for this purpose? I had a hell of a time trying to find the same section of data on the Taranis log as the FC log.

Even without understanding the RSSI values that were logged, I still can’t find a reason for the failsafe that happens. The one thing I do see is that the TBS tx jumped up to max power output (500mW) right at the failsafe. It did this despite not seeing any problems with LQ or RSSI. It switched between mode 2 and 1 a bit which is fine, but never went to 0. So why did the TX decide it needed 500mW all of a sudden? And why wasn’t that reflected anywhere in the LQ?

I’m so confused. Am I losing all connection for a few seconds when it switches from mode 2 to mode 1? There’s a chicken/egg problem here, where my telemetry coming to the taranis is not going to be accurate if the connection is dropped, so I don’t know which data if any to trust here.

To reiterate, my RCin signals never “cut” in a way to trigger the failsafe (that I see). So I still don’t know whether the TBS is triggering the FS on its end, or the autopilot is triggering it on the drone side. Could the power to the RX be glitching? If the RX lost power, the autopilot would register a FS, possibly when the TBS tx wouldn’t notice it, right?

my FC log

logs from the CRSF telemetry back to the Taranis:
yaw (for lining up the timing)

TPWR

RQly with sensible values

RFMD (crossfire mode, going from 2 to 1 but never 0)

andyp1per · April 29, 2021, 6:05pm

The RSSI that AP uses for CRSF is a fudge on TBS’ recommendation. I didn’t use LQ because that appears to be quite binary (goes from 100% to very little very rapidly) so instead I use dBm values to approximate RSSI.

CBwintertime · April 29, 2021, 6:09pm

I attached a plot with the RSSI in dBm, I don’t see how that relates to the FC log RSSI, can you help me understand? The TBS ramped up dBm as distance increased, but the FC log shows an inverse behavior.

Perhaps I should go ahead and pipe LQ over a separate channel (though I’m using all 8 at the moment, don’t have a square to spare).

andyp1per · April 29, 2021, 8:16pm

This is the calculation

    uint8_t rssi_dbm;
    if (link->active_antenna == 0) {
        rssi_dbm = link->uplink_rssi_ant1;
    } else {
        rssi_dbm = link->uplink_rssi_ant2;
    }
    // AP rssi: -1 for unknown, 0 for no link, 255 for maximum link
    if (rssi_dbm < 50) {
        _link_status.rssi = 255;
    } else if (rssi_dbm > 120) {
        _link_status.rssi = 0;
    } else {
        // this is an approximation recommended by Remo from TBS
        _link_status.rssi = int16_t(roundf((1.0f - (rssi_dbm - 50.0f) / 70.0f) * 255.0f));
    }
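
To make the direction of the mapping easier to see, here is a standalone sketch of the same arithmetic as a plain function, plus its inverse. The function names are made up for illustration, and the sketch assumes `rssi_dbm` is the magnitude of a negative dBm reading (so 85 stands for -85 dBm). A larger magnitude means a weaker signal and therefore a smaller AP value, which would explain a logged RSSI that falls while the dBm number on the radio rises.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Same piecewise-linear mapping as the snippet above, written as a free
// function. Input: dBm magnitude (assumed to mean -rssi_dbm dBm).
// Output: 0 (no link) to 255 (maximum link).
int16_t crsf_dbm_to_ap_rssi(uint8_t rssi_dbm) {
    if (rssi_dbm < 50) {
        return 255;   // stronger than -50 dBm: clamp to maximum
    }
    if (rssi_dbm > 120) {
        return 0;     // weaker than -120 dBm: treat as no link
    }
    return int16_t(std::round((1.0f - (rssi_dbm - 50.0f) / 70.0f) * 255.0f));
}

// Hypothetical inverse for reading values back: 0..255 to dBm magnitude.
// Only meaningful inside the linear region (50..120).
float ap_rssi_to_dbm_magnitude(int16_t rssi) {
    assert(rssi >= 0 && rssi <= 255);
    return 50.0f + 70.0f * (1.0f - rssi / 255.0f);
}
```

For example, a magnitude of 85 maps to 128, half scale. If the logged values of 170 and 35 were on this 0..255 scale (an unverified assumption), the inverse would put them at roughly -73 dBm and -110 dBm.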

rmackay9 · April 29, 2021, 11:31pm

@CBwintertime,

The AP RCIN logs probably won’t show the inputs dropping to zero when the transmitter & receiver lose connection. The values shown in RCIN are always the last known values from the receiver. Different receivers signal a failsafe in different ways. For example Futaba receivers will often drop channel 3 to a very low value while most others will simply stop sending signals (indicated by the “Late Frame” or “RADIO-2” errors in the AP logs). AP interprets either these low values or late frames as a failsafe.
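
The two detection paths described above can be sketched as follows. This is an illustration only, not ArduPilot's implementation: the struct, the function and both threshold constants are hypothetical values chosen for the example.

```cpp
#include <cstdint>

// Illustrative thresholds only; not taken from the ArduPilot source.
constexpr uint32_t FRAME_TIMEOUT_MS = 500;  // no new frame for this long counts as "late"
constexpr uint16_t THROTTLE_FS_PWM  = 975;  // channel 3 below this counts as "low"

struct RcState {
    uint32_t last_frame_ms;  // time the last valid RC frame arrived
    uint16_t throttle_pwm;   // last known channel 3 value (held while frames are missing)
};

enum class RcHealth { Ok, LateFrame, LowThrottle };

// Classify the RC link at time now_ms. A receiver that simply stops sending
// is caught by the timeout; one that signals failsafe by pulling channel 3
// low is caught by the throttle check.
RcHealth check_rc(const RcState &rc, uint32_t now_ms) {
    if (now_ms - rc.last_frame_ms > FRAME_TIMEOUT_MS) {
        return RcHealth::LateFrame;
    }
    if (rc.throttle_pwm < THROTTLE_FS_PWM) {
        return RcHealth::LowThrottle;
    }
    return RcHealth::Ok;
}
```

Note that in the late-frame case `throttle_pwm` still holds its last value, which is why a plot of the channel values can look continuous across a failsafe.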

As you mentioned early in the thread, to get to the bottom of this, it’s important to know if it’s an AP problem incorrectly recognising an RC failsafe or a problem in the Transmitter/Receiver.

The RCIN values seem to freeze during the failsafe event and we see these “late frame” (“RADIO-2”) messages in the logs, so AP doesn’t seem to be receiving new RC data.

It is possible that AP is dropping RC input packets but if this were the case then I think the problem would happen at any time, unrelated to distance from home. I suspect you’ve already concluded that it is distance related but if not, flying a few batteries very close to home to confirm it doesn’t failsafe and then doing a couple of longer flights will probably make this clear.

So I think we probably need to conclude that the issue is in the Receiver/Transmitter and not related to the AP 4.1 software but I’m happy to be corrected because of course we want to get to the bottom of any problems with 4.1.

Thanks again for testing Copter-4.1 and obviously the discussion should continue if others have ideas on resolving the issue.

Allister · April 30, 2021, 1:31am

Just because, here’s my log with the fail safe. Mine only occurred once.

https://cp.sync.com/dl/a2d831a90/yz43n8ky-6it44dgf-z7p6rp3j-iqcebu4m

(Please, no comments about the vibrations… I know…)

CBwintertime · April 30, 2021, 1:31am

Ok, I swapped out the nano receiver, with a nano diversity this time. Set it up the same way. Also replaced the antennas. Flew it again, same failsafe issue. So evidence suggests it’s not the receiver. It could still be the transmitter I suppose, though generally I would expect higher reliability from the transmitter than receiver.

I have not yet experienced the failsafe within 100 ft of home, despite flying in circles for a while, though I might still have been simply bucking the odds. Tomorrow I can fly a few packs around in close circles to see, but it has still felt somewhat correlated with distance. After replacing the receiver, I am leaning towards the issue being AP 4.1 (CRSF telemetry-related perhaps) but we both agree that doesn’t make much sense if it’s distance-related. Except, come to think of it, perhaps it’s related to the switch to mode 1? Does mode 1 make a difference in the AP4.1 CRSF implementation somehow?

Could this be related to me flying this on a Kakute V1.5 with only 1MB flash? Is there anything about that custom build you did for me that would affect it?

(unrelated I assume, I’m getting intermittently one motor failing to start–the same front left one each time–and if I re-power it seems to remedy it, I’m guessing there’s some ESC issue, but I might as well mention that the radio failsafe is not the only issue I have at the moment)

rmackay9 · April 30, 2021, 6:46am

I’ll leave @andyp1per to respond regarding CRSF mode 1 because I know exceptionally little about CRSF.

I wasn’t aware that you were flying a custom build but from peeking at the log again I see indeed it is “Copter 4.1.0-DEV”. If possible it would be helpful to use the beta instead (Copter-4.1.0-beta1). If one of the other devs provided this binary as part of ongoing testing then that’s cool of course but in general it is best to stick with a known version. The “latest” (aka “master”, aka “dev”) versions actually change multiple times per day and flying bleeding edge makes it difficult for the developers to provide support.

andyp1per · April 30, 2021, 7:16am

It could be related to the switch to mode 1. I think CRSF allows you to set the max transmit power - if you turn it down can you trigger mode 1 earlier and if so what happens? We did successfully test the mode 2 to mode 1 transition so maybe there is some kind of edge case.

I’m guessing you are using DShot with the motor not turning issue - it would be good to validate this on 4.1 beta as there have been a number of fixes in this area.

CBwintertime · April 30, 2021, 4:57pm

Ok I flew quite a lot this morning, accumulated a dozen or more failsafe events, and I can now confidently state that
a) the failsafe is not related to range; twice I was getting failsafes while sitting on the landing mat, from 15 ft away
b) it correlates with the mode change from 2 to 1. This morning’s evidence suggests that the 2->1 mode change is a necessary but not sufficient condition for failsafe occurrence (a failsafe was always associated with a mode change, but a mode change didn’t always produce a failsafe).
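
One way to check the "necessary but not sufficient" claim against log data, rather than by eye, is to count how events pair up within a time window. The sketch below is not a log parser: it assumes the failsafe and mode-change timestamps have already been extracted and put on a common clock, and all names in it are made up for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct CorrelationResult {
    std::size_t failsafes_with_change;     // failsafes with a mode change inside the window
    std::size_t failsafes_without_change;  // failsafes with no mode change nearby
    std::size_t changes_without_failsafe;  // mode changes with no failsafe nearby
};

// True if any event lies within window_s seconds of time t.
static bool any_within(double t, const std::vector<double> &events, double window_s) {
    for (double e : events) {
        if (std::fabs(e - t) <= window_s) {
            return true;
        }
    }
    return false;
}

// Pair up failsafe and mode-change timestamps (both in seconds).
CorrelationResult correlate(const std::vector<double> &failsafe_s,
                            const std::vector<double> &mode_change_s,
                            double window_s) {
    CorrelationResult r{0, 0, 0};
    for (double f : failsafe_s) {
        if (any_within(f, mode_change_s, window_s)) {
            ++r.failsafes_with_change;
        } else {
            ++r.failsafes_without_change;
        }
    }
    for (double m : mode_change_s) {
        if (!any_within(m, failsafe_s, window_s)) {
            ++r.changes_without_failsafe;
        }
    }
    return r;
}
```

"Necessary" corresponds to `failsafes_without_change == 0`, and "not sufficient" to `changes_without_failsafe > 0`. The window width is a judgment call, especially with telemetry logged at 1Hz.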

@rmackay9 I will try to load Copter-4.1.0-beta1 but I’m not sure if I can pull it off. I don’t know how to build it myself. Last time around I tried to install a pre-built and couldn’t make it work with my 1MB flash Kakute, so someone (@yaapu or even you perhaps…) compiled it for me and provided a link. I’ll try to load the latest auto-built 4.1.0-beta1 today. I really need to learn how to do a custom build myself rather than rely on the auto-built options but I haven’t learned that yet.

I experienced the motor issue again, once, this morning. Re-powering resolved it again. I’ll do my best to get onto 4.1.0-beta1 so we’re on the same page.

andyp1per · April 30, 2021, 6:52pm

Can you disable yaapu telemetry just to eliminate that from the analysis?

Is your RX connected to a DMA enabled serial port?