Our 30K drone crashed and we have no idea why, but we have the logs. Can somebody help?

Hi Everyone,

First of all, I have been reading this wonderful forum for years and have already learned a lot from the community. Thanks!
I created this topic because I have run out of options. Reading log files is new to me, so it would be very nice if somebody could help me understand what happened to our agricultural spraying drone.
We were flying in auto mode and suddenly the drone crashed. I have been trying to understand the log files, but the only thing I could work out is that the EKF had some kind of issue, so it switched to IMU2 right before the crash. Can anyone please help me understand?
Here is a link to the log file:

Thanks in advance,
Karoly

There was an odd glitch in the motor outputs that briefly dropped them to 0. When they recovered, the motors on the right side of the craft were commanded to max or close to it, the motors on the left side were dropped to stabilize the craft, and down it went. It would have been interesting to look at battery current at this time, but it’s not logged. Does what you witnessed make sense from this observation?

Hi Dave,

Thanks for your answer, your explanation is perfectly accurate. I also wanted to check the battery current; I'm not sure why it is not logged. Do you have any idea why this happened? We had an error message: EKF PRIMARY: OCCURED OR FAILED TO INITILIASE.
Also, can you tell me which log reader you use?

Hi Károly, from the timing in the log it looks like the EKF error was generated after it lost stabilization or when it crashed. Check out the attached error messages with respect to altitude. These graphs were generated with APM Planner 2.

I have to admit I have never seen an RC output go to zero. To minimum PWM yes, but frankly I’m not sure what 0 means. Logging didn’t stop, because other data was still being recorded during this event. I have looked at other logged data that might correlate, but I’m not seeing anything.
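
For anyone who wants to look for the same kind of glitch in their own logs, here is a minimal sketch in Python. It assumes the motor output samples have already been exported from the log into `(time_s, [pwm, ...])` tuples; the function name, the 800 µs floor, and the sample values are all made up for illustration and are not taken from this flight.

```python
from typing import List, Sequence, Tuple

# Assumed floor: anything below this is treated as "not a valid PWM command".
# Real minimum PWM values are normally far above 0, so 0 stands out clearly.
MIN_VALID_PWM = 800


def find_output_dropouts(
    samples: Sequence[Tuple[float, Sequence[int]]],
    floor: int = MIN_VALID_PWM,
) -> List[Tuple[float, float]]:
    """Return (first_bad_time, last_bad_time) spans where any output is below floor."""
    spans: List[Tuple[float, float]] = []
    start = None   # time of the first bad sample in the current span
    last_t = None  # time of the previous sample
    for t, channels in samples:
        bad = any(pwm < floor for pwm in channels)
        if bad and start is None:
            start = t
        elif not bad and start is not None:
            spans.append((start, last_t))
            start = None
        last_t = t
    if start is not None:
        # The log ended while the outputs were still low.
        spans.append((start, last_t))
    return spans


# Hypothetical samples: outputs drop to 0 for two samples, then recover unevenly.
example = [
    (10.0, [1500, 1500, 1500, 1500]),
    (10.1, [0, 0, 0, 0]),
    (10.2, [0, 0, 0, 0]),
    (10.3, [1900, 1100, 1900, 1100]),
]
print(find_output_dropouts(example))  # [(10.1, 10.2)]
```

Once a span like this is found, the timestamps can be lined up against other logged events to look for something that changed at the same moment.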

Looks like your safety switch engaged mid-air.

However, this is rather troubling as I thought we’d fixed the safety switch issues earlier in the 3.6 series. It was found that some safety switches were activating in-flight - particularly when water was involved.

Bit 2 in https://ardupilot.org/plane/docs/parameters.html#brd-safetyoption-options-for-safety-button-behavior controls whether the switch can activate in-flight, and the “activewhenarmed” bit is not set in your log, so the switch should not have been able to activate while armed.

As @dkemxr observed, your motor outputs momentarily dropped to zero. That event corresponds to the safety switch engaging:

While the motor outputs did go back up again, mid-air restarts on larger vehicles often don’t work out because of ESC sync issues.

This looks like a software fault. @rmackay9 may be able to say it was fixed between 3.6.10 and 3.6.12 - I can’t ATM, sorry.

Please note that you are running 3.6.10. There are critical bugs fixed in 3.6.11 and 3.6.12.

What is the default value of the “safety button” parameter when using the latest firmware? And of “activewhenarmed”? Thanks.

Besides the basic pre-arm checks and other essential settings, is there a cheat sheet of critical parameter settings that one can check against, beyond whatever know-how one already has?

@Karoly_Ludvigh do you have a tlog available for this flight?

There are a few explanations I can think of:

  • if the IOMCU reset then it would have come up with the safety enabled. We don’t have IOMCU logging in this old version of the code, so we can’t prove if that happened
  • if the GCS sent a MAVLink message to request safety enable. That bypasses the BRD_SAFETY parameters. We need the tlog to see if that happened.
  • if we have (or had in 3.6.10) a bug related to safety button handling

Can you please dig out the tlog from the GCS?

@UAVSkies there are several parameters which affect the safety switch.

The default for the BRD_SAFETYOPTION parameter is to allow the switch to both enable and disable the safety, but NOT when the vehicle is armed.
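
As a rough illustration of how that bitmask can be read, here is a small Python sketch. The bit names and positions below are my reading of the linked parameter page (bit 0 = switch can enable safety, bit 1 = switch can disable safety, bit 2 = switch active when armed) and should be checked against the documentation for your firmware version; the helper names are made up.

```python
from typing import List

# Assumed bit layout for BRD_SAFETYOPTION; verify against the parameter docs
# for the firmware version you are actually running.
SAFETY_OPTION_BITS = {
    0: "ActiveForSafetyEnable",
    1: "ActiveForSafetyDisable",
    2: "ActiveWhenArmed",
}


def decode_safety_option(value: int) -> List[str]:
    """List the names of the bits that are set in a BRD_SAFETYOPTION value."""
    return [name for bit, name in SAFETY_OPTION_BITS.items() if value & (1 << bit)]


def switch_active_when_armed(value: int) -> bool:
    """True if bit 2 is set, i.e. the switch is allowed to act while armed."""
    return bool(value & (1 << 2))


# Under the assumed layout, the default described above (enable and disable
# allowed, but not while armed) would be the value 3.
print(decode_safety_option(3))
print(switch_active_when_armed(3))
```

If `switch_active_when_armed` returns False for the value in a log, the physical switch is not supposed to be able to change the safety state in flight, which is why an in-flight safety engagement points at something other than a button press.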

AFAIK there’s no cheatsheet on what to check. However, one of the first things we do when checking logs is check the ARMING_CHECK bitmask. If it is zero we start to frown.

A suitably motivated person could probably find a way to crash a vehicle with most of our parameters :slight_smile:

Thanks for the answers, here is the link for the tlog. Can you tell me how water affects the safety? I'm asking because there was some water involved, since we were spraying.

I don’t think that is the right tlog; the motors never run in it.

You are right, here is the right tlog.
And also, we are using HereLink.

Thanks, I can’t see anything in there that indicates a MAVLink-commanded safety state change.
I’ve also been working to test if an IOMCU reset can cause an issue with master, and it can. We had code to try to cope with it, but testing it this evening shows it wasn’t reliable.
I’ve opened a PR here to fix it:

Thanks. I will start a new thread, as I don’t want to derail this conversation, and I will invite you. I have a few more questions. Thanks

Hi, thanks for your answer, but I don't really understand the problem. Why would the IOMCU reset? And what issue can it cause with the master (and what does master mean in this case)? Can you also explain the PR you opened? We are quite new to this, and I'm not sure what we should do to avoid this problem next time.

Also @peterbarker, if I understand correctly, your solution is a bit different. Can you tell me anything based on the tlog? Or whether water has anything to do with it?

Thanks again!

I think it has something to do with water. I was looking for correlating anomalies, and it seems that right before the safety switch engaged, the VCC voltage dipped (not by much, but it reached its lowest value of the whole flight). Is it possible that the water/spray got to the flight controller or cabling?

Hi, thanks for your answer, and yes, it is absolutely possible, since we were testing a new spraying system. But it should not be a problem, since we are using a drone made for spraying, and moderate rain is also OK according to the factory guarantee policy. If it was the water, is there any software solution? I'm asking because @tridge and @peterbarker also think it was a software issue, if I understand correctly.

If it is from China, then “rain is ok, based on the factory guarantee policy.” is basically zero guarantee, unless you have conformal coating on key elements and silicone sealing on non-movable parts… I have seen and serviced a couple of spraying drones from China (Joyance, TTA, etc.) and all of them were crap :frowning:

Hi, no, it's not from China; we are flying European-made drones.

The IOMCU would reset either if there was a bug in the IOMCU firmware or if there was a transient power spike which caused it to reset. It is very rare (I think this is the first time an aircraft has been lost where we think it might be an IOMCU reset), so it is hard to really attribute causes. We certainly have fixed bugs in the IOMCU firmware since 3.6.10.
By ‘master’ I mean the latest development branch of ArduPilot, which will become the 4.1.x release in the future. That is called ‘master’ in the source code control system we use.
The PR I opened fixes the FMU firmware to automatically disable the safety under the following conditions:

  • it detects an IOMCU reset (as time went backwards in the status from the IOMCU)
  • it knows that safety was disabled in the last 20Hz update of status from the IOMCU
  • the vehicle is armed

If those conditions are true then it forces the safety off to give you a chance to continue flying.
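
To make the three conditions above concrete, here is a minimal sketch in Python. This is not the code from the PR (the real firmware is not written in Python); the `IOStatus` type, its field names, and the function name are hypothetical and only mirror the conditions as described in this post.

```python
from dataclasses import dataclass


@dataclass
class IOStatus:
    """Hypothetical snapshot of one status update received from the IOMCU."""
    timestamp_ms: int   # IOMCU uptime reported in this update
    safety_off: bool    # True if the safety was disabled (outputs allowed)


def should_force_safety_off(prev: IOStatus, curr: IOStatus, armed: bool) -> bool:
    """Decide whether to force the safety off after a suspected IOMCU reset."""
    # Condition 1: time went backwards in the status, which suggests a reset.
    reset_detected = curr.timestamp_ms < prev.timestamp_ms
    # Condition 2: safety was disabled in the last status update before the reset.
    # Condition 3: the vehicle is armed.
    return reset_detected and prev.safety_off and armed


before = IOStatus(timestamp_ms=600_000, safety_off=True)
after_reset = IOStatus(timestamp_ms=50, safety_off=False)
print(should_force_safety_off(before, after_reset, armed=True))   # True
print(should_force_safety_off(before, after_reset, armed=False))  # False
```

Requiring all three conditions keeps the override narrow: a disarmed vehicle on the ground, or one whose safety was already on before the reset, is left alone.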

Sorry for the confusing terminology. Basically I think you hit a very rare condition. I can’t be sure if the issue is a fault in your hardware or a software bug, but I do know that ArduPilot could have done better in handling the condition, which is why I opened that PR, so if this does happen to someone else in the future it will force the safety off and should keep flying.
Cheers, Tridge
