Barometer failure

Hi, I ran into a very severe barometer failure during my last flight. The firmware is based on Copter-3.5.2, with some changes to attitude control. My board is a Pixhawk2, and the barometer is an MS5611.
The bin log is uploaded here. You can see that EKF failures occurred constantly once the barometer readings became unreasonable: Alt ranged from -13335 to 26534, Temp from -176 to 43, and Press from -8185 to 348874.
Does anyone have an idea what caused this baro failure? What should I change to avoid this kind of bad reading?
00000096_BaroFailure.001.zip (512 KB)
00000096_BaroFailure.002.zip (512 KB)
00000096_BaroFailure.003.zip (512 KB)
00000096_BaroFailure.004.zip (512 KB)
00000096_BaroFailure.005.zip (512 KB)
00000096_BaroFailure.006.zip (512 KB)
00000096_BaroFailure.007.zip (432.2 KB)
Sorry that the attachment is split into parts of a single archive. After downloading, you will need to rename the files, for example “00000096_BaroFailure.001.zip” to “00000096_BaroFailure.zip.001”. Then you can unzip the 7 parts into the original flight bin log file.

By the way, the baro failure occurred at about 20 meters in the air. The EKF kept functioning well and held a good altitude value for 30 seconds. Just before I landed, in fact only 1 meter above the ground, the EKF altitude jumped to extremely large readings, and the later consequences followed. Fortunately, the barometer recovered from the bad readings and I finally got the vehicle landed in one piece.
I have also uploaded a figure here. It was plotted with https://github.com/AnakinQin/intelligent-data-analysis-system in case you want to use it too.

I have also uploaded the file to OneDrive, but I am not sure whether the download works, so I am posting the link here and keeping the earlier upload too.
https://buaaeducn-my.sharepoint.com/:u:/g/personal/zonghang_qin_buaa_edu_cn/EaRUv7x2CtpEhUhkpDGnMh8Bx4VyNTrIXw84KAAVkDYFlQ

I found that our system has two barometers, but it usually uses the first one. Device health is determined only by a communication timeout on the readings. How can I avoid this surprising problem?

According to my analysis, the backend driver does NOT validate the values it reads. It also uses conversions such as "_D1 = ((float)sD1) / d1count;", which convert temp or press from uint32_t to float. If the reading is very large, I believe the converted result can be negative, which is clearly unreasonable. So what should we do now?
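
As a quick sanity check on that conversion, here is a minimal standalone sketch. It is not ArduPilot code: the helper name `average_raw` is hypothetical, and the `uint32_t` sum and `uint8_t` count types are assumptions. It suggests that the cast alone should not produce a negative value, because a float covers magnitudes up to roughly 3.4e38, so even the largest 32-bit sum stays positive and only loses precision above 2^24. If that holds, the negative values in the log probably originate somewhere else, for example in the raw samples themselves or in the later compensation math.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the quoted averaging step:
// an accumulated raw sum divided by the number of samples.
// Types are assumptions: a 32-bit unsigned sum and an 8-bit count.
float average_raw(uint32_t sum, uint8_t count)
{
    // The driver guards the division with "if (count != 0)".
    assert(count != 0);

    // uint32_t -> float is a value conversion: the result is the nearest
    // representable float, never a reinterpretation of the bits, so it
    // cannot become negative.
    return static_cast<float>(sum) / count;
}
```

Calling `average_raw(UINT32_MAX, 1)` gives a large positive float, and a sum of four full-scale 24-bit samples (the MS5611 ADC is 24-bit) averages back to the full-scale value.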

Uh, no reply yet. Is there a problem getting the whole 10.2 MB flight bin log? I don’t know how to use Google's or other global cloud storage services because of some awkward network restrictions. Please kindly help me with this failure. I also believe this is an issue ArduPilot may want to look into. If my analysis is incorrect, please let me know.

You are not likely to get a reply by requiring someone to assemble a zip file for the log. Post it on a cloud service like everyone else does.

Well, I did download the log, and yes, that definitely looks like the baro is unhappy.

However, it’s obviously a modified firmware and without access to any source code all bets are off - for all I know the Baro driver has been heavily modified causing these issues.

It also doesn’t show any of the signatures I look for in baro failures.

Peter

I will try to find a cloud drive to upload my log. Thanks a lot.

The reason I modified ArduPilot is to do my work under my advisor’s guidance… I am interested in applying modern control theory, so I ONLY changed the attitude lib. I would attribute this failure to the baro sensor itself, and in the source we have no check on the readings, not even a range check. Am I right?

We can have no idea whether you are right, as we can’t see your changes. It is easy to introduce bugs.
Please push a branch of the exact code flown for this log to a repo we can get to.
Also, what do you mean by ‘attitude lib’?

Thank you for your reply. I don’t know whether I can make my code public, because the paper it is based on has not been published yet.
I will ask my advisor about this request. By “attitude lib” I mean “libraries/AC_AttitudeControl/”. I don’t use the PID controller.
By the way, the barometer conversion I pointed out is here, for example in the master branch (the baro lib has in fact not changed since): /libraries/AP_Baro/AP_Baro_MS5611.cpp, lines 323 to 328:
    if (d1count != 0) {
        _D1 = ((float)sD1) / d1count;
    }
    if (d2count != 0) {
        _D2 = ((float)sD2) / d2count;
    }
What if the baro u32 reading is large enough to exceed the positive range of the float format? I believe the calculated press and temp values would then come out negative.
Besides, at line 244, the _timer function shows that we trust any non-zero reading, which can indeed lead to this large-number conversion problem. I suggest adding a value range limit and a limit on the rate of change per timer tick, based on what is physically realistic. Have I made myself clear?
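
To make the suggestion concrete, here is a minimal sketch of a range check combined with a rate-of-change check. Nothing in it is taken from ArduPilot: `BaroLimits`, `BaroSanityFilter`, and every threshold are hypothetical names and illustrative values, and a real driver would have to pick limits from the sensor datasheet and the vehicle's achievable climb rate.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical plausibility limits (illustrative, not ArduPilot parameters).
struct BaroLimits {
    float min_pressure_pa;    // lowest pressure considered plausible
    float max_pressure_pa;    // highest pressure considered plausible
    float min_temperature_c;  // lowest temperature considered plausible
    float max_temperature_c;  // highest temperature considered plausible
    float max_rate_pa_per_s;  // largest plausible pressure change per second
};

// Hypothetical filter: accepts a sample only if it is finite, inside the
// configured range, and not too far from the last accepted sample.
class BaroSanityFilter {
public:
    explicit BaroSanityFilter(const BaroLimits &limits) : _limits(limits)
    {
        assert(limits.min_pressure_pa < limits.max_pressure_pa);
        assert(limits.min_temperature_c < limits.max_temperature_c);
    }

    // Returns true if the sample was accepted. dt_s is the time since the
    // previous call; the rate check is skipped when dt_s is not positive.
    bool accept(float pressure_pa, float temperature_c, float dt_s)
    {
        if (!plausible(pressure_pa, temperature_c, dt_s)) {
            _consecutive_rejects++;
            return false;
        }
        _last_pressure_pa = pressure_pa;
        _have_last = true;
        _consecutive_rejects = 0;
        return true;
    }

    // A caller could mark the sensor unhealthy once this grows too large.
    uint32_t consecutive_rejects() const { return _consecutive_rejects; }

private:
    bool plausible(float pressure_pa, float temperature_c, float dt_s) const
    {
        if (!std::isfinite(pressure_pa) || !std::isfinite(temperature_c)) {
            return false;
        }
        if (pressure_pa < _limits.min_pressure_pa ||
            pressure_pa > _limits.max_pressure_pa) {
            return false;
        }
        if (temperature_c < _limits.min_temperature_c ||
            temperature_c > _limits.max_temperature_c) {
            return false;
        }
        if (_have_last && dt_s > 0.0f) {
            const float rate = std::fabs(pressure_pa - _last_pressure_pa) / dt_s;
            if (rate > _limits.max_rate_pa_per_s) {
                return false;
            }
        }
        return true;
    }

    BaroLimits _limits;
    float _last_pressure_pa = 0.0f;
    bool _have_last = false;
    uint32_t _consecutive_rejects = 0;
};
```

With limits such as 1000 to 120000 Pa and -40 to 85 degrees C, the extreme values seen in the log (a pressure of 348874 or a temperature of -176) would be rejected. One design caveat: because the rate check compares against the last accepted sample, a genuine large step would be rejected indefinitely, so a real implementation would need a way to re-seed or to declare the sensor unhealthy after several consecutive rejections.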

Can you check whether this PR helps you:
Can you apply it and check whether your barometer reports PreArm errors?