GSoC 2026 Interest — AI-Assisted Log Diagnosis & Root-Cause Detection

I’m Sathvik, a final-year ECE student with a background in computer vision, machine learning, and embedded systems. I’m interested in the AI-Assisted Log Diagnosis GSoC project.

As preparation, I built a prototype that analyzed a real QuadPlane TRI crash across 5 flight logs using pymavlink and pandas. Through progressive multi-flight analysis, the tool automatically identified a Motor 4 failure as the primary root cause, along with contributing anomalies: battery instability, motor saturation, and a 47° extreme roll event during VTOL transition.
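A minimal sketch of the progressive multi-flight idea, assuming per-flight health metrics have already been extracted from the logs (the metric name and numbers here are hypothetical, not from the actual crash data):

```python
import numpy as np

def degradation_trend(metric_per_flight):
    """Fit a linear trend to a per-flight health metric and flag
    sustained degradation across sequential logs of one airframe."""
    flights = np.arange(len(metric_per_flight))
    slope, _ = np.polyfit(flights, metric_per_flight, 1)
    return slope

# Hypothetical example: peak Motor 4 output (normalized) creeping up
# across 5 flights as the motor compensates for degrading thrust.
motor4_peak = [0.62, 0.66, 0.71, 0.79, 0.97]
slope = degradation_trend(motor4_peak)
print(f"trend: {slope:+.3f} per flight")  # positive slope = worsening saturation
```

A simple slope like this is enough to separate a one-off excursion from a metric that worsens flight over flight.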

I’d love feedback from mentors on the approach before I finalize my proposal. Code is here: GitHub - Sathvik12004/ardupilot-log-diagnosis (an AI-assisted flight log analysis tool for ArduPilot that automatically detects anomalies and generates root-cause diagnosis reports).

Thanks!


Simply iterating and looking for min/max values while applying simple comparison operations is extremely rudimentary and not likely to exceed the most basic of human log review capabilities. I don’t see where this is an AI-assisted approach at all.

Preventing an accident is better and cheaper than finding out why it happened.

Correctly configuring, tuning, and operating the vehicle prevents most mishaps from happening. Why not invest in that instead?


@amilcarlucas, the suggestion for this is in our list of GSoC 2026 topics, but this approach does not seem to be aligned with the intent.

Agreed on the principle of your point… however, in practice systems will occasionally not act nominally (or fail), and given the complexity of our systems it can be daunting to diagnose issues unless the person has a depth of experience. Maybe one day we get to zero-issue operation of ArduPilot, when both config and operation are put on completely automated rails, but for now we still have a huge demand for log reviews and the like when things go wrong, and interpreting those is a barrier for users.

As I see it, the intent of this GSoC project idea is to lower the bar for diagnosis for those who fail to configure correctly, or who encounter something in operation that they can describe but don’t have the necessary experience or background to pinpoint themselves. I still don’t see AI tools right now (as in today) being able to independently diagnose and verify most issues, but what they’re really good at is aiding a user with some technical understanding of the system to dig around, work up different hypotheses of the issue, and then tie those back to the logs.

So, what I picture in my own head is scaffolding for that process. At least something that is better than the common approach I see where folks dump a .bin file with a short note and hope their LLM model of choice can do the thinking for them.

Hi Yuri, point taken.
I’ve replaced the threshold logic with an Isolation Forest model that trains on multivariate time-series feature vectors extracted from normal flight logs. It scores anomalies via path-length deviation across an ensemble of 100 randomized trees over 11 simultaneous features (VibeX/Y/Z, Volt, Curr, Roll, Pitch, C1–C4). The current parser extracts those 11 features from 4 message types; for GSoC, this will be expanded to 60+ features across all diagnostic message types, including EKF variance trends, compass field magnitude, GPS HDOP, and individual ESC telemetry, with graceful handling of firmware version differences and missing message types. Confidence scores are derived from per-feature z-score deviations against the training distribution, with no human-defined thresholds anywhere in the pipeline.
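A compressed sketch of that scoring step, assuming feature vectors have already been extracted into an array (the synthetic data stands in for real per-timestep log features):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Stand-in for per-timestep feature vectors from normal flight logs
# (11 columns in the real tool: VibeX/Y/Z, Volt, Curr, Roll, Pitch, C1-C4).
normal = rng.normal(0.0, 1.0, size=(500, 11))

# 100 randomized trees; the anomaly score comes from mean isolation path
# length, so no per-feature threshold is hand-tuned anywhere.
forest = IsolationForest(n_estimators=100, random_state=0).fit(normal)

anomaly = np.full((1, 11), 8.0)  # e.g. a motor-failure-like excursion
score_normal = forest.decision_function(normal).mean()
score_anom = forest.decision_function(anomaly)[0]
print(score_normal, score_anom)  # the anomalous sample scores lower
```

Points that are easy to isolate (short average path to a leaf) receive lower `decision_function` scores, which is what lets the ensemble flag excursions without any human-defined limits.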
Isolation Forest was chosen for the immediate prototype because it is unsupervised, requires no labelled crash data, trains on the five logs currently available, and produces interpretable anomaly scores in seconds on CPU, making it the most practical choice for a small dataset while the larger training corpus is being assembled.
For GSoC, I’m proposing to replace the Isolation Forest with an LSTM autoencoder trained on a corpus of normal ArduPilot community logs, using per-channel reconstruction error as the anomaly signal and a causal temporal correlator to determine which subsystem failure preceded the others, which is critical for distinguishing whether EKF divergence caused the crash or was a symptom of it. On top of that, a multi-flight progressive analysis layer tracks metric degradation across sequential logs from the same airframe, catching failures like ESC wear before they become fatal, something single-log post-flight analysis misses entirely.
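One way the causal-ordering piece could work, sketched with synthetic per-subsystem anomaly-score series (the subsystem names, threshold, and score shapes are assumptions for illustration, not the proposal's final design):

```python
import numpy as np

def onset_order(scores, threshold=3.0):
    """Rank subsystems by the first timestep their anomaly score crosses a
    threshold; the earliest onset is the candidate root cause, and later
    crossings are treated as downstream symptoms."""
    onsets = {}
    for name, series in scores.items():
        idx = int(np.argmax(series > threshold))
        if series[idx] > threshold:  # argmax returns 0 even with no crossing
            onsets[name] = idx
    return sorted(onsets, key=onsets.get)

t = np.arange(100)
scores = {
    "motor4":  np.where(t >= 40, 6.0, 0.5),  # anomalous first
    "ekf":     np.where(t >= 55, 5.0, 0.5),  # diverges afterwards
    "battery": np.full(100, 0.5),            # never anomalous
}
print(onset_order(scores))  # ['motor4', 'ekf']
```

In the full design the input series would be per-channel reconstruction errors from the autoencoder rather than step functions, but the ordering logic is the same.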
The final interface wraps all of this in an LLM diagnostic layer that converts raw anomaly signals into pilot-actionable language without exposing technical internals. Code is live and updated at GitHub - Sathvik12004/ardupilot-log-diagnosis.