In my opinion the main issue here is that it is not clear whether the current execution behavior is intentional. This makes mission execution harder to predict, since in certain cases the resulting behavior may be the combined effect of multiple commands.
This makes it hard for 3rd party mission planning software to build abstractions, because small details matter so much for certain combinations of DO and NAV commands and the order they appear in. I’m hoping that at some point the developers could define a “standard” or a guide for how missions must be executed by the autopilot. For example, some kind of simple state machine that would make it easy to replicate and check generated missions in 3rd party planning software.
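To illustrate what I mean by a simple state machine, here is a minimal sketch in Python. The rule it encodes is only an assumption I made up for the example (a NAV item starts, every DO item listed directly after it runs while it is active, and the NAV item completes when the next NAV item is reached). It is not the documented behavior of any autopilot, and `MissionItem` and `execution_trace` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MissionItem:
    kind: str  # "NAV" or "DO" (hypothetical two-way classification)
    name: str  # free-form label, e.g. "WAYPOINT" or "SET_SERVO"


def execution_trace(items: List[MissionItem]) -> List[Tuple[str, str]]:
    """Return the ordered events a mission would produce under the ASSUMED rule.

    Assumed rule (illustration only): DO items run while the preceding NAV
    item is active; a NAV item completes when the next NAV item begins or
    the mission ends.
    """
    trace: List[Tuple[str, str]] = []
    active_nav = None
    for item in items:
        if item.kind == "NAV":
            # Reaching a new NAV item closes the previous one.
            if active_nav is not None:
                trace.append(("complete", active_nav.name))
            active_nav = item
            trace.append(("start", item.name))
        elif item.kind == "DO":
            # DO items are treated as instantaneous in this sketch.
            trace.append(("run", item.name))
        else:
            raise ValueError(f"unknown item kind: {item.kind!r}")
    if active_nav is not None:
        trace.append(("complete", active_nav.name))
    return trace


mission = [
    MissionItem("NAV", "WAYPOINT_1"),
    MissionItem("DO", "SET_SERVO"),
    MissionItem("NAV", "WAYPOINT_2"),
]
for event in execution_trace(mission):
    print(event)
```

If the real rules were published in a form like this, a planner could compute the expected event order for any generated mission and compare it against what it intended.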
Simulation is one way of doing the validation, though it is a bit on the heavy side. But how could one validate a mission without watching the simulation? Including SITL in a 3rd party “mission planner” for validation would only work if dynamically generated tests were easy to write.
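As a lighter alternative to simulation, a planner could run static checks on the generated item list. The sketch below assumes one made-up rule (a DO command must not appear before the first NAV command) purely to show the shape of such a check. The rule and the `lint_mission` function are hypothetical, not something any autopilot specifies.

```python
from typing import List, Tuple


def lint_mission(items: List[Tuple[str, str]]) -> List[str]:
    """Check a list of (kind, name) items against an ASSUMED ordering rule.

    Returns a list of human-readable problems; an empty list means the
    mission passed this (illustrative) check.
    """
    problems: List[str] = []
    seen_nav = False
    for index, (kind, name) in enumerate(items):
        if kind == "NAV":
            seen_nav = True
        elif kind == "DO":
            # Hypothetical rule: a DO command needs a preceding NAV command.
            if not seen_nav:
                problems.append(
                    f"item {index} ({name}): DO command before any NAV command"
                )
        else:
            problems.append(f"item {index} ({name}): unknown kind {kind!r}")
    return problems


print(lint_mission([("DO", "SET_SERVO"), ("NAV", "WAYPOINT")]))
print(lint_mission([("NAV", "WAYPOINT"), ("DO", "SET_SERVO")]))
```

A check like this is only as good as the rules behind it, which is why the rules would have to come from the developers.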
However, I would still find it appropriate to define how mission items need to be executed.