I’ve been using Lua scripts for ages without any problems, but lately I’m getting some really weird and seemingly random error messages.
My Lua scripts are in the @ROMFS/scripts/ directory, so I’m building the scripts into the firmware. Everything works fine in SITL, but as soon as I try to upload them to my CubeOrangePlus, I sometimes get random errors. Sometimes they refer to syntax errors on specific lines (like “204: syntax error near ‘local’”, but there is no “local” keyword anywhere near that line). Sometimes they show syntax errors in a couple of different scripts; then after a reboot the error changes but is still weird, because it complains that functions are not closed, or that the script has more than 100 local variables, which is not possible, since there are far fewer variables in that script.
It looks almost as if the scripts get “merged” into each other, or get cut off in the middle (that would explain why some functions are reported as missing the “end” keyword, although it is clearly in the script, and again, it works in SITL).
I’m out of things to debug. What could cause things like these? Could it be file encoding? Line endings? Weird hidden characters? I tried to rule out these, but maybe someone had a similar issue.
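For anyone wanting to rule out the encoding, line-ending and hidden-character theories, here is a minimal host-side sketch in Python. It only inspects the raw bytes of the .lua files before they are built into the firmware; the function names and the default "scripts" directory are made up for illustration and are not part of any ArduPilot tooling. Note that non-ASCII bytes inside Lua strings or comments are legal, so a hit is a hint to look closer, not proof of a problem. Since the same firmware image gives different errors from one reboot to the next, a clean result here would point away from the file contents.

```python
import sys
from pathlib import Path

BOM = b"\xef\xbb\xbf"


def scan_lua_bytes(data: bytes) -> list:
    """Return human-readable findings for the raw bytes of one file."""
    findings = []
    if data.startswith(BOM):
        findings.append("UTF-8 byte order mark at start of file")
        data = data[len(BOM):]  # do not report the BOM bytes again below
    if b"\r" in data:
        findings.append("carriage returns (CRLF or CR line endings)")
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        # Flag anything outside printable ASCII, except tab and CR
        # (CR is already reported once above).
        bad = sorted({b for b in line
                      if b > 0x7E or (b < 0x20 and b not in (0x09, 0x0D))})
        if bad:
            hexes = " ".join(f"0x{b:02x}" for b in bad)
            findings.append(f"line {lineno}: unexpected bytes {hexes}")
    return findings


def scan_dir(directory: Path) -> dict:
    """Scan every *.lua file directly inside `directory` (hypothetical layout)."""
    results = {}
    for path in sorted(directory.glob("*.lua")):
        results[path.name] = scan_lua_bytes(path.read_bytes())
    return results


if __name__ == "__main__":
    # Assumed usage: python check_lua.py path/to/scripts
    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("scripts")
    if target.is_dir():
        for name, findings in scan_dir(target).items():
            status = "; ".join(findings) if findings else "clean"
            print(f"{name}: {status}")
```

If every file comes back clean, encoding and line endings are probably not the cause, and the problem is more likely on the flight controller side.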
I personally suspect power-off during a write to be the source of the issues, especially on lower-end cards, though I have no sources for that. I mostly observed issues with logs, though I recall at least one case of corrupted files in the scripts folder that ended up with non-ASCII characters in the file names.
We have a bunch of CubeOrangePlus boards, and the funny thing is that this error occurs on just a small percentage of them. Even then it seems random: on one Cube it happens every time, on another it runs fine for a dozen reboots and then suddenly appears. Or it works just fine during setup and tuning, then I put the finished drone into storage, and two weeks later it won’t arm because of scripting errors. I can only repeat myself: it is weird.
However, yesterday I revisited the problem, this time changing SCR_THD_PRIORITY to 2, and it seems to have solved the problem.
If I had to guess, some systems try to load the Lua scripts while the storage is not ready to be read, or while it is doing something else, or something interrupts the reading process. But I have no idea, to be honest. I’m planning on inspecting the ArduPilot scheduler; maybe I can find some clues.
After some testing, we think we found some correlation with having TERRAIN_ENABLE = 1. Our theory right now is that we’re running out of memory when it’s pulling terrain data and running our script at the same time. We haven’t had it happen with terrain off, but we’ve only tested a few times so far. You can also try deleting the terrain files from the SD card, rebooting, and watching the errors happen again. Our suspicion is that the terrain logic is overstepping its allocation and we’re running into memory issues.
Interesting find! That being said, this is not the case for me. I tried to set TERRAIN_ENABLE to 0 and also deleted the terrain data from the SD card, and the errors are still there.
Have you tried setting SCR_THD_PRIORITY by chance? That always fixes it for me.
Turns out we still had the issue with terrain off, although it was a lot less frequent, so I still think it’s related somehow to resource management. Turning up the thread priority seems to be a more reliable fix. I don’t know enough about the tradeoffs of increasing that priority, so hopefully it’s not a big deal!