RCInput_RPI.cpp segfault on Ubuntu, mmap call fails

Hello all,

I have an Erle-brain-2 with a Raspberry Pi 3. When I run arducopter on the frambuesa Raspbian image, it works fine, but when I run it on Ubuntu 18 it ends in a segmentation fault. (I prefer Ubuntu 18 for running SLAM packages and for development.)
The segfault happens in RCInput_RPI.cpp, in the Memory_table::Memory_table constructor, right after mmap is called.
I downloaded ardupilot, cross-compiled it, and ran the same binary on both Raspbian and Ubuntu to compare them. (I tried the master branch and also checked out 3.2.)

My findings are below. Does anyone have an idea why the physical memory mapping does not work on Ubuntu, and what I can do about it?

Thank you to all the contributors.


Here is the gdb result when I run it on Ubuntu:

(gdb) run
Starting program: /home/erle/arducopter 
Cannot parse expression `.L1207 4@r4'.
warning: Probes-based dynamic linker interface failed.
Reverting to original interface.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Raspberry Pi 3 Model B Rev 1.2. (intern: 2)
[New Thread 0x76cb0450 (LWP 19469)]
[New Thread 0x769ff450 (LWP 19470)]
[New Thread 0x766ff450 (LWP 19471)]
[New Thread 0x763fe450 (LWP 19472)]

Thread 1 "arducopter" received signal SIGSEGV, Segmentation fault.
memset () at ../sysdeps/arm/memset.S:35
35	../sysdeps/arm/memset.S: No such file or directory.
(gdb) bt
#0  memset () at ../sysdeps/arm/memset.S:35
#1  0x000a8b92 in Linux::Memory_table::Memory_table(unsigned int, int) ()
#2  0x000a9228 in Linux::RCInput_RPI::init() ()
#3  0x000a74b8 in HAL_Linux::run(int, char* const*, AP_HAL::HAL::Callbacks*) const ()
#4  0x00013450 in main ()
(gdb) 

I modified the physical memory mapping code as follows, only adding printf() lines to check the result of the mmap call:

159     // Map physical addresses to virtual memory
160     for (i = 0; i < _page_count; i++) {
161         munmap(_virt_pages[i], PAGE_SIZE);
162         printf("_virt_page[%d] before: %#016llX\n", i, (uint64_t) _virt_pages[i]);
163         _virt_pages[i] = mmap(_virt_pages[i], PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED | MAP_NORESERVE | MAP_LOCKED, fdMem, ((uintptr_t)_phys_pages[i] & (version == 1 ? 0xFFFFFFFF : ~
164         printf("_virt_page[%d] after : %#016llX\n", i, (uint64_t) _virt_pages[i]);
165         printf("ERROR --- errno: %d\n", errno);
166         memset(_virt_pages[i], 0xee, PAGE_SIZE);
167     }   

On the Raspbian image it runs fine: the _virt_pages addresses don’t change and there is no error:

Raspberry Pi 3 Model B Rev 1.2. (intern: 2)
_virt_page[0] before: 0X00000076FEF000
_virt_page[0] after : 0X00000076FEF000
ERROR --- errno: 0
_virt_page[1] before: 0X00000076FEE000
_virt_page[1] after : 0X00000076FEE000
ERROR --- errno: 0
_virt_page[2] before: 0X00000076FED000
_virt_page[2] after : 0X00000076FED000
ERROR --- errno: 0
.
.
.
.
_virt_page[59] before: 0X00000076B33000
_virt_page[59] after : 0X00000076B33000
ERROR --- errno: 0
MS5611 found on bus 0 address 0x00
MPU: temp reset IMU[0] 7347 0

But when I run the same binary on the Ubuntu image, mmap fails (it returns -1, i.e. MAP_FAILED) with errno = 1, which is EPERM on Linux, and the memset that follows segfaults:

Raspberry Pi 3 Model B Rev 1.2. (intern: 2)
_virt_page[0] before: 0X00000076FE7000
_virt_page[0] after : 0XFFFFFFFFFFFFFFFF
ERROR --- errno: 1
Segmentation fault
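
For anyone reproducing this: the printf calls above run before errno is printed, so the reported value could in principle be overwritten, and the memset runs even when mmap has failed. A minimal sketch of a safer check follows. The helper name `checked_mmap` is hypothetical and not part of ArduPilot; it only shows the pattern of comparing against MAP_FAILED and saving errno immediately.

```cpp
#include <sys/mman.h>
#include <sys/types.h>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not ArduPilot code): wraps mmap and reports failures.
// Returns nullptr on failure and stores the saved errno in *err_out.
void *checked_mmap(void *addr, size_t len, int prot, int flags,
                   int fd, off_t offset, int *err_out)
{
    void *p = mmap(addr, len, prot, flags, fd, offset);
    if (p == MAP_FAILED) {
        // Save errno before any other library call can change it.
        const int saved = errno;
        fprintf(stderr, "mmap failed: errno %d (%s)\n", saved, strerror(saved));
        if (err_out) {
            *err_out = saved;
        }
        return nullptr;
    }
    if (err_out) {
        *err_out = 0;
    }
    return p;
}
```

With a check like this, the caller can skip the memset when the mapping failed, so the process stops with a readable error message instead of a SIGSEGV inside memset.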

When I comment out the memset call within the loop (line 163 of the original code), the loop runs to completion, but the program segfaults afterwards (no surprise!):

Raspberry Pi 3 Model B Rev 1.2. (intern: 2)
_virt_page[0] before: 0X00000076FE9000
_virt_page[0] after : 0XFFFFFFFFFFFFFFFF
ERROR --- errno: 1
_virt_page[1] before: 0X00000076FE8000
_virt_page[1] after : 0XFFFFFFFFFFFFFFFF
ERROR --- errno: 1
_virt_page[2] before: 0X00000076FE7000
_virt_page[2] after : 0XFFFFFFFFFFFFFFFF
ERROR --- errno: 1
.
.
.
_virt_page[59] before: 0X00000076B6C000
_virt_page[59] after : 0XFFFFFFFFFFFFFFFF
ERROR --- errno: 1
Interrupted: Segmentation fault
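
To separate the problem from ArduPilot, a small standalone probe can attempt the same kind of mapping on both images. The sketch below assumes that fdMem in the snippet above refers to /dev/mem (the snippet does not show where it is opened), and the function name `probe_devmem` and the offset passed to it are hypothetical. It must be run as root to get past the open call.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical standalone probe (not ArduPilot code).
// Tries to map one page of physical memory through /dev/mem.
// Returns 0 on success, otherwise the errno of the call that failed.
int probe_devmem(off_t phys_offset)
{
    const long page = sysconf(_SC_PAGESIZE);

    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) {
        const int saved = errno;
        fprintf(stderr, "open(/dev/mem): errno %d (%s)\n", saved, strerror(saved));
        return saved;
    }

    void *p = mmap(nullptr, static_cast<size_t>(page), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, phys_offset);
    if (p == MAP_FAILED) {
        const int saved = errno;
        fprintf(stderr, "mmap(/dev/mem, offset 0x%llx): errno %d (%s)\n",
                static_cast<unsigned long long>(phys_offset), saved, strerror(saved));
        close(fd);
        return saved;
    }

    // The page is never read or written; it is only mapped and unmapped.
    munmap(p, static_cast<size_t>(page));
    close(fd);
    return 0;
}
```

If this probe also returns 1 (EPERM) on Ubuntu but 0 on Raspbian for the same offset, the difference is in how the two kernels treat /dev/mem rather than in the ArduPilot code itself.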

Hi @f_gh,
It’s hard to help without having the same hardware to test and debug on.
But if the code runs fine on frambuesa Raspbian and doesn’t run on Ubuntu 18, I think it’s a library mismatch,
because frambuesa Raspbian is based on Debian Jessie and Ubuntu 18 is based on Debian Buster.
You can record a full trace log on both frambuesa Raspbian and Ubuntu 18, compare the logs to find in which library the code fails, and then revert that library to an older version with the apt package manager on Ubuntu.

Thanks for the response.
How do you suggest I do that? And do you mean the problem might be in sys/mman.h?

GDB has native support for instruction tracing, via the record command. Using the command record full will record all changes to the process’s state, and even allow reverse debugging (i.e. backward-stepping and replay).

You can find more information here: https://sourceware.org/gdb/onlinedocs/gdb/Process-Record-and-Replay.html

Thanks for the suggestion. When I use GDB with record, I run into another issue.
Here is the log:

sudo gdb arducopter
[sudo] password for erle: 
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from arducopter...(no debugging symbols found)...done.
(gdb) record full
Process record: the program is not being run.
(gdb) start -A udp:"127.0.0.1:6001" -B /dev/ttyAMA0 -C /dev/ttyUSB0 -l /home/erle/APM/logs -t /home/erle/APM/terrain/
Temporary breakpoint 1 at 0x1344c
Starting program: /home/erle/arducopter -A udp:"127.0.0.1:6001" -B /dev/ttyAMA0 -C /dev/ttyUSB0 -l /home/erle/APM/logs -t /home/erle/APM/terrain/
Cannot parse expression `.L1207 4@r4'.
warning: Probes-based dynamic linker interface failed.
Reverting to original interface.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Raspberry Pi 3 Model B Rev 1.2. (intern: 2)

Temporary breakpoint 1, 0x0001344c in main ()
(gdb) record full
(gdb) c
Continuing.
[New Thread 0x76ca2450 (LWP 11109)]
Process record does not support instruction 0xf890f000 at address 0x76cfb0c0.
Process record: failed to record execution log.
/build/gdb-DVuIO7/gdb-8.1/gdb/record-full.c:1048: internal-error: ptid_t record_full_wait_1(target_ops*, ptid_t, target_waitstatus*, int): Assertion `(options & TARGET_WNOHANG) != 0' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n

This is a problem in GDB itself; I think it’s better to post it on Stack Overflow.