*If we mmap a lot of files, _exit(2) is not instantaneous but takes a few hundre...

rui314 · on Feb 23, 2021

It is safe because the child process calls munmap before telling its parent process to exit. munmap is guaranteed to act as a commit operation. Alternatively, you can call msync (https://man7.org/linux/man-pages/man2/msync.2.html) if you want to keep it mmapped.
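
For readers following along, here is a minimal sketch of the msync alternative. The helper name `write_via_mmap` and its error handling are illustrative assumptions, not code from the linker. It fills a `MAP_SHARED` mapping, calls `msync` with `MS_SYNC` so the write-back is requested before returning, and only then unmaps.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes to `path` through a shared mapping.
 * Returns 0 on success, -1 on failure. */
int write_via_mmap(const char *path, const void *data, size_t len) {
    assert(len > 0); /* mmap rejects zero-length mappings */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;

    /* The file must be at least as large as the region we touch. */
    if (ftruncate(fd, (off_t)len) != 0) { close(fd); return -1; }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) return -1;

    memcpy(p, data, len);

    /* Explicit commit point; the mapping could be kept after this call. */
    int rc = msync(p, len, MS_SYNC);
    if (munmap(p, len) != 0) rc = -1;
    return rc;
}
```

A caller that wants to keep the mapping alive would simply skip the `munmap` and rely on the `msync` call.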

amluto · on Feb 23, 2021

Linux gives much stronger guarantees than POSIX here. I wonder if you save measurable time by skipping munmap.

ncmncm · on Feb 24, 2021

munmap is often a remarkably slow operation in a multi-threaded process because of TLB shootdowns: on each munmap, all the other threads get paused and their cached page mappings (TLB entries) get trashed.

It is usually much better to have multiple regular processes, instead of threads, that only share chosen mappings, if you want to use munmap. Or, you can terminate and join all your threads before you start munmapping.

rui314 · on Feb 23, 2021

Is that documented?

amluto · on Feb 23, 2021

I would have said yes, but I can’t find it. That being said, Linux has a “unified page cache”, and MAP_SHARED is coherent with read(2) and write(2), at least on any local filesystem (not sure about FUSE) and when direct IO is not involved.

Still, I could easily believe that largeish pwrite(2) calls would be about as fast as mmap, since mmap needs to play with page tables, and page faults on x86 are expensive. MAP_POPULATE would also be worth trying if you’re not already using it.
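
To make the two options concrete, here is a rough sketch, assuming Linux and glibc. The helper names `write_with_pwrite` and `map_populated` are made up for illustration. The first writes a buffer at a given offset with `pwrite(2)` and retries on short writes; the second maps the file with `MAP_POPULATE` so the page tables are prefaulted at `mmap` time instead of on first touch.

```c
#define _GNU_SOURCE /* MAP_POPULATE is Linux-specific */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes of `buf` at file offset `off`.
 * Returns 0 on success, -1 on failure. */
int write_with_pwrite(const char *path, const void *buf, size_t len, off_t off) {
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0) return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n < 0 && errno == EINTR) continue;  /* interrupted: retry */
        if (n <= 0) { close(fd); return -1; }   /* real error or no progress */
        p += n;
        len -= (size_t)n;
        off += n;
    }
    return close(fd);
}

/* Hypothetical helper: map the first `len` bytes of `path` read/write and
 * ask the kernel to prefault the pages. Returns NULL on failure. */
void *map_populated(const char *path, size_t len) {
    assert(len > 0);
    int fd = open(path, O_RDWR);
    if (fd < 0) return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_POPULATE, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}
```

Which one wins would have to be measured; the sketch only shows the shape of each approach.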

I assume that copy_file_range(2) is out of the question due to relocations.

rui314 · on Feb 24, 2021

I once counted the number of 4 KiB blocks that have at least one relocation, using Chrome as a sample. It turned out that almost all 4 KiB blocks have at least one relocation. They mutate everywhere.
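
One way such a count could be done is sketched below. This is not the code that was actually used, and the function name `count_touched_blocks` is made up. It assumes the relocation offsets are already sorted in ascending order, and it attributes each relocation only to the block containing its first byte, ignoring relocations that straddle a block boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u

/* Hypothetical helper: count distinct 4 KiB blocks that contain at least one
 * relocation. `offsets` must be sorted in ascending order (assumption). */
size_t count_touched_blocks(const uint64_t *offsets, size_t n) {
    size_t count = 0;
    uint64_t prev_block = 0;

    for (size_t i = 0; i < n; i++) {
        assert(i == 0 || offsets[i] >= offsets[i - 1]); /* sorted input */
        uint64_t block = offsets[i] / BLOCK_SIZE;
        if (i == 0 || block != prev_block) {
            count++;            /* first relocation seen in this block */
            prev_block = block;
        }
    }
    return count;
}
```

Dividing the result by the total number of blocks in the output file gives the fraction of blocks that cannot be copied verbatim.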