I read through the entire article. Very fascinating. I admit that I don't understand more than about 5% of what is being explained here, but it is a VERY in-depth look at how things are mapped out. I wasn't aware that operating systems were so wasteful in the area of redundant resource management. Mad props on this article.
There's no wasteful redundant resource management. It might look like that, but for every data structure in the kernel there's a use-case that absolutely needs the corresponding lookup/access to be fast. A bunch of examples from memory management:
* You need your pagetables for the CPU to find your memory.
* You need your vma tree in each struct mm to quickly go from a virtual address to the backing store when handling missing pagetable entries. People routinely allocate insane amounts of virtual memory and use just a few bytes, so prefilling all the pagetables is a no-go.
* You need a struct page for every physical page to link it to its backing store (file or anonymous memory) and to keep track of its state (e.g. locked in place due to I/O, or dirty and in need of eventual writeback to the filesystem).
* You need the radix pagecache tree to quickly go from a file offset to its cached page (read/write and page fault handling). You also need to walk that tree in order to write back dirty pages efficiently (i.e. avoid tons of random seeks).
* All the connections are used, too. E.g. when shrinking the size of a file, all the vmas that map it are looked up via the file's address_space so those mappings can be shot down (and SIGBUS delivered in case userspace accesses these now-invalid virtual addresses).
* Anonymous memory (i.e. memory that is not a mapping of a file with its pages in the file page cache) also needs to be tracked in a similar way. It's slightly different because anonymous memory has a few corner-cases of its own with Fancy Behaviour (tm), distinct from, say, the shoot-down-all-mappings-on-truncate dance necessary for file-backed memory.
And often-used data structures are insanely optimized: fields that are not used at the same time overlap (struct page is the most obvious example). And because pointers are always aligned to at least 4 bytes, the low 2 bits of each pointer (3 on 64-bit, where alignment is 8 bytes) are free and are used to store some information or the type of the object pointed to (e.g. the node color of the red-black tree implementation is stored like this). Furthermore, structures are embedded into each other as often as possible to avoid the cache-miss when chasing a pointer.
Because it's all extremely optimized, it's easy to assume all memory access is O(1). It's not, and every time someone claims their hash-table has O(1) access times my brain immediately goes "BS! ... not on any real hardware and not on any real kernel." Of course, you need pretty big hash-tables to notice the effect ...