Nice. But the beauty of GP's approach is that you don't need to issue another open syscall on argv[0] in order to access (read or mmap) your code; instead, you just piggyback on what the kernel already did for your entry point anyway.
Except for the beauty of the approach, I'm not sure what the practical advantages are. Are there cases where a process wouldn't have permission to access its own executable, or where the argv[0] value is unreliable? Can you exec on a file descriptor of a deleted file?
EDIT: or would /proc/self/exe always point to something the process could open?
I believe the process always holds the binary open as a "txt" file descriptor (check lsof), so opening `/proc/self/exe` always works.
You can even execve a "memfd", an in-memory "file" which is not in the filesystem (distinct from a ramdisk file, which is a file sitting on an in-memory filesystem). /proc/self/exe still works even in that case, even when the original memfd is closed.
Note that argv[0] can never be relied on 100%. The vast majority of programs will set it correctly (especially since many programs will malfunction if provided bogus argv[0]), but a caller has full control over argv[0] and can set it to anything, including a NULL pointer (by simply passing an empty argv array).
> I believe the process always holds the binary open as a "txt" file descriptor
To be clear, the kernel has an association between the process and the "txt" file (because it is mmapped in), but this is not an application file descriptor (like 0, 1, 2, ...). If an application wants to read from it, and the data isn't already mapped by a PT_LOAD segment, it needs to open() a real file descriptor.
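A minimal sketch of that explicit open(), assuming Linux with /proc mounted. The function name is hypothetical, and since this is Python, "its own executable" means the interpreter binary rather than a script.

```python
def read_own_executable_header(length=4):
    """Open the running executable via /proc/self/exe and read its first bytes."""
    # This is an ordinary open(): it yields a new, real file descriptor,
    # separate from the kernel's internal reference to the mapped binary.
    with open("/proc/self/exe", "rb") as exe:
        return exe.read(length)


if __name__ == "__main__":
    print(read_own_executable_header())
```

On a typical Linux install the four bytes are the ELF magic number.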
Absolutely. /proc is certainly not mounted automatically after Linux boots, and depending on the system's configuration it might never get mounted at all. It could even be mounted at some other path.
One of my long term goals with the programming language I posted is to boot Linux directly into the interpreter and bring up the entire system from inside it. Not only will /proc not be mounted, my program's gonna be the one that mounts it. So I decided to avoid using tricks like reading /proc/self/exe.
> Are there cases where a process wouldn't have permissions to access its own executable
Yes. Permissions might have changed after execution has begun. The file might even have been removed. This creates a race condition.
> argv[0] value is unreliable?
It is. The program calling execve has complete control over the arguments and environment of the program being spawned. It could set argv[0] to anything, including the null pointer or the empty string.
Last year I sent a patch to GNU coreutils that would let env set the argv[0] of programs. My purpose was to use env to test this exact edge case.
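The argv[0] point is easy to reproduce without any special tooling. A small sketch, assuming Linux with /proc mounted and a `cat` on PATH; the function name and the fake path are made up. The fake argv[0] deliberately keeps `cat` as its basename, because multi-call binaries such as busybox pick their behaviour from argv[0] and would refuse a completely bogus name.

```python
import shutil
import subprocess


def argv0_seen_by_child(fake_argv0="/nonexistent/dir/cat"):
    """Run cat with a fabricated argv[0] and return the argv[0] it observes."""
    real_cat = shutil.which("cat")
    result = subprocess.run(
        # args[0] becomes the child's argv[0]; `executable` is what really runs.
        [fake_argv0, "/proc/self/cmdline"],
        executable=real_cat,
        capture_output=True,
        check=True,
    )
    # /proc/self/cmdline holds the NUL-separated argument vector of the child.
    return result.stdout.split(b"\0")[0].decode()


if __name__ == "__main__":
    print(argv0_seen_by_child())
```

The child reports a path that does not exist anywhere on disk, so any attempt to open argv[0] from inside it would fail.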
> the symbolic link will contain the string ' (deleted)'
> appended to the original pathname.
It's not 100% clear to me if opening and reading the executable will still succeed in that case. I assume it wouldn't work because the manual says it's just a symbolic link to the executable which will become a dangling link if the file it points to is deleted.
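This can be tested rather than assumed. Here is a rough experiment, assuming Linux with /proc mounted, a `sleep` binary on PATH, and a temp directory that is not mounted noexec; the function name is hypothetical. It copies `sleep`, runs the copy, deletes the file while it is running, then tries both to read the link and to open it. As far as I know the kernel resolves /proc/[pid]/exe directly to the file rather than by following the pathname, so I'd expect the open to succeed, but treat that as something to verify on your own kernel.

```python
import os
import shutil
import subprocess
import tempfile


def inspect_deleted_exe():
    """Delete a running program's file; return (link text, first 4 bytes)."""
    with tempfile.TemporaryDirectory() as tmp:
        copy = os.path.join(tmp, "sleep")
        shutil.copy(shutil.which("sleep"), copy)
        os.chmod(copy, 0o755)
        child = subprocess.Popen([copy, "30"])
        try:
            os.unlink(copy)  # the executable no longer has a name on disk
            exe_path = f"/proc/{child.pid}/exe"
            link = os.readlink(exe_path)
            with open(exe_path, "rb") as exe:
                magic = exe.read(4)
        finally:
            child.kill()
            child.wait()
    return link, magic


if __name__ == "__main__":
    print(inspect_deleted_exe())
```

This only covers the deleted-file case. It says nothing about the other failure modes, such as revoked permissions or /proc not being mounted at all.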
There's more: permissions to read the link can be revoked, the link is invalidated if the main thread ever exits, it has a completely different format in old Linux versions...
The ELF segment approach sidesteps every problem listed in this comment by getting Linux to mmap the data in, just like the program's text and data. The data is ready before the program even starts running.