A prog by any other name (tedunangst.com)
64 points by ingve on April 28, 2016 | 17 comments



Sometimes this is used for other purposes as well. For instance, busybox provides a lot of utilities (cp, init, ping, etc.) that are all links to /bin/busybox, which uses the name it is called under to determine which program to run.
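A rough sketch of that kind of dispatch, in Python rather than busybox's actual C. The applet names and handlers here are made up for illustration; the only real idea is looking up the basename of argv[0]:

```python
import os

# Hypothetical applets standing in for the real utilities (cp, ping, ...).
def applet_echo(args):
    return " ".join(args)

def applet_true(args):
    return ""

APPLETS = {"echo": applet_echo, "true": applet_true}

def dispatch(argv):
    # Choose the applet from the name we were invoked under (argv[0]),
    # ignoring whichever directory the link happens to live in.
    name = os.path.basename(argv[0])
    if name not in APPLETS:
        raise SystemExit(f"{name}: applet not found")
    return APPLETS[name](argv[1:])

print(dispatch(["/bin/echo", "hello", "world"]))
```

The same function serves every name in the table, so adding a utility means adding a link on disk and one entry here.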


The GNU mtools also used this scheme. So the idea goes back pretty far.

(Mtools is a collection of tools to allow Unix systems to manipulate MS-DOS files: read, write, and move around files on an MS-DOS filesystem, typically a floppy disk.)


egrep, fgrep, grep (+ z* variants) share the same binary too.


These are all deprecated by the way. I just alias them now since egrep is muscle memory and hard to break.


We’re calling rm, but we’re calling it cp. Fortunately, cp is not one of the programs that cares what we call it.

Something is contradictory here. Perhaps rm is such a program?


> we're calling it cp

;)


Why is `cp`'s behavior under discussion here? `cp` is not being invoked or executed.


> For example, grep and egrep are two commands that perform very similar functions and are therefore implemented as a single program.

Not everywhere:

    $ ls -li /bin/grep /bin/egrep /bin/fgrep
    37748784 -rwxr-xr-x 1 root root 183696 Jan 18  2014 /bin/egrep
    37748788 -rwxr-xr-x 1 root root 138352 Jan 18  2014 /bin/fgrep
    37748794 -rwxr-xr-x 1 root root 191952 Jan 18  2014 /bin/grep
(That's on Ubuntu 14.04.)
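If you'd rather not eyeball inode numbers, here is a small sketch of the same check. It builds throwaway files in a temp directory (the names just mimic grep/egrep/fgrep) and compares them with `os.path.samefile`, which matches on device and inode:

```python
import os
import tempfile

def same_binary(a, b):
    # True when both paths resolve to the same file (same device and inode).
    return os.path.samefile(a, b)

with tempfile.TemporaryDirectory() as d:
    grep = os.path.join(d, "grep")
    egrep = os.path.join(d, "egrep")  # hard link to grep
    fgrep = os.path.join(d, "fgrep")  # separate file with identical content
    with open(grep, "w") as f:
        f.write("placeholder")
    os.link(grep, egrep)
    with open(fgrep, "w") as f:
        f.write("placeholder")
    linked = same_binary(grep, egrep)
    copied = same_binary(grep, fgrep)

print(linked, copied)
```

Unlike `ls -li`, `samefile` follows symlinks, so it also reports a symlinked egrep as the same file as grep.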


That may just be because they are shell scripts instead of symlinks. On Arch Linux:

    $ ls -li /usr/bin/{grep,egrep,fgrep}
    1728040 -rwxr-xr-x 1 root root     28 Apr 23 12:12 /usr/bin/egrep*
    1728041 -rwxr-xr-x 1 root root     28 Apr 23 12:12 /usr/bin/fgrep*
    1728042 -rwxr-xr-x 1 root root 159024 Apr 23 12:12 /usr/bin/grep*

    $ cat /usr/bin/egrep
    #!/bin/sh
    exec grep -E "$@"

    $ cat /usr/bin/fgrep
    #!/bin/sh
    exec grep -F "$@"
The general point of them being implemented by the same executable still stands.


No, on my system they're three distinct executables; check the sizes shown in my previous comment.

Ubuntu 14.04 uses GNU grep 2.16. When I install it from source (grep-2.16.tar.xz), I get the three distinct executables. When I install the latest version (2.25) from source, egrep and fgrep are shell scripts that invoke "grep". (That could be a problem if you happen to have another "grep" command in an earlier directory in your $PATH. Symlinks would avoid that problem.)
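To illustrate the $PATH caveat, here is a sketch that sets up two temp directories, each holding a dummy executable named "grep", and asks `shutil.which` which one a bare `grep` would resolve to. The directories and dummy scripts are invented for the demo; the lookup order is the point:

```python
import os
import shutil
import tempfile

def make_fake_grep(directory):
    # Create a do-nothing executable named "grep" in the given directory.
    path = os.path.join(directory, "grep")
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, 0o755)
    return path

with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    local_grep = make_fake_grep(first)    # e.g. something in ~/bin
    system_grep = make_fake_grep(second)  # e.g. the distro's grep
    search_path = os.pathsep.join([first, second])
    # A wrapper doing `exec grep -E "$@"` searches $PATH the same way.
    resolved = shutil.which("grep", path=search_path)

print(resolved == local_grep)
```

A wrapper script installed next to the second grep would still end up running the first one, whereas a symlink to the neighbouring binary would not depend on $PATH at all.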

Here's the Changelog entry for the change to shell scripts:

    2014-03-23  Paul Eggert  <eggert@cs.ucla.edu>

        egrep, fgrep: go back to shell scripts
        Although egrep's and fgrep's switch from shell scripts to
        executables may have made sense in 2005, it complicated
        maintenance and recently has caused subtle performance bugs.
        Go back to the old way of doing things, as it's simpler and more
        easily separated from the mainstream implementation.  This should
        be good enough nowadays, as POSIX has withdrawn egrep/fgrep and
        portable applications should be using -E/-F anyway.


Just as another data point, on my system (fully up-to-date Arch) it's half and half:

  $ file -i /usr/bin/*grep|sort -k2,2
  /usr/bin/grep:     application/x-executable; charset=binary
  /usr/bin/igrep:    application/x-executable; charset=binary
  /usr/bin/pgrep:    application/x-executable; charset=binary
  /usr/bin/msggrep:  application/x-executable; charset=binary
  /usr/bin/deepgrep: application/x-executable; charset=binary
  /usr/bin/pcregrep: application/x-executable; charset=binary
  /usr/bin/lzgrep:   inode/symlink; charset=binary
  /usr/bin/lzegrep:  inode/symlink; charset=binary
  /usr/bin/lzfgrep:  inode/symlink; charset=binary
  /usr/bin/xzegrep:  inode/symlink; charset=binary
  /usr/bin/xzfgrep:  inode/symlink; charset=binary
  /usr/bin/egrep:    text/x-shellscript; charset=us-ascii
  /usr/bin/fgrep:    text/x-shellscript; charset=us-ascii
  /usr/bin/zgrep:    text/x-shellscript; charset=us-ascii
  /usr/bin/bzgrep:   text/x-shellscript; charset=us-ascii
  /usr/bin/xzgrep:   text/x-shellscript; charset=us-ascii
  /usr/bin/zegrep:   text/x-shellscript; charset=us-ascii
  /usr/bin/zfgrep:   text/x-shellscript; charset=us-ascii
  /usr/bin/zipgrep:  text/x-shellscript; charset=us-ascii


He's an OpenBSD developer and he's posting about OpenBSD, hence the talk of how things got into older BSD systems and made their way to OpenBSD.


It's a stretch to call it "related", but this reminded me of another story where file names matter in a surprising way - "The $5000 Compression Challenge": https://news.ycombinator.com/item?id=9163782


> There is also a setter function, setprogname, which is not to be confused with the slightly different setproctitle. There’s no getter for proctitle, however, unless you count ps.

How's this possible? `ps` has to work somehow.


setproctitle() is only for the current process. It is a system-specific interface implemented in libc (in OpenBSD/FreeBSD a sysctl() to set the memory address where the string is stored).

ps(1) examines other processes and again uses a system-specific interface to the kernel to retrieve all information about other processes (using kvm(3) on OpenBSD/FreeBSD).


I see. So you could use the same kvm API to get all the proctitles (assuming you're on a BSD), and then filter them yourself to find the one for the current process, right?


Classically, tools like ps worked by opening /dev/kmem, reading the symbol list from /vmunix or wherever, and chasing data structures in kernel memory a bit like a debugger might.

These days there is usually a getter for proctitle in the form of the /proc filesystem, sysctls, etc.
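As a sketch of the /proc route, assuming a Linux-style /proc where /proc/<pid>/cmdline holds the argument vector as NUL-separated bytes (other systems expose this differently, or not at all, in which case the function returns None):

```python
def read_cmdline(pid="self"):
    # Assumes the Linux layout: /proc/<pid>/cmdline is a NUL-separated
    # argument vector. Returns None if the file cannot be read.
    path = f"/proc/{pid}/cmdline"
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError:
        return None
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]

print(read_cmdline())
```

Passing another process's pid instead of "self" reads that process's arguments, subject to the usual permission checks.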



